Open mamhoff opened 5 years ago
Here's the output of the script. The task would be to identify whether the `units` column in the respective `data.csv` is unnecessary because the same information could be coded into the `unit` or `per_unit` column of the corresponding `itemdef.csv` file.
Also, a few files have encoding errors.
business/energy/us/subregion/data.csv: ["kg/kWh"]
business/energy/electricity/india/byGrid/data.csv: ["kg/kWh"]
business/energy/electricity/china/byGrid/data.csv: ["kg/kWh"]
business/processes/production/adipicAcid/data.csv: ["kgN2O/kgacid"]
business/processes/production/pulpAndPaper/directEmissions/data.csv: ["kgCO2/kgChemical"]
business/processes/production/ammonia/data.csv: ["none", "%w/w"]
business/processes/production/lime/production/data.csv: ["fraction"]
business/processes/production/lime/carbonate/data.csv: ["fraction"]
business/processes/production/aluminium/soderberg/data.csv: ["kg cyclo/kg Al"]
business/processes/production/aluminium/pfc/slope/data.csv: ["fraction"]
business/processes/production/aluminium/pfc/defaults/data.csv: ["kg/kgAl"]
business/processes/production/aluminium/pfc/overvoltage/data.csv: ["fraction"]
business/processes/production/aluminium/defaults/data.csv: ["kgCO2/kgAl"]
business/processes/production/aluminium/prebake/pitchCooking/data.csv: ["percent"]
business/processes/production/nitricAcid/data.csv: ["kgN20/kgHNO"]
business/processes/production/hcfc22/productionDataOnly/data.csv: ["kgHFC-23/kgHCFC-22"]
business/processes/production/cement/epa/data.csv: ["percent"]
business/processes/production/ironandsteel/ironAndSteel/data.csv: ["kg C/kg material"]
business/processes/production/ironandsteel/coke/data.csv: ["kg C/kg material"]
business/processes/production/ironandsteel/sinter/data.csv: ["kg C/kg material"]
business/buildings/hotel/generic/data.csv: ["kgCO2/night"]
planet/country/uk/average/appliances/data.csv: ["kgCO2/year"]
planet/country/uk/average/travel/data.csv: ["kgCO2/year"]
planet/country/uk/average/home/data.csv: ["kgCO2/year"]
planet/country/uk/aggregate/actonco2/peoplelikeme/appliances/data.csv: ["kgCO2/year"]
planet/country/uk/aggregate/actonco2/peoplelikeme/travel/data.csv: ["kg/year"]
planet/country/uk/aggregate/actonco2/peoplelikeme/home/data.csv: ["kgCO2/year"]
planet/co2Sinks/rainforest/data.csv: ["kg/km^2"]
ERROR in personal/generic/data.csv: invalid byte sequence in UTF-8
transport/taxi/generic/perpassenger/data.csv: ["kgCO2/km.passenger"]
transport/taxi/generic/data.csv: ["kgCO2/km "]
transport/train/generic/data.csv: ["kgCO2/km.passenger"]
transport/minibus/generic/data.csv: ["kgCO2/km"]
transport/plane/generic/data.csv: ["kgCO2/pass.journey ", "kgCO2/pass.km", "N/A"]
transport/plane/generic/freight/defra/data.csv: ["kgCO2e/tkm"]
transport/plane/generic/defra/data.csv: ["kgCO2e/pkm", "kgCO2/pkm", "N/A"]
transport/plane/generic/airports/all/codes/data.csv: ["degrees N and E"]
transport/plane/generic/airports/all/countries/data.csv: ["degrees N and E"]
transport/plane/generic/airports/codes/data.csv: ["degrees N and E"]
transport/plane/generic/airports/countries/data.csv: ["degrees N and E"]
transport/plane/generic/passengerclass/data.csv: ["kgCO2/pass.km"]
transport/other/data.csv: ["kgCO2/km", "kgCO2/launch"]
transport/van/generic/data.csv: ["kgCO2/km "]
transport/car/generic/data.csv: ["kgCO2/km ", "kgCO2/km"]
transport/car/generic/electric/data.csv: ["kWh/km"]
transport/car/bands/ireland/data.csv: ["kgCO2/km "]
transport/ship/generic/data.csv: ["kgCO2/km.passenger"]
transport/ship/generic/freight/data.csv: ["kgCO2/kg.km"]
transport/bus/generic/data.csv: ["kgCO2/km.passenger"]
ERROR in home/water/data.csv: Unquoted fields do not allow \r or \n (line 4).
home/water/defra/data.csv: ["kgCO2/m^3"]
home/water/reductions/data.csv: ["litres/day"]
home/appliances/kitchen/generic/data.csv: ["kWh/year", "kWh/cycle", "N/A"]
home/appliances/energystar/kitchen/refrigerators/data.csv: ["kWh/year"]
home/appliances/energystar/kitchen/freezers/data.csv: ["kWh/year"]
home/appliances/energystar/kitchen/clothesWashers/data.csv: ["kWh/year"]
home/appliances/energystar/kitchen/dishwashers/data.csv: ["kWh/year"]
home/appliances/energystar/office/computers/desktopsAndIntegrated/data.csv: ["kWh/year"]
home/appliances/energystar/office/computers/workstations/data.csv: ["kWh/year"]
home/appliances/energystar/office/computers/notebooksAndTablets/data.csv: ["kWh/year"]
home/appliances/energystar/office/imageEquipment/faxMachines/data.csv: ["kWh/year"]
home/appliances/energystar/office/imageEquipment/printers/data.csv: ["kWh/year"]
home/appliances/energystar/office/imageEquipment/digitalDuplicators/data.csv: ["kWh/year"]
home/appliances/energystar/office/imageEquipment/copiers/data.csv: ["kWh/year"]
home/appliances/energystar/office/imageEquipment/multiFunctionDevices/data.csv: ["kWh/year"]
home/appliances/energystar/entertainment/setTopBoxes/data.csv: ["kWh/year"]
home/appliances/energystar/entertainment/televisionsAndCombinationUnits/data.csv: ["kWh/year"]
home/appliances/cooking/us/data.csv: ["kWh/year"]
home/appliances/cooking/oven/data.csv: ["kWh/year"]
home/appliances/cooking/hob/data.csv: ["kWh/year"]
home/appliances/entertainment/generic/data.csv: ["kWh/year", "N/A"]
home/appliances/televisions/generic/ranges/data.csv: ["kW"]
home/appliances/computers/generic/data.csv: ["kWh/Year", "kWh/year", "N/A"]
home/energy/us/price/data.csv: ["kgCO2/USD"]
home/energy/us/state/data.csv: ["kgCO2/kWh"]
ERROR in home/energy/uk/price/data.csv: invalid byte sequence in UTF-8
home/energy/uk/reductions/data.csv: ["kgCO2", "kgco2"]
home/energy/uk/suppliers/data.csv: ["kgCO2/kWh"]
ERROR in home/energy/electricity/data.csv: invalid byte sequence in UTF-8
home/energy/electricity/realTimeElectricity/fuelEmissionFactors/data.csv: ["kgCO2/kWh"]
home/energy/electricity/realTimeElectricity/data.csv: ["kWh"]
home/energy/insulation/data.csv: ["N/A", "u1"]
home/energy/electricityiso/data.csv: ["kgCO2/kWh"]
home/energy/ireland/suppliers/data.csv: ["kgCO2/kWh"]
home/heating/us/data.csv: ["kWh/year"]
home/heating/uk/renewable/data.csv: ["kWh/year"]
home/heating/uk/floorareas/data.csv: ["metres squared"]
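
Two patterns recur in the output above: labels that differ only by trailing whitespace or case (for example `"kgCO2/km "` vs `"kgCO2/km"`, or `"kWh/Year"` vs `"kWh/year"`), and files that fail to parse as UTF-8. Below is a minimal sketch of how both might be handled before comparing a `units` column against `itemdef.csv`. The helper names `read_as_utf8`, `normalize_unit` and `distinct_units` are hypothetical, and the ISO-8859-1 fallback is an assumption, since the actual encoding of the failing files is not known.

```ruby
require "csv"

# Read a file as UTF-8. If the bytes are not valid UTF-8, reinterpret them
# as ISO-8859-1 and transcode (assumed fallback; the real encoding is unknown).
def read_as_utf8(path)
  text = File.binread(path).force_encoding(Encoding::UTF_8)
  return text if text.valid_encoding?

  text.force_encoding(Encoding::ISO_8859_1).encode(Encoding::UTF_8)
end

# Normalise a unit label so that cosmetic variants compare equal:
# strip surrounding whitespace and ignore case.
def normalize_unit(label)
  label.to_s.strip.downcase
end

# Collect the distinct non-empty values of the `units` column and group the
# raw labels by their normalised form.
def distinct_units(csv_text)
  table = CSV.parse(csv_text, headers: true)
  labels = table.map { |row| row["units"] }.compact.reject { |u| u.strip.empty? }
  labels.uniq.group_by { |u| normalize_unit(u) }
end
```

Calling `distinct_units(read_as_utf8("transport/car/generic/data.csv"))` on a file with the values reported above would place `"kgCO2/km "` and `"kgCO2/km"` under a single key, which makes it easier to see which files really carry more than one unit. The sketch does not address the malformed line reported for `home/water/data.csv`.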
All of our `data.csv` files have a `units` column. However, it's unclear why it is there, as the `itemdef.csv` files also have `unit` and `per_unit` columns, duplicating the information in the `data.csv` file.

In most datasets, the `units` column in the `data.csv` file is completely empty. In others, it duplicates the information in the `itemdef.csv` file.

Here's a Ruby script that deletes the column if it's entirely unnecessary, or otherwise shows what information it contains: