cityenergyproject / dc

Energy benchmarking for the District of Columbia
4 stars 3 forks source link

PID and older data #20

Open emilyhobson opened 8 years ago

emilyhobson commented 8 years ago

Need FK and universal PID from DOEE

emilyhobson commented 8 years ago

Please find updated 2013 and 2014 data attached. https://drive.google.com/drive/u/0/folders/0B5tBIGe_xercT0FfX1dxekxZWjg

I added primary keys, newer entries, and pulled in the lat/long from 2015 where there was a primary key match. However, in many cases, that match did not exist. Because of the way I did the primary keys—PM ID if there, DC Tax ID if not—a building that was out of compliance in 2013/2014 would get a DC… Primary Key, and then in 2015 would get a new PM…. Primary Key, and thus the 2013/2014 data wouldn’t pull its lat/long. So some of the entries in 2013/2014 will need to be geocoded new. Hopefully none of them will need the manual match if more than 100 meters difference thing, but if so, just flag them and we’ll do it like we did before.

I’m working on a Mac at the conference, and can’t figure out how to export pipe-delimitated on a mac, so they are xlsx file

emilyhobson commented 8 years ago

Update: https://drive.google.com/drive/u/0/#my-drive?action=locate&id=0B5tBIGe_xercSi00RUlXMUxBbmJmSnBCTmdpZkNMTm5KbVQw

I realized I didn’t clear out the 0s that are actually nulls from the 2014 data. I have reattached the 2014 data in case you didn’t import it.

Once again, any 0s in any cells are equivalent to nulls, for any field. There is not consistency in when we get a 0 value versus a null value for any of the energy or water metrics reported out of ENERGY STAR Portfolio Manager, so we should treat those as the same and put the 0s as nulls so they are grey on the map.

Also, you asked me to look at outliers. I have done so. Pretty much any EUI over 1000 should be made null in the dataset. I have also fixed this for 2014, again, attached. 2013 is fairly okay on this front. 2015 also has huge outliers. I can resend 2015, but I’m not sure if its worth it since it is already imported.

If you prefer, you can just delete the outliers yourself. Outliers defined as “Weather Normalized Source EUI” > 1000 AND “ENERGY STAR Score” > 5. For those entries, delete (make null) the values in the following columns: ENERGY STAR Score Site EUI (kBtu/ft²) Weather Normalized Site EUI (kBtu/ft²) Source EUI (kBtu/ft²) Weather Normalized Source EUI (kBtu/ft²) Total GHG Emissions (Metric Tons CO2e) Total GHG Emissions Intensity (kgCO2e/ft²) Electricity Use (kWh) Natural Gas Use (therms)