climate-mirror / datasets

For tracking data mirroring progress

EPA Facility Level GHG emissions #28

Open nickrsan opened 7 years ago

nickrsan commented 7 years ago

Name: EPA Facility Level GHG emissions
Organization: EPA
Description URL: https://www.epa.gov/ghgreporting/ghg-reporting-program-data-sets
Download URL:
File Types:
Size:
Status: Done

meyerzinn commented 7 years ago

This dataset is not going to be mirrored soon. Please advise, we need a more permanent mirror.

nickrsan commented 7 years ago

This shows as having at least one mirror that's hosted on private servers, so that's good, but I'd like to get a public URL for it before considering it mirrored. I'll look back through the form data and see if I can find if they submitted a URL. Thanks for flagging this.

meyerzinn commented 7 years ago

This was the one I submitted. I'm unable to keep it up for more than a few weeks. I will focus on moving it to IPFS.

nickrsan commented 7 years ago

OK, good to know - I'm going to remove the One Mirror flag then so it appears as a higher priority. If nobody else steps in, I'll mirror it.

colinbeier commented 7 years ago

I have the full Oracle DB (.dmp) as well as a .csv of summary data. I cannot upload to a mirror yet, but can confirm that multiple complete copies are archived in safe places.

ghost commented 7 years ago

Excellent. Congrats! Those are important data.

mskallisti commented 7 years ago

Pulling.

dcmccabe commented 7 years ago

Hi all, I'm with an environmental NGO. We use this particular dataset all the time and we'd like to help get the mirrored data set up on a publicly available website. If those who have already pulled the data can contact me, we can start that process. (We are working separately to FOIA the information from EPA, but that could take weeks).

I am at Clean Air Task Force. You won't have any trouble finding my direct email address on the CATF website.

Thanks for all your work!

sasignell commented 7 years ago

When we talk about mirroring, are we talking about mirroring the (handy) GUI too? If so, how can we get the front-end code?

meyerzinn commented 7 years ago

I believe the Internet Archive can preserve GUIs. We are focused on the data, though that depends on how hard it is to retrieve the GUI.

dcmccabe commented 7 years ago

My partners have a skeletal plan for building a UI (just duplicating what EPA already built, scraping their html etc. to do so.)

There are a number of ways to access the GHGRP data. We may be more focused on the tools that the site uses for downloading all sorts of specific data, rather than the graphic UIs.

So, if you have the data downloaded (or will soon), drop me a line so I can partner you with the folks that are planning to build and host this.

sasignell commented 7 years ago

Yes we have the Oracle dump I believe.

colinbeier commented 7 years ago

As noted above, I have an archive of the complete GHGRP database direct from the source, current as of 15 Jan 2017, and would be happy to touch base with folks interested in bringing it back online.

Colin Beier SUNY ESF

BethTrask commented 7 years ago

Colin & all: I'm from Environmental Defense Fund. We'd be happy to stand up a publicly accessible site to store these files and make them available for download. As Dave noted earlier today, we also just started talking internally about creating a UI with capabilities similar to EPA's FLIGHT. That will take some time, but we want to start by providing a hosted site to keep these data public. Many thanks to all of you who are saving these vital data!

sasignell commented 7 years ago

I'm happy to help with the database backend and UI... Colin and I worked together on the interactive map page for the New York Climate Change Science Clearinghouse (NYCCSC), which draws from many federal data sets, including GHGRP.

meyerzinn commented 7 years ago

I'm interested in creating an index in a machine-readable format to help your website efforts. I think we can have a mix of machine-readable and human-editable. Tell me what you think:

In this repo, we create a folder for datasets containing one file per dataset, each holding all the info pertaining to that dataset. We also have a folder for mirrors, with one index per mirror. The website could then feed from there, and we could also make a bot to check on mirrors.
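A minimal sketch of what that could look like, assuming each dataset file is JSON. The field names, the `status` values, and the example mirror URLs are hypothetical and not an agreed schema. The probe is passed in as a parameter so the bot logic can be exercised without touching the network.

```python
import json
import urllib.request
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class DatasetRecord:
    """One file per dataset (hypothetical schema, not an agreed format)."""
    name: str
    organization: str
    source_url: str
    status: str = "unmirrored"
    mirrors: List[str] = field(default_factory=list)


def load_record(text: str) -> DatasetRecord:
    """Parse the contents of one dataset file stored as JSON."""
    return DatasetRecord(**json.loads(text))


def http_head_ok(url: str, timeout: float = 10.0) -> bool:
    """Return True if a HEAD request to the URL succeeds."""
    try:
        request = urllib.request.Request(url, method="HEAD")
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return 200 <= response.status < 400
    except (OSError, ValueError):
        # URLError and HTTPError are OSError subclasses; ValueError covers
        # strings that are not URLs at all.
        return False


def check_mirrors(record: DatasetRecord,
                  probe: Callable[[str], bool] = http_head_ok) -> Dict[str, bool]:
    """Map each mirror URL of a dataset to whether the probe reached it."""
    return {url: probe(url) for url in record.mirrors}


if __name__ == "__main__":
    example = json.dumps({
        "name": "EPA Facility Level GHG emissions",
        "organization": "EPA",
        "source_url": "https://www.epa.gov/ghgreporting/ghg-reporting-program-data-sets",
        "status": "one-mirror",                           # hypothetical value
        "mirrors": ["https://mirror.example.org/ghgrp/"]  # hypothetical URL
    })
    print(check_mirrors(load_record(example)))
```

A reachable URL only shows that something answers at that address. Confirming that a mirror is complete would need file lists or checksums stored in the per-mirror index.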

BethTrask commented 7 years ago

Thank you, Steve and Meyer! I would love to have your help. I can start by getting some of the infrastructure lined up with my IT person and then circle back with you to do some brainstorming.

JeremiahCurtis commented 7 years ago

I am assuming that all the subpart-level data have been mirrored in this issue? https://www.epa.gov/enviro/greenhouse-gas-customized-search

colinbeier commented 7 years ago

The complete EPA Clean Air Markets Division (CAMD) GHGRP database (as of 06 Jan 2017), which includes facility-level emissions data, has been archived outside of EPA, but not mirrored yet. Plans are being made to mirror the data if / when it is taken down from EPA’s website. If / when the FLIGHT tool is removed, we hope to recreate it and host it on another (non-fed) site.

FYI - all of the EPA CAMD databases (as of 06 Jan 2017) have similarly been archived, but not yet mirrored. These include LTM, CASTNET, NOx & SOx emissions, emissions trading, GHG, and a few others. Please feel free to email me with any questions.

JeremiahCurtis commented 7 years ago

@colinbeier Thanks for the update. Given that the LTM, CASTNET, and other data have been archived (which I assume means that an offline copy exists), does this mean that the industry and facility-level subpart data have also been mirrored from Envirofacts (https://www.epa.gov/enviro/greenhouse-gas-customized-search), such as the cement production emissions data accessed via https://oaspub.epa.gov/enviro/AD_HOC_TABLE_COLUMN_SELECT_V2.retrieval_list?

I would like to contribute a second mirror for this data, but I don't know how to get the subpart-level data outside Envirofacts (which appears to require a large number of URL queries to get all the GHGRP subpart-level data). Is there a straightforward tool for these data? They don't seem to be on ftp://newftp.epa.gov/ or ftp://ftp.epa.gov/
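
If no bulk download turns up, the large number of URL queries could at least be generated mechanically rather than by hand. A minimal sketch, assuming the data can be requested in row ranges: the URL template, the table name, and the row count below are hypothetical placeholders, not the actual Envirofacts query format.

```python
from typing import Iterator, List, Tuple


def row_windows(total_rows: int, page_size: int) -> Iterator[Tuple[int, int]]:
    """Yield inclusive (first, last) row ranges that together cover total_rows."""
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    for first in range(0, total_rows, page_size):
        yield first, min(first + page_size, total_rows) - 1


def query_urls(template: str, table: str,
               total_rows: int, page_size: int) -> List[str]:
    """Fill a caller-supplied URL template once per row window."""
    return [template.format(table=table, first=first, last=last)
            for first, last in row_windows(total_rows, page_size)]


if __name__ == "__main__":
    # Hypothetical template and table name, for illustration only.
    template = "https://example.org/query/{table}/rows/{first}:{last}/csv"
    for url in query_urls(template, "SUBPART_TABLE", total_rows=25, page_size=10):
        print(url)
```

The resulting list can be handed to any downloader, and keeping it in a file makes it easier for a second mirror to repeat exactly the same pull.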