climate-mirror / datasets

For tracking data mirroring progress

Climate.gov (data catalog) #55

Open nickrsan opened 7 years ago

nickrsan commented 7 years ago

Name: Climate.gov (data catalog)
Organization: NOAA
Description URL: https://www.climate.gov/
Download URL:
File Types: NetCDF, CSV, shp, kml, json
Size:
Status:

elfplease commented 7 years ago

I recursively downloaded the whole climate.gov site to a depth of 8. That gave me 4 GB of images and web pages, but no files with the extensions mentioned. It looks like climate.gov links to other sites that actually host the data, so a more sophisticated crawler would be necessary.
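
One way a "more sophisticated crawler" could start is by classifying every link on a page: direct data files to download, and off-site pages worth crawling next. The sketch below is only an illustration using the Python standard library. The names `LinkCollector`, `classify_links`, and `DATA_EXTENSIONS` are hypothetical, and the mapping of the listed file types to extensions (for example NetCDF to `.nc`) is an assumption. It parses HTML that has already been fetched and does no downloading itself.

```python
import posixpath
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse

# Assumed extensions for the file types listed in this issue.
DATA_EXTENSIONS = {".nc", ".csv", ".shp", ".kml", ".json"}


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def classify_links(page_url, html):
    """Return (data_files, external_pages) found in one page's HTML."""
    collector = LinkCollector()
    collector.feed(html)
    page_host = urlparse(page_url).netloc
    data_files, external_pages = [], []
    for href in collector.hrefs:
        absolute = urljoin(page_url, href)  # resolve relative links
        parsed = urlparse(absolute)
        if parsed.scheme not in ("http", "https"):
            continue  # skip mailto:, javascript:, and similar
        extension = posixpath.splitext(parsed.path)[1].lower()
        if extension in DATA_EXTENSIONS:
            data_files.append(absolute)
        elif parsed.netloc != page_host:
            # Off-site page: a candidate for a further crawl step.
            external_pages.append(absolute)
    return data_files, external_pages
```

A crawler built on this would fetch each entry in `external_pages` to a limited depth and repeat the classification. Data served through forms or APIs rather than plain links would still be missed.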

In any case I will keep my offline copy of climate.gov and can put it online later if necessary.

bkirkbri commented 7 years ago

@elfplease Thank you. This site is sure to be altered, and having a record of it is important.

gabefair commented 7 years ago

Can we mark this ticket as high priority? @bkirkbri

StephWo commented 7 years ago

The site just went down:

Climate.gov is temporarily offline for routine maintenance. We apologize for any inconvenience. We anticipate being back online on by Feb. 7

Either we overloaded the server with too many requests, or we are too late and they are "maintenancing the science out of it".

@elfplease, can you share? Your data is pre-Inauguration, right?

gabefair commented 7 years ago

Can we mark this ticket as high priority? @nickrsan or @mxplusb

elfplease commented 7 years ago

Hi, sorry for the delay. I actually had this up the whole time. The mirror is at http://climate.seriouself.net/www.climate.gov/ and you can download your own copy from http://climate.seriouself.net/www.climate.gov.tar.gz . It was scraped on Jan 18th, pre-inauguration. Unfortunately the site doesn't display very well right now; I can fix the internal links if there's interest. @gabefair
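
For anyone who wants to try fixing the internal links, here is a rough sketch of one possible approach: rewrite `href` and `src` values that point at the original host (or at a root-relative path) into paths relative to the current page, so they resolve under the mirror's `www.climate.gov/` subdirectory. The function names `relativize` and `rewrite_links` are hypothetical, and the regex-based attribute matching is a simplification that assumes quoted attribute values. It is not how the mirror was actually built.

```python
import posixpath
import re
from urllib.parse import urlparse

ORIGINAL_HOST = "www.climate.gov"

# Matches href="..." or src="..." with single or double quotes.
ATTR_PATTERN = re.compile(
    r'(?P<attr>\b(?:href|src))=(?P<quote>["\'])(?P<url>.*?)(?P=quote)',
    re.IGNORECASE,
)


def relativize(url, page_path):
    """Turn a link to the original site into a path relative to page_path."""
    parsed = urlparse(url)
    if parsed.netloc and parsed.netloc != ORIGINAL_HOST:
        return url  # link to another site: leave untouched
    if parsed.scheme and parsed.scheme not in ("http", "https"):
        return url  # mailto:, javascript:, and similar
    if not parsed.netloc and not parsed.path.startswith("/"):
        return url  # already relative, or a bare "#fragment"
    target = parsed.path or "/"
    page_dir = posixpath.dirname(page_path)
    relative = posixpath.relpath(target, start=page_dir)
    if parsed.query:
        relative += "?" + parsed.query
    if parsed.fragment:
        relative += "#" + parsed.fragment
    return relative


def rewrite_links(html, page_path):
    """Rewrite every href/src in html for the page stored at page_path."""
    def replace(match):
        new_url = relativize(match.group("url"), page_path)
        quote = match.group("quote")
        return f'{match.group("attr")}={quote}{new_url}{quote}'

    return ATTR_PATTERN.sub(replace, html)
```

Here `page_path` is the page's path on the original site, for example `/news/features/story.html`. Note that `posixpath.relpath` drops trailing slashes, so directory links may need an explicit `index.html` appended. If the copy was made with wget, re-running it with `--convert-links` may be simpler than post-processing.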