climate-mirror / datasets

For tracking data mirroring progress

The Global Historical Climatology Network (GHCN) #331

Open gabefair opened 7 years ago

gabefair commented 7 years ago

GHCN Superset

The Global Historical Climatology Network (GHCN) is an integrated database of climate summaries from land surface stations across the globe that have been subjected to a common suite of quality assurance reviews. The data are obtained from more than 20 sources. Some data are more than 175 years old while others are less than an hour old. GHCN is the official archived dataset, and it serves as a replacement product for older NCEI-maintained datasets that are designated for daily temporal resolution (i.e., DSI 3200, DSI 3201, DSI 3202, DSI 3205, DSI 3206, DSI 3208, DSI 3210, etc.).

About this dataset: https://www.ncdc.noaa.gov/data-access/land-based-station-data/land-based-datasets/global-historical-climatology-network-ghcn

Index of ftp://ftp.ncdc.noaa.gov/pub/data/ghcn/

Name                              Size            Ticket where addressed
alaska-temperature-anomalies.txt  11 KB
alaska-temperature-means.txt      4 KB
anom/                             2.441 MB
blended/                          4.885 MB
daily/                            5.329 TB        #332
--./daily/all                     26.044787 GB    #333
--./daily/by_year                 14.349867 GB    #334
--./daily/ghcnd_all.tar.gz        29.20285 GB
--./daily/ghcnd_gsn.tar.gz        143.310 MB
--./daily/ghcnd_hcn.tar.gz        284.845 MB
--./daily/grid/ghcnd.grid.tar.gz  771.040 MB
--./daily/grid                    5.374317 GB
--./daily/gsn                     905.501 MB
--./daily/hcn                     2.479658 GB
--./daily/superghcnd              5.276572125 TB  #336
forts/                            30.045 MB
grid_gpcp_1979-2002.dat           14.604 MB
Lawrimore-ISTI-30Nov11.ppt        3.796 MB
snow                              1.485 MB
v1/                               3.544 MB
v2/                               37.226 MB
v3/                               2.914783 GB
v4/                               14.021995 GB

Sizes were computed with the lftp command du -a --summarize -h, run against each entry.
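For anyone who wants to cross-check a mirror against the sizes listed above, here is a minimal Python sketch that turns the human-readable strings into byte counts and sums them. The helper names parse_size and total_bytes are made up for illustration, and the sketch assumes binary multiples (1 KB = 1024 bytes); the thread does not say whether the figures are binary or decimal, so treat the results as approximate.

```python
import re

# Assumption: binary multiples (1 KB = 1024 bytes). The thread does not
# state which convention the listed sizes use.
UNITS = {"B": 1, "KB": 1024, "MB": 1024**2, "GB": 1024**3, "TB": 1024**4}


def parse_size(text):
    """Convert a string such as '2.441 MB' into a whole number of bytes."""
    match = re.fullmatch(
        r"\s*([0-9]*\.?[0-9]+)\s*([KMGT]?B)\s*", text, flags=re.IGNORECASE
    )
    if match is None:
        raise ValueError(f"unrecognised size: {text!r}")
    value, unit = match.groups()
    return round(float(value) * UNITS[unit.upper()])


def total_bytes(sizes):
    """Sum a list of human-readable sizes, in bytes."""
    return sum(parse_size(size) for size in sizes)


if __name__ == "__main__":
    # A few of the smaller top-level entries from the listing.
    small_entries = ["11 KB", "4 KB", "2.441 MB", "4.885 MB", "30.045 MB"]
    print(total_bytes(small_entries), "bytes")
```

Comparing a total like this with the output of a local du on the downloaded copy gives a quick sanity check, but it says nothing about file contents; checksums are still needed for that.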

rustyguts commented 7 years ago

I'm diving into this one. Will update when complete

rustyguts commented 7 years ago

I have one public mirror of this data available. Size: ~5.3 TB

https://cinder7.org/datasets/climate-mirror-datasets/331/

sarunasb commented 7 years ago

Mirror at ftp://geom.dartmouth.edu/climatemirror2/ftp.ncdc.noaa.gov/pub/data/ghcn/

Checksums: https://geom.dartmouth.edu/climatemirror/climatemirror2/ftp.ncdc.noaa.gov/pub/data/ghcn/checksums.txt

CorentinB commented 7 years ago

I have one more public mirror available here (Google Drive Mirror #1): https://archives.corentinb.me/content/climate-mirror-datasets/331.html

gabefair commented 7 years ago

@CorentinB and @RustyGuts I know it's been a while since you downloaded it, and you probably no longer have it on your computer, but we could use a checksum for your data. If a scientist uses your data, they would need to publish the data's checksum so that other scientists can confirm they are using the same data. I recommend the following command: hashdeep -erl > ftp_ncdc_noaa_gov_pub_ghcn_hash-audit.txt
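If hashdeep is not installed, a per-file checksum listing can be approximated with the Python standard library. The sketch below is only an illustration: the function names are invented here, and the output uses the plain "digest  path" layout of sha256sum, not hashdeep's audit format, so it cannot be fed back into hashdeep.

```python
import hashlib
from pathlib import Path


def hash_file(path, algorithm="sha256", chunk_size=1 << 20):
    """Return the hex digest of one file, reading in chunks to bound memory."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root, algorithm="sha256"):
    """Map each file path (relative to root, POSIX style) to its digest."""
    root = Path(root)
    manifest = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            relative = path.relative_to(root).as_posix()
            manifest[relative] = hash_file(path, algorithm)
    return manifest


def write_manifest(manifest, out_path):
    """Write one 'digest  relative/path' line per file."""
    with open(out_path, "w", encoding="utf-8") as out:
        for relative, digest in manifest.items():
            out.write(f"{digest}  {relative}\n")
```

Something like write_manifest(build_manifest("ghcn"), "ghcn-sha256.txt") would then produce a listing for a local copy stored in a directory named ghcn (a placeholder path). Write the listing outside the directory being hashed so a rerun does not pick it up. Relative paths are used so that listings from different mirrors can be compared line by line.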

rustyguts commented 7 years ago

I can use rclone to grab an MD5, but sha256 is not supported by rclone, and the data only exists on Google Drive.

ftp_ncdc_noaa_gov_pub_ghcn_hash-audit.txt
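An MD5 listing is still enough to compare two mirrors file by file. The Python sketch below assumes both listings use the "digest  path" layout of md5sum (which rclone's md5sum command is understood to emit) and that paths are relative to the same root; the function names are illustrative only.

```python
def parse_sum_lines(text):
    """Parse md5sum-style 'digest  path' lines into a {path: digest} dict."""
    entries = {}
    for line in text.splitlines():
        # Split on the first two-space separator; digests contain no spaces.
        digest, separator, path = line.partition("  ")
        if not separator or not path.strip():
            continue  # skip blank or malformed lines
        entries[path.strip()] = digest.strip().lower()
    return entries


def compare_manifests(reference, candidate):
    """Report files that are missing, extra, or differ between two listings."""
    shared = reference.keys() & candidate.keys()
    return {
        "missing": sorted(reference.keys() - candidate.keys()),
        "extra": sorted(candidate.keys() - reference.keys()),
        "mismatched": sorted(p for p in shared if reference[p] != candidate[p]),
    }
```

An empty result in all three lists suggests the two copies match under MD5. That guards against accidental corruption or incomplete transfers, though MD5 is not collision resistant, so a stronger hash is preferable when one is available.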