climate-mirror / datasets

For tracking data mirroring progress

NASA PODAAC NSCat Data #263

Open nickrsan opened 7 years ago

nickrsan commented 7 years ago

ftp://podaac.jpl.nasa.gov/allData/nscat

Suggested in a large email containing many URLs.

resistor commented 7 years ago

Downloading to local storage now.

Yuri-M-Dias commented 7 years ago

Downloading some of it too, though I'm not sure I'll be able to make it available online soon. Do you guys have a better way of downloading everything and keeping the mirror? I'm going with the wget method for now, and it's slow to download all of those .snapshots (at least here). I've also tried mounting the FTP server locally with the curlftpfs utility and just copying everything I can with cp, but I can't be sure the files are complete and consistent that way.

tkyocum commented 7 years ago

This is ~268GB. I have it all archived; will post S3 URL when the tarball is complete.

tkyocum commented 7 years ago

169GB compressed. Archive available here: https://s3-us-west-1.amazonaws.com/podaac-ftp.jpl.nasa.gov/nscat.tar

bkirkbri commented 7 years ago

@resistor @tkyocum Thanks! If you still have the local tree, would you post the output of this command?

find podaac.jpl.nasa.gov/allData/nscat -type f -exec md5sum {} \; | grep -v \\.listing | md5sum

@Yuri-M-Dias If you are going the FTPFS route, you can use rsync instead of cp to be more sure of your copy. It's resumable and when it doesn't do anything, you know you've got the whole tree.
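As a rough sketch of the rsync approach described above (not a command anyone in this thread posted), the two helpers below copy a tree and then ask rsync what it would still transfer. The function names `sync_tree` and `pending_changes` are made up for illustration, and rsync is assumed to be installed.

```shell
#!/usr/bin/env bash

# sync_tree SRC DEST: copy SRC into DEST, preserving times and permissions.
# --partial keeps interrupted files so a rerun does not start from scratch.
sync_tree() {
  rsync -a --partial "$1"/ "$2"/
}

# pending_changes SRC DEST: list what rsync would still transfer.
# Empty output means rsync sees nothing left to copy.
pending_changes() {
  rsync -a --dry-run --itemize-changes "$1"/ "$2"/
}
```

With the FTP server mounted at a hypothetical mount point such as /mnt/podaac, this would be run as `sync_tree /mnt/podaac/allData/nscat ./nscat`, repeated until `pending_changes` prints nothing. By default rsync decides a file is up to date from its size and modification time; adding `--checksum` compares contents instead, at the cost of reading every file again.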

bkirkbri commented 7 years ago

Another private mirror at https://github.com/climate-mirror/datasets/issues/196#issuecomment-275868711

TripleE-0 commented 7 years ago

Here is the result of the MD5sum.

find podaac.jpl.nasa.gov/allData/nscat -type f -exec md5sum {} \; | grep -v \.listing | md5sum
5e5dd6cc73ff11c0a5961f13440c92ed -

find ./podaac-ftp.jpl.nasa.gov/allData/nscat -type f -exec md5sum {} \; | grep -v \.listing | md5sum
faf2cd9edb9f88c6407ce0aad678cf82 -
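The two digests above differ, but that alone does not show the files differ. md5sum prints each file's path next to its hash, so trees rooted at podaac.jpl.nasa.gov and ./podaac-ftp.jpl.nasa.gov feed different text into the final md5sum, and find does not guarantee a listing order either. A possible variant that avoids both effects is sketched below. It assumes GNU find, sort, xargs, and md5sum, the name `tree_digest` is made up for illustration, and it skips only files named exactly .listing, which is narrower than the grep filter used above.

```shell
#!/usr/bin/env bash

# tree_digest DIR: print one digest summarizing every file under DIR.
tree_digest() {
  local dir="$1"
  (
    cd "$dir" || exit 1
    # Relative paths make the result independent of where the mirror lives;
    # sorting makes it independent of the order find walks the tree.
    find . -type f ! -name '.listing' -print0 \
      | LC_ALL=C sort -z \
      | xargs -0 md5sum \
      | md5sum \
      | cut -d' ' -f1
  )
}
```

For example, `tree_digest podaac.jpl.nasa.gov/allData/nscat` and `tree_digest ./podaac-ftp.jpl.nasa.gov/allData/nscat` should print the same value if both mirrors hold the same files under the same relative paths. The values will not match the digests posted above, since the input to the final md5sum is different.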