climate-mirror / datasets

For tracking data mirroring progress
Large EOSDIS datasets #353

Open JeremiahCurtis opened 7 years ago

JeremiahCurtis commented 7 years ago

http://spacenews.com/trump-administration-planning-to-cut-noaa-weather-satellite-programs/

see issues #68 (LP DAAC) and #178 (LAADS DAAC)

In both issues, it was decided to put the mirroring on hold because of the size of the DAAC data directories (both are measured in PB, not TB). Given the coming budget cuts to NASA, can we reasonably expect these directories to remain open in the near future? We still have a ton of work to do on other open issues, and I don't know if we'll get these directories, complete and with no omissions, before it's too late. I don't want to raise unnecessary alarm, but simply to bring attention to these directories.

Should we attempt a FOIA request for these directories? While it was mentioned on one issue that NASA scientists might be reluctant to acquiesce to FOIA requests due to the FOIA scandal a few years ago, I would think NASA personnel would be apt to respond favorably to a request for the LP DAAC and LAADS DAAC datasets if the request made clear our intent to preserve and safeguard the data in their entirety, given the uncertainty of future data access due to budget cuts, and assuaged any concerns of ill intent. From what I understand, personnel at both NASA and the various EPA offices are finally starting to grasp the magnitude of the current administration's agenda; if we could explain that one should expect the worst given the almost certain collusion between Congress and the White House apropos of eliminating science from the public sphere (enervating the EPA, for instance), we might get some help here. If Tillerson is firing State Dept employees, and Trumpty Dumpty intends to empty the US government of many of its most competent personnel, there's no reason to believe that NASA would be any different.

http://www.npr.org/2017/03/06/518403153/trump-has-many-jobs-unfilled-is-he-deconstructing-the-administrative-state

A successful FOIA request would eliminate the speed and storage constraints. I can't imagine NASA has a pipe allowing data transfers of several GB per minute, and even at this speed, it would take us well into the summer to finish downloading. That's assuming that one or a few machines on the download end were running nonstop with no hiccups, and that the NASA servers would not be overloaded elsewhere in the meantime. On the other hand, one would think that copying the datasets locally from NASA's servers, as opposed to doing it over the internet, would result in a much faster data transfer.
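For a rough sense of scale, here is a back-of-the-envelope sketch in Python. The sizes and rates in it are purely illustrative assumptions (the linked issues only say the directories are measured in PB, and I don't know NASA's actual bandwidth). It also assumes a single uninterrupted transfer and decimal units, and `transfer_days` is just a name I made up for this estimate.

```python
def transfer_days(size_pb: float, rate_gb_per_min: float) -> float:
    """Days needed to move size_pb petabytes at a sustained rate in GB/min.

    Assumes decimal units (1 PB = 1,000,000 GB) and a transfer that
    never stalls, which is optimistic for a real download.
    """
    if rate_gb_per_min <= 0:
        raise ValueError("rate_gb_per_min must be positive")
    size_gb = size_pb * 1_000_000
    minutes = size_gb / rate_gb_per_min
    return minutes / (60 * 24)  # minutes per day


if __name__ == "__main__":
    # Hypothetical sizes and rates; swap in real figures if anyone has them.
    for size_pb in (1, 2):
        for rate in (1, 3, 10):
            days = transfer_days(size_pb, rate)
            print(f"{size_pb} PB at {rate} GB/min: about {days:.0f} days")
```

Under those assumed figures, a single petabyte at 3 GB per minute works out to roughly 231 days of nonstop transfer, so the estimate is very sensitive to both the real directory sizes and the sustained rate.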

If we were successful in a FOIA request, we'd almost certainly have to defray the cost of the physical drives containing the files themselves or the corresponding tarballs. I don't know if a group funding effort would work here, but it might be worth a try, say through GoFundMe or something of that nature. I know I would be willing to contribute a few hundred dollars toward such an effort; if we could get a decent-sized group to contribute whatever they could, I think we could make this work.

Feel free to close this issue if these concerns have already been addressed.