climate-mirror / datasets

For tracking data mirroring progress

Dataset at http://oceandata.sci.gsfc.nasa.gov/ #83

Open nickrsan opened 7 years ago

nickrsan commented 7 years ago

Name:
Organization: NASA
Description:
URL: http://oceandata.sci.gsfc.nasa.gov/
Download URL:
File Types:
Size:
Status:

bkirkbri commented 7 years ago

This dataset is large and very slow to download.

JeremiahCurtis commented 7 years ago

Anything for me to jump on here?

gabefair commented 7 years ago

https://oceandata.sci.gsfc.nasa.gov/search/file_search.cgi


The file search utility can be accessed non-interactively.

The script takes the following options:

usage : returns this message (valid values: 0, 1; default = 0)
sensor: mission name.  
         valid options include: 
         aquarius, seawifs, aqua, terra, meris, octs, czcs, hico, viirs
sdate : start date to limit the search (format YYYY-MM-DD)
edate : end date to limit the search (format YYYY-MM-DD)
psdate : file processing start date for a search (format YYYY-MM-DD)
pedate : file processing end date for a search (format YYYY-MM-DD)
dtype : data type (i.e. level). 
        valid options: 
        L0, L1, L2, L3b (for binned data), L3m (for mapped data), 
        MET (for ancillary data), misc (for sundry products)
addurl: include full url in search result (boolean, 1=yes, 0=no)
results_as_file : return results as a text file listing 
        boolean, 1=yes, 0=no (0 returns an HTML page)
search: a search pattern string
subID : generate a listing of files that match a non-extracted subscription ID
std_only : restrict results to standard products 
        (i.e. ignore extracts, regional processings, etc.; boolean)
cksum: return a checksum file for search results 
         expects boolean 
         returns sha1sum except for Aquarius soil moisture products, which are md5sum 
         forces results_as_file 
         ignores addurl

To use the script non-interactively, you need to craft a URL using the options listed above.

For example, to search for Aqua L1 data matching the pattern "A2009201":

https://oceandata.sci.gsfc.nasa.gov/search/file_search.cgi?search=A2009201*&dtype=L1&sensor=aqua&results_as_file=1&addurl=1&std_only=1
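The same URL can be assembled programmatically instead of by hand. The following is a minimal Python sketch, not part of the NASA documentation: `build_search_url` and `SEARCH_ENDPOINT` are hypothetical names chosen for illustration, and the option names are taken from the list above. It assumes the server accepts a literal `*` in the query string, as the example URL suggests.

```python
from urllib.parse import urlencode

# Endpoint quoted in this thread.
SEARCH_ENDPOINT = "https://oceandata.sci.gsfc.nasa.gov/search/file_search.cgi"


def build_search_url(**options):
    """Build a file_search.cgi URL from keyword options (hypothetical helper).

    Options set to None are skipped. The '*' wildcard is left unescaped
    so the result matches the hand-written example URL.
    """
    selected = {name: value for name, value in options.items() if value is not None}
    return f"{SEARCH_ENDPOINT}?{urlencode(selected, safe='*')}"


example_url = build_search_url(
    search="A2009201*",
    dtype="L1",
    sensor="aqua",
    results_as_file=1,
    addurl=1,
    std_only=1,
)
print(example_url)
```

Keyword arguments keep their order, so the printed URL lists the options in the order they were passed.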

Using the wget utility:

wget -q -e robots=off --wait 1 --post-data="search=A2009201*&dtype=L1&sensor=aqua&results_as_file=1&addurl=1&std_only=1" -O - https://oceandata.sci.gsfc.nasa.gov/search/file_search.cgi | wget -i -
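The wget pipeline above has two stages: a POST that returns a file listing, then a download of every URL in that listing. A rough Python equivalent of the first stage is sketched below using only the standard library. `fetch_listing` and `extract_urls` are hypothetical helper names, and the listing format (one URL per line when `results_as_file=1` and `addurl=1`) is an assumption that has not been verified against the live service.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Endpoint quoted in this thread.
SEARCH_ENDPOINT = "https://oceandata.sci.gsfc.nasa.gov/search/file_search.cgi"


def fetch_listing(options, timeout=60):
    """POST the search options and return the response body as text."""
    body = urlencode(options, safe="*").encode("ascii")
    request = Request(SEARCH_ENDPOINT, data=body)  # supplying data= makes this a POST
    with urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


def extract_urls(listing_text):
    """Return the lines that look like URLs.

    Assumption: the listing holds one URL per line; blank lines and
    any other text (for example a "no results" message) are ignored.
    """
    urls = []
    for line in listing_text.splitlines():
        candidate = line.strip()
        if candidate.startswith(("http://", "https://")):
            urls.append(candidate)
    return urls
```

A caller would pass the same options as the wget example to `fetch_listing`, run the result through `extract_urls`, and then download each URL with a pause between requests, mirroring `--wait 1`.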

At a minimum, a search pattern OR start date is required. If an end date is defined, a start date must be defined.
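These two rules can be checked locally before any request is sent. The sketch below is illustrative only: `check_search_options` is a hypothetical helper, and the date check simply follows the YYYY-MM-DD format given in the option list.

```python
from datetime import datetime


def check_search_options(search=None, sdate=None, edate=None):
    """Raise ValueError if the options break the rules stated above."""
    if not search and not sdate:
        raise ValueError("a search pattern or a start date (sdate) is required")
    if edate and not sdate:
        raise ValueError("an end date (edate) requires a start date (sdate)")
    for name, value in (("sdate", sdate), ("edate", edate)):
        if value:
            try:
                # Reject values that do not parse as year-month-day.
                datetime.strptime(value, "%Y-%m-%d")
            except ValueError:
                raise ValueError(f"{name} must use the format YYYY-MM-DD") from None
    return True
```

For instance, `check_search_options(search="A2009201*")` and `check_search_options(sdate="1979-01-01")` both pass, while supplying only `edate` raises a `ValueError`.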

gabefair commented 7 years ago

I tried the following, but I'm not getting real files back:

wget -N -m -e robots=off --wait 1 --post-data="sdate=1979-01-01" https://oceandata.sci.gsfc.nasa.gov/search/file_search.cgi

gabefair commented 7 years ago

A friend has confirmed that this data is already mirrored on an Intelink site. I believe this data is very valuable to the US Navy and has a low probability of being deleted or edited. Depending on events, this data might become non-public. So for now I will mark this ticket as low priority.