Closed jhamman closed 5 years ago
Title: Landsat Analysis Ready Data (ARD) Dataset owner: USGS Subset: All (currently covers just US) Size: 10 TB? Data format: Cloud-Optimized GeoTIFF URL: https://www.usgs.gov/land-resources/nli/landsat/us-landsat-analysis-ready-data
Title: GLOBAL OCEAN GRIDDED L4 SEA SURFACE HEIGHTS AND DERIVED VARIABLES REPROCESSED Dataset owner: Copernicus Marine Environment Monitoring Service Size: ~500 GB Data format: netCDF4 (we also have a zarr copy in pangeo) URL: http://marine.copernicus.eu/services-portfolio/access-to-products/?option=com_csw&view=details&product_id=SEALEVEL_GLO_PHY_L4_REP_OBSERVATIONS_008_047
Title: GHRSST Level 4 G1SST Global Foundation Sea Surface Temperature Analysis Dataset owner: NASA Subset: All Size: ? Data format: netCDF URL: https://podaac.jpl.nasa.gov/dataset/JPL_OUROCEAN-L4UHfnd-GLOB-G1SST
Title: Optimum Interpolation Sea Surface Temperature (OISST) Dataset owner: NOAA NCEI Subset: All, both AVHRR-Only and AVHRR+AMSR Size: 100 GB? Data format: netCDF URL: https://www.ncdc.noaa.gov/oisst
(Note: there are an overwhelming number of SST products available. I am a professor of physical oceanography, and I have no idea which one is the "best"; there are tradeoffs involved. The most valuable ones to have in the cloud are the BIG datasets, like the ones listed here.)
Title: NOAA Climate Data Record (CDR) of Cloud Properties from AVHRR Pathfinder Atmospheres - Extended (PATMOS-x), Version 5.3 Owner: NOAA NCEI Subset: All Size: ? Data format: netCDF URL: https://data.nodc.noaa.gov/cgi-bin/iso?id=gov.noaa.ncdc:C00840
Title: NOAA Blended Sea Winds Owner: NOAA NCEI Subset: 6-hourly & daily Size: ? Data format: netCDF URL: https://www.ncdc.noaa.gov/data-access/marineocean-data/blended-global/blended-sea-winds
More soon.
Is the scope here just satellite data, or can it be any "earth observations"? ARGO data?
@jhamman and @scottyhq, is the Data format: field supposed to be the existing format, or the desired format on the Cloud?
@rabernat - EO generally so ARGO would be fine. @rsignell-usgs - native format. The recommendation of cloud optimized formats can happen in a separate thread.
I defer to @jhamman who is overseeing this database, but I think there would be value in a few additional fields:
Type: (satellite, model, other)
Current Format: (hdf5, tif, etc.)
Desired Format: (zarr, cloud-optimized geotiff, etc.)
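One way to picture the extended template is as a small record type. The sketch below is only an illustration: the class name `DatasetEntry` and its attribute names (including `kind` for the proposed Type field) are hypothetical, chosen to mirror the fields used in this thread plus the three proposed above. The example values are copied from the MODIS-Aqua entry elsewhere in this thread.

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class DatasetEntry:
    """One dataset suggestion (hypothetical field names, not an agreed schema)."""

    title: str
    owner: str
    url: str
    subset: str = "all"
    size: Optional[str] = None  # free text, e.g. "10TB" or "?"
    kind: Optional[str] = None  # proposed Type: "satellite", "model", or "other"
    current_format: Optional[str] = None  # proposed Current Format
    desired_format: Optional[str] = None  # proposed Desired Format


# Values taken from the MODIS-Aqua Ocean Color entry in this thread.
entry = DatasetEntry(
    title="MODIS-Aqua Ocean Color (Level 2)",
    owner="NASA",
    url="https://oceandata.sci.gsfc.nasa.gov/MODIS-Aqua/L2/**/*OC.nc",
    size="10TB",
    kind="satellite",
    current_format="NetCDF",
    desired_format="Zarr or GeoTIFF",
)

# Print the entry one field per line, similar to the template layout.
for name, value in asdict(entry).items():
    print(f"{name}: {value}")
```

Keeping the current and desired formats as separate fields means the native format can be recorded now and the cloud-optimized recommendation filled in later.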
Jon & I have been talking about this. Right now, the most helpful thing is to recommend data that resides at a NASA DAAC. I've asked them for user stats (both FTP and OPeNDAP), which give us some guidance to start with. But this forum is useful to help us rank them.
@rabernat Winds: this is a better product: http://www.remss.com/measurements/ccmp/ but access is currently only through FTP and the documentation is poor. It uses the 4D-Var method, which produces much better winds than the Gaussian interpolation used by the NOAA product. SST: yes, I totally agree, focus on BIG data. MUR SST is a great one from NASA. Also VIIRS for even higher resolution. We have both of these on the list already.
-chelle
Title: SMAP Enhanced L3 Radiometer Global Daily 9 km EASE-Grid Soil Moisture, Version 2 Dataset owner: NASA Subset: All Size: < 1 TB Data format: HDF5 URL: https://nsidc.org/data/SPL3SMP_E/versions/2
Title: MODIS-Aqua Ocean Color (Level 2) Dataset owner: NASA Subset: all Size: 10TB Current Data format: NetCDF Desired Cloud Data format: Zarr or GeoTIFF URL: https://oceandata.sci.gsfc.nasa.gov/MODIS-Aqua/L2/**/*OC.nc
Note: GoLIVE is hosted by NSIDC but not officially part of the CMR/DAAC... I believe. A follow-on expansion called ITSLIVE should more properly be under the CMR umbrella; in progress.
Title: GoLIVE Land-ice velocity derived from LANDSAT-8 Dataset owner: NASA? Subset: all Size: unknown Current Data format: NetCDF Desired Cloud Data format: Zarr URL: https://nsidc.org/data/golive
Title: Shuttle Radar Topography Mission (SRTM) v4 Dataset owner: Consortium for Spatial Information (CGIAR-CSI) / NASA Subset: all (90m, 250m resampled) Size: 0.2 TB Data format: GeoTiff, ESRI ASCII URL: http://srtm.csi.cgiar.org/
Title: ArcticDEM and REMA 2-m DEM strips Dataset owner: Polar Geospatial Center (UMN) Subset: all (260714 ArcticDEM strips, 187585 REMA strips) Size: ~250 TB Data format: Float32 GeoTiff URL: https://www.pgc.umn.edu/data/arcticdem/ https://www.pgc.umn.edu/data/rema/
Title: HiMAT 8-m along-track and cross-track DEM strips Dataset owner: NASA Subset: all ~5K strips (v2 forthcoming with additional ~1.7K strips) Size: ~1-3 TB Data format: Float32 GeoTiff URL: https://nsidc.org/data/HMA_DEM8m_AT/versions/1 https://nsidc.org/data/HMA_DEM8m_CT/versions/1
Title: GDP hourly drifter positions Dataset owner: Global Drifter Program Subset: all Size: approx. 10 GB Current Data format: NetCDF, ASCII, mat Desired Cloud Data format: Zarr URL: https://www.aoml.noaa.gov/phod/gdp/hourly_data.php
Title: NSIDC Sea-Ice Concentration Dataset owner: NSIDC Subset: all Size: approx. 100 GB Current Data format: NetCDF Desired Cloud Data format: Zarr URL: https://nsidc.org/data/G02202/versions/3
Title: GDP Drifter Climatology Dataset owner: Global Drifter Program Subset: all Size: approx. 5 GB Current Data format: NetCDF Desired Cloud Data format: Zarr URL: https://www.aoml.noaa.gov/phod/gdp/mean_velocity.php
Title: TMI SST Dataset owner: Remote Sensing Systems / PO.DAAC Subset: all Size: approx. 50 GB Current Data format: Custom Binary (?) Desired Cloud Data format: Zarr URL: http://www.remss.com/missions/tmi/ and https://podaac.jpl.nasa.gov/
Title: SRTM15+ and SRTM30+ global bathymetry Dataset owner: UCSD? Subset: all Size: approx. 10 GB + 2 GB Current Data format: NetCDF Desired Cloud Data format: Zarr URL: https://topex.ucsd.edu/WWW_html/srtm15_plus.html and https://topex.ucsd.edu/WWW_html/srtm30_plus.html
Title: TROPFLUX Dataset owner: Indian National Centre for Ocean Information Services Subset: all Size: approx. 50 GB Current Data format: NetCDF Desired Cloud Data format: Zarr URL: https://incois.gov.in/tropflux/
Title: Argo Mixed Layers Dataset owner: UCSD Subset: all Size: approx. 1 GB Current Data format: NetCDF Desired Cloud Data format: Zarr URL: http://mixedlayer.ucsd.edu/
Title: WORLD OCEAN ATLAS 2013 version 2 Dataset owner: NOAA? Subset: all (1.00 deg and 5.00 deg) Size: approx. 200 GB Current Data format: NetCDF Desired Cloud Data format: Zarr URL: https://www.nodc.noaa.gov/OC5/woa13/
Title: HadISST Sea-Surface Temperature and Ice Coverage Dataset owner: UK Met Office Subset: all Size: approx. 4 GB Current Data format: NetCDF Desired Cloud Data format: Zarr URL: https://www.metoffice.gov.uk/hadobs/hadisst/data/download.html
Title: ETOPO1 Global Relief Model Dataset owner: NOAA? Subset: all Size: approx. 4 GB Current Data format: NetCDF Desired Cloud Data format: Zarr URL: https://www.ngdc.noaa.gov/mgg/global/
About World Ocean Atlas, a 2018 version is now available: https://www.nodc.noaa.gov/OC5/woa18/
In my experience, low-resolution datasets like WOA, HadISST, etc., already work fine over OPeNDAP. It's only when you get into the > 10 GB range that OPeNDAP starts to struggle and cloud storage becomes advantageous.
I second @scottyhq that Landsat ARD would be really valuable to have on the cloud. Just wanted to mention this citation, which indicates that the total size of the record is much larger than 10 TB:
"Each day of Sentinel-2 data collection will result in 1.6 TB of imagery, for each satellite, in comparison to 750 GB per day for Landsat-8, 260 GB for Landsat-7, and, for historical reference, 40 GB for Landsat-5 (Wulder et al., 2008)." - Wulder et al. 2015
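As a rough back-of-the-envelope check, the daily rates quoted above can be converted into yearly volumes. This sketch assumes continuous acquisition 365 days a year and decimal units (1 TB = 1000 GB). Both are simplifications, and the rates are global acquisition figures, so they indicate scale rather than the exact size of the US-only ARD product.

```python
# Daily data volumes in GB, as quoted from Wulder et al. above.
daily_gb = {
    "Sentinel-2 (per satellite)": 1600,
    "Landsat-8": 750,
    "Landsat-7": 260,
    "Landsat-5": 40,
}

DAYS_PER_YEAR = 365  # assumption: continuous acquisition all year
GB_PER_TB = 1000  # assumption: decimal units

# Convert each daily rate to an approximate yearly volume in TB.
yearly_tb = {
    name: gb * DAYS_PER_YEAR / GB_PER_TB for name, gb in daily_gb.items()
}

for name, tb in yearly_tb.items():
    print(f"{name}: ~{tb:.0f} TB/year")
```

Under these assumptions, a single year of Landsat-8 acquisitions alone comes to roughly 274 TB, and even Landsat-5 exceeds 10 TB per year.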
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
This issue has been automatically closed because it had not seen recent activity. The issue can always be reopened at a later date.
I've recently been asked to put together a list of Earth observation datasets that would be broadly useful to the Pangeo community and that are NOT currently available on a public cloud system like S3, GCS, or Blob. I'd like to encourage anyone who has science applications that use Earth observation data to take a minute and register their thoughts here. I plan to share a collated version of this list with a number of data providers in the coming weeks.
A template for providing feedback here: