ioos / comt_catalog

THREDDS catalogs for comt_catalog.sura.org:/var/www/thredds_instance/content/thredds
http://comt.sura.org/thredds

Creating discoverable, standardized COMT1 obs data #41

Open rsignell-usgs opened 9 years ago

rsignell-usgs commented 9 years ago

Julia Signell @jsignell, with some funding from @rluettich, has some hours to do something useful with the COMT obs data that moves us forward. So what should we do?

After meeting with Rick and Julia, it seems it would be useful to start with the best collection of data: the inundation tropical data.

I've spent most of today studying where we are with obs data, with help from @brianmckenna.

Looking on the old testbed server (testbedapps.sura.org), I see:

/data/ftp/upload/Inundation/observations/tropical/
http://testbedapps.sura.org/thredds/catalog/alldata/Inundation/observations/tropical/catalog.html

with a readme.txt from Corbitt Kerr:

http://testbedapps.sura.org/thredds/fileServer/alldata/Inundation/observations/tropical/readme.txt

that explains how the data were cleaned and converted into *.IMEDS ASCII files.

So it appears the good, cleaned-up observational IMEDS files are in:

/data/ftp/upload/Inundation/observations/tropical/2008-Ike/Processed
/data/ftp/upload/Inundation/observations/tropical/2005-Rita/Processed

and from Corbitt's readme, it seems that the IMEDS data we want to preserve for the future are:

cd /data/ftp/upload/Inundation/observations/tropical/
find . -name '*.F.C.IMEDS' -print

./2008-Ike/Processed/watlev_TCOON.F.C.IMEDS
./2008-Ike/Processed/watlev_CRMS.F.C.IMEDS
./2008-Ike/Processed/watlev_USGS-PERM.F.C.IMEDS
./2008-Ike/Processed/watlev_NOAA.F.C.IMEDS
./2008-Ike/Processed/watlev_USACE-CHL.F.C.IMEDS
./2008-Ike/Processed/watlev_USGS-DEPL.F.C.IMEDS
./2008-Ike/Processed/watlev_USACE.F.C.IMEDS
./2008-Ike/Processed/watlev_CSI.F.C.IMEDS
./2005-Rita/Processed/watlev_CRMS.F.C.IMEDS
./2005-Rita/Processed/watlev_USGS-PERM.F.C.IMEDS
./2005-Rita/Processed/watlev_NOAA.F.C.IMEDS
./2005-Rita/Processed/watlev_USGS-DEPL.F.C.IMEDS
./2005-Rita/Processed/watlev_CSI.F.C.IMEDS

[root@testbedapps tropical]# find . -name 'tm_*.F.IMEDS' -print
./2008-Ike/Processed/tm_NDBC.F.IMEDS
./2008-Ike/Processed/tm_USACE-CHL.F.IMEDS
./2008-Ike/Processed/tm_CSI.F.IMEDS
./2005-Rita/Processed/tm_NDBC.F.IMEDS
./2005-Rita/Processed/tm_CSI.F.IMEDS

[root@testbedapps tropical]# find . -name 'tp_*.F.IMEDS' -print
./2008-Ike/Processed/tp_USACE-CHL.F.IMEDS
./2008-Ike/Processed/tp_CSI.F.IMEDS
./2008-Ike/Processed/tp_UNDKennedy.F.IMEDS
./2008-Ike/Processed/tp_NDBC.F.IMEDS
./2005-Rita/Processed/tp_CSI.F.IMEDS
./2005-Rita/Processed/tp_NDBC.F.IMEDS

[root@testbedapps tropical]# find . -name 'dir_*.F.IMEDS' -print
./2008-Ike/Processed/dir_CSI.F.IMEDS
./2008-Ike/Processed/dir_NDBC.F.IMEDS
./2005-Rita/Processed/dir_CSI.F.IMEDS
./2005-Rita/Processed/dir_NDBC.F.IMEDS

[root@testbedapps tropical]# find . -name 'hs_*.F.IMEDS' -print
./2008-Ike/Processed/hs_NDBC.F.IMEDS
./2008-Ike/Processed/hs_UNDKennedy.F.IMEDS
./2008-Ike/Processed/hs_USACE-CHL.F.IMEDS
./2008-Ike/Processed/hs_CSI.F.IMEDS
./2005-Rita/Processed/hs_NDBC.F.IMEDS
./2005-Rita/Processed/hs_CSI.F.IMEDS
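The listings above follow a regular naming pattern (storm directory, then a `variable_source` file stem), so the metadata can be recovered from the path alone. Below is a minimal sketch of that idea. The function name `describe_imeds_path` is hypothetical, and the long names in `VARIABLES` are my assumptions about what the prefixes mean; they are not taken from Corbitt's readme and should be checked against it.

```python
from pathlib import PurePosixPath

# Prefixes as they appear in the find output above. The long names are
# ASSUMPTIONS for illustration and should be verified against the readme.
VARIABLES = {
    "watlev": "water level",
    "hs": "significant wave height",
    "tp": "peak wave period",
    "tm": "mean wave period",
    "dir": "wave direction",
}


def describe_imeds_path(path):
    """Split a path like ./2008-Ike/Processed/hs_NDBC.F.IMEDS into its parts."""
    p = PurePosixPath(path)
    # Layout is <year>-<storm>/Processed/<file>, so the storm directory
    # is two levels above the file name.
    year, storm = p.parts[-3].split("-", 1)
    # Drop the .F.IMEDS or .F.C.IMEDS suffix, keeping e.g. "watlev_USGS-PERM".
    stem = p.name.split(".", 1)[0]
    # Split on the first underscore only; sources such as USGS-PERM use hyphens.
    variable, source = stem.split("_", 1)
    return {
        "year": int(year),
        "storm": storm,
        "variable": variable,
        "source": source,
        "long_name": VARIABLES.get(variable, "unknown"),
    }


if __name__ == "__main__":
    print(describe_imeds_path("./2008-Ike/Processed/watlev_USGS-PERM.F.C.IMEDS"))
```

The same function works on absolute paths, since it only looks at the last three path components.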

Each of these IMEDS files contains multiple stations, but to facilitate discovery we would like to split them into individual NetCDF files, with one station per file.

There is Python code to convert the IMEDS files to NetCDF here: /data/ftp/upload/acrosby/imeds_crawler.py
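The splitting step could look roughly like the sketch below. It assumes a reader has already turned one IMEDS file into `(station, time, value)` tuples; it does not parse IMEDS itself. The names `split_by_station` and `output_name`, and the output file naming scheme, are hypothetical choices for illustration only.

```python
from collections import defaultdict


def split_by_station(records):
    """Group (station, time, value) tuples into one time series per station.

    Returns a dict mapping station name to a time-sorted list of
    (time, value) pairs.
    """
    series = defaultdict(list)
    for station, time, value in records:
        series[station].append((time, value))
    for rows in series.values():
        rows.sort()  # sort each station's samples by time
    return dict(series)


def output_name(variable, source, station):
    """Build a per-station file name (HYPOTHETICAL naming scheme)."""
    # Replace characters that are awkward in file names with underscores.
    safe = "".join(c if c.isalnum() or c in "-_" else "_" for c in station)
    return f"{variable}_{source}_{safe}.nc"


if __name__ == "__main__":
    # Made-up sample values, only to show the shape of the data.
    records = [
        ("Station A", "2008-09-13T00:00:00", 1.2),
        ("Station B", "2008-09-13T00:00:00", 0.8),
        ("Station A", "2008-09-12T23:00:00", 1.0),
    ]
    for station, rows in split_by_station(records).items():
        print(output_name("watlev", "NOAA", station), len(rows))
```

Each per-station series would then be handed to whatever writer produces the NetCDF file.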

but we should use this code only for reading the IMEDS data files, and use the code at: https://github.com/axiom-data-science/pyaxiom to write updated, CF-1.6 compliant NetCDF output, with one station, one sensor (aka "device") per file.

To write using pyaxiom, you can either get the data into a Pandas DataFrame and have pyaxiom write a NetCDF file in the proper format, or build the time series manually: https://github.com/axiom-data-science/pyaxiom/blob/master/pyaxiom/tests/test_timeseries.py#L27-L49

@jsignell, is this enough of a directed data task that you can begin?

Others, yell if you have issues with this plan.

Note that we are not going to remove or modify any existing files. These will be brand-new NetCDF files.

kknee commented 9 years ago

@rsignell-usgs dealing with the obs data was next on our list after the UI updates for the annual meeting next week. We would also like to move away from the organization used in the IMEDS files, to facilitate the filtering needed for our new UI. Your plan sounds reasonable. It might be good to discuss it with everyone in the same room next week.

rsignell-usgs commented 9 years ago

It looks like these files are also on the comt2 server, for example:

/data/comt_1_archive/inundation_tropical/observations/tropical/2008-Ike/Processed/tm_NDBC.F.IMEDS
/data/comt_1_archive/inundation_tropical/observations/tropical/2005-Rita/Processed/tm_NDBC.F.IMEDS

rsignell-usgs commented 9 years ago

I uploaded the imeds_crawler.py code here: https://github.com/ioos/comt_catalog/blob/master/code/imeds_crawler.py