ioos / registry

Getting data services registered in the IOOS Service Registry
http://ioos.github.io/registry/

Harvest West Florida FVCOM aggregation catalog instead of granules #17

Closed rsignell-usgs closed 10 years ago

rsignell-usgs commented 10 years ago

While testing the system-test inundation IPython notebook for the West Florida Shelf, I discovered that the West Florida Shelf FVCOM data were not appearing. Inspecting the NGDC Collection Source List for THREDDS-SECOORA: https://www.ngdc.noaa.gov/docucomp/collectionSource/list?recordSetId=2604653&componentId=&serviceType=THREDDS&serviceStatus=&serviceUrl=&search=List+Collection+Sources I see that the THREDDS catalog being harvested for this data is http://crow.marine.usf.edu:8080/thredds/catalog/WFS_FVCOM_NF_model/catalog.xml and if you visit the HTML equivalent http://crow.marine.usf.edu:8080/thredds/catalog/WFS_FVCOM_NF_model/catalog.html you can see that this is the granule-level catalog, which lists the individual files rather than the aggregation.

We want to harvest the aggregated WFS FVCOM dataset instead: http://crow.marine.usf.edu:8080/thredds/catalog.html?dataset=FVCOM-Nowcast-Agg but unfortunately it is defined directly in the top-level THREDDS catalog http://crow.marine.usf.edu:8080/thredds/catalog.html so there is no separate catalog to point the harvester at.
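To make the distinction concrete, here is a minimal Python sketch that separates datasets defined directly in a catalog from sub-catalogs that have their own URL. The embedded XML is a hypothetical stand-in shaped like a THREDDS client catalog, not a copy of the USF catalog; only the dataset ID and the granule catalog path are taken from the URLs above.

```python
import xml.etree.ElementTree as ET

# Namespaces used by THREDDS InvCatalog 1.0 client catalogs.
THREDDS = "http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0"
XLINK = "http://www.w3.org/1999/xlink"

# Hypothetical top-level catalog: one aggregation defined inline,
# one granule-level catalog referenced through a catalogRef.
SAMPLE_CATALOG = """<catalog xmlns="http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0"
         xmlns:xlink="http://www.w3.org/1999/xlink" name="Example top-level catalog">
  <dataset name="FVCOM Nowcast Aggregation" ID="FVCOM-Nowcast-Agg" urlPath="FVCOM-Nowcast-Agg.nc"/>
  <catalogRef xlink:title="WFS_FVCOM_NF_model" xlink:href="WFS_FVCOM_NF_model/catalog.xml" name=""/>
</catalog>
"""


def summarize_catalog(xml_text):
    """Return the inline dataset IDs and the catalogRef hrefs of a catalog."""
    root = ET.fromstring(xml_text)
    # Direct <dataset> children live only inside this catalog document.
    datasets = [d.get("ID") for d in root.findall(f"{{{THREDDS}}}dataset")]
    # Each <catalogRef> points to a separate catalog with its own URL.
    refs = [
        r.get(f"{{{XLINK}}}href")
        for r in root.findall(f"{{{THREDDS}}}catalogRef")
    ]
    return {"datasets": datasets, "catalog_refs": refs}


if __name__ == "__main__":
    print(summarize_catalog(SAMPLE_CATALOG))
```

In this sketch only the entries under `catalog_refs` have a catalog URL of their own; the inline aggregation can only be reached through the top-level catalog.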

@jcothran, can you please rework your top level catalog so that it points to other catalogs so that catalogs of aggregated data could be individually harvested?
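A reworked top-level catalog could look roughly like the output of the sketch below, where every entry is a `catalogRef` to a smaller catalog. The catalog name, titles, and `aggregations/...` paths are hypothetical and would need to match the real server layout.

```python
import xml.etree.ElementTree as ET

THREDDS = "http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0"
XLINK = "http://www.w3.org/1999/xlink"


def build_top_level_catalog(name, sub_catalogs):
    """Build a catalog whose only children are catalogRef elements.

    sub_catalogs is a list of (title, relative_href) pairs.
    """
    # Register prefixes so the serialized XML uses the conventional ones.
    ET.register_namespace("", THREDDS)
    ET.register_namespace("xlink", XLINK)
    root = ET.Element(f"{{{THREDDS}}}catalog", {"name": name})
    for title, href in sub_catalogs:
        ET.SubElement(
            root,
            f"{{{THREDDS}}}catalogRef",
            {
                f"{{{XLINK}}}title": title,
                f"{{{XLINK}}}href": href,
                "name": "",
            },
        )
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical layout: one sub-catalog per aggregation, one for granules.
    print(
        build_top_level_catalog(
            "Example top-level catalog",
            [
                ("FVCOM Nowcast Aggregation", "aggregations/fvcom_nowcast/catalog.xml"),
                ("FVCOM granules", "WFS_FVCOM_NF_model/catalog.xml"),
            ],
        )
    )
```

With a layout like this, the catalog holding only the aggregation has its own URL that can be registered as a collection source.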

Another way to solve this problem would be to produce your own WAF of ISO metadata (using ncISO) and then have NGDC harvest that. That takes some of the computational load off NGDC and gives you more control over what gets harvested, but it obviously also takes more work up front, and you then have to monitor the WAF to make sure it is correct and up to date.
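For the monitoring part, a minimal Python sketch might look like the following. It assumes the WAF is a local directory of `.xml` files and that "up to date" can be approximated by file modification time; the function name, the 7-day threshold, and the report layout are all hypothetical, and the check only tests that each file is well-formed XML, not that it is valid ISO metadata.

```python
import time
import xml.etree.ElementTree as ET
from pathlib import Path


def check_waf(waf_dir, max_age_days=7.0, now=None):
    """Sort the .xml files in waf_dir into ok, stale, and malformed."""
    now = time.time() if now is None else now
    report = {"ok": [], "stale": [], "malformed": []}
    for path in sorted(Path(waf_dir).glob("*.xml")):
        try:
            # Well-formedness check only; no ISO schema validation here.
            ET.parse(str(path))
        except ET.ParseError:
            report["malformed"].append(path.name)
            continue
        # Age of the record in days, based on the file's modification time.
        age_days = (now - path.stat().st_mtime) / 86400.0
        key = "stale" if age_days > max_age_days else "ok"
        report[key].append(path.name)
    return report
```

Running something like this on a schedule and alerting when `stale` or `malformed` is non-empty would catch a WAF that has stopped being regenerated.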

jcothran commented 10 years ago

Thanks Rich for the catch and suggestions, will work on that this week. Jeremy


jcothran commented 10 years ago

Hey Rich,

Just an update on the catalog: I've changed the earlier SECOORA THREDDS service endpoints to ISOWAF types, which should be easier to manage from my end and should address your earlier issue with the catalog endpoints not pointing correctly to the aggregation (USF FVCOM, for example).

https://ngdc.noaa.gov/docucomp/collectionSource/list?recordSetId=2604653&componentId=&serviceType=ISOWAF&serviceStatus=APPROVED&serviceUrl=&search=List+Collection+Sources

These were generated using ncISO against the THREDDS catalog endpoints (thanks also for your earlier YouTube video on this).

Under the models you should find the usual water temperature, salinity, ocean height and currents for NCSU_MEAS, USF and UF. USF has also listed their SWAN and FVCOM products and gridded hfradar (both WERA and CODAR) for the Tampa area.

Under the in-situ listings you should find the gridded hfradar currents for Longbay, Savannah and Miami, and ncSOS-based services for non-federal platforms.

Any questions or comments, let me know. Thanks Jeremy

rsignell-usgs commented 10 years ago

@jcothran, just as a test, I tried harvesting the WAF with the FVCOM data on my geoportal, and it's looking good! Minor suggestion: the title and summary global attributes could be updated to be slightly more descriptive. (Screenshot: 5-20-2014 12-40-34 pm)

rsignell-usgs commented 10 years ago

Fixed. Proof here from the NGDC geoportal. (Screenshot: 5-22-2014 11-24-14 am)

rsignell-usgs commented 10 years ago

@jcothran , looks like WFS cruft is still in the registry: http://www.ngdc.noaa.gov/docucomp/page?xml=NOAA/IOOS/SECOORA/iso/reports/IsoValidationReport.xml&view=isoValidationErrorsReport&custom=default&title=NOAA/IOOS/SECOORA%20Invalid%20Records Do you agree?

jcothran commented 10 years ago

@rsignell-usgs I agree that those are invalid records. What confuses me is that I think they belong to the 'REMOVED' collection sources, not to the 'APPROVED' SECOORA collection sources listed below. Could it be that the reporting functions are not tracking service status changes?

https://www.ngdc.noaa.gov/docucomp/collectionSource/list?max=10&serviceStatus=APPROVED&recordSetId=2604653&search=List+Collection+Sources&layout=fluid&serviceType=&offset=0&serviceUrl=&componentId=

amilan17 commented 10 years ago

I can perform a clean up and see if that makes a difference. Please check back again tomorrow.

Anna

Anna.Milan@noaa.gov, 303-497-5099, NOAA/NESDIS/NGDC, http://www.ngdc.noaa.gov/metadata/emma
