ioos / ioosngdac

IOOS National Glider Data Assembly Center (V2)
https://ioos.github.io/ioosngdac/

Three new deployments not showing up on ERDDAP/THREDDS #112

Closed kwilcox closed 7 years ago

kwilcox commented 7 years ago

I started uploading data last Friday and the deployments haven't appeared in ERDDAP or THREDDS as working datasets yet. Documentation says it should take 1-2 hours. Seems like there is an issue?

Please note that modena and ramses are still in the water and uploading new data as it comes in. bass has a deployment issue and is no longer in the water (but I don't want it archived yet so it is not marked as complete).

kerfoot commented 7 years ago

@lukecampbell not sure what the issue is. There are THREDDS .nc files, but they are completely empty and there's nothing on ERDDAP. Any ideas?

lukecampbell commented 7 years ago

I only checked modena.

There's no data in the files. They're not GliderDAC compliant.

kwilcox commented 7 years ago

This was a red herring, I shouldn't have uploaded anything before 201609. I just removed the offending files from the FTP.

The files there now score much better: https://data.ioos.us/compliance/report/f8d7b35e7833104477d8f0b4a233653fd45e98d8

kwilcox commented 7 years ago

ramses score: https://data.ioos.us/compliance/report/2d472099e94d9b50f9c9c58f0915d0152239a02f

lukecampbell commented 7 years ago

I'm gonna restart ERDDAP and manually kick off the processing job and see if I can see anything.

lukecampbell commented 7 years ago

Ok, so this is kind of a guess here, but I think that it's not working because cdm_data_type is defined in the file which contradicts what the datasets.xml file says.

The other deployments on GliderDAC don't define this attribute.

This is the error in the ERDDAP logs:

  accessibleViaNcCF=.ncCF/.ncCFMA isn't available for this dataset because cdm_data_type=Other is not a compatible type.                                                                   

I don't know where the Other comes from; neither the files nor the config specify Other.

    <addAttributes>
        <att name="cdm_data_type">trajectoryProfile</att>
        <att name="featureType">trajectoryProfile</att>
        <att name="cdm_trajectory_variables">trajectory,wmo_id</att>
        <att name="cdm_profile_variables">time_uv,lat_uv,lon_uv,u,v,profile_id,time,latitude,longitude</att>
        <att name="subsetVariables">trajectory,wmo_id,time_uv,lat_uv,lon_uv,u,v,profile_id,time,latitude,longitude</att>

lukecampbell commented 7 years ago

Ok, bass and modena still have offending files. As for ramses, it looks good but seeing the error about cdm_data_type makes me think that the attribute is the reason it's not processing.
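The suspected mismatch between a file's own global attributes and the addAttributes block in datasets.xml can be illustrated with a small sketch. It uses only the Python standard library. The XML fragment is a shortened copy of the one quoted above, closed here so that it parses, and the `file_attrs` dictionary holds hypothetical values standing in for whatever a submitted file contains. It only shows how to spot a conflicting name; it does not model how ERDDAP itself resolves attributes.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the addAttributes fragment quoted in this thread,
# with a closing tag added so it is well-formed.
ADD_ATTRIBUTES_XML = """
<addAttributes>
    <att name="cdm_data_type">trajectoryProfile</att>
    <att name="featureType">trajectoryProfile</att>
</addAttributes>
"""


def parse_add_attributes(xml_text):
    """Return {name: value} for every <att> element in the fragment."""
    root = ET.fromstring(xml_text)
    return {att.get("name"): (att.text or "").strip() for att in root.iter("att")}


def find_conflicts(file_attrs, server_attrs):
    """Return {name: (file_value, server_value)} for names defined differently on both sides."""
    return {
        name: (file_attrs[name], server_attrs[name])
        for name in file_attrs
        if name in server_attrs and file_attrs[name] != server_attrs[name]
    }


# Hypothetical global attributes from a submitted file (assumed values).
file_attrs = {"cdm_data_type": "Trajectory", "title": "example deployment"}
server_attrs = parse_add_attributes(ADD_ATTRIBUTES_XML)
conflicts = find_conflicts(file_attrs, server_attrs)
print(conflicts)
```

Any name that shows up in `conflicts` is a candidate for removal from the submitted files.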

kwilcox commented 7 years ago

I think this came up recently in another issue somewhere...

From http://www.unidata.ucar.edu/software/thredds/current/netcdf-java/metadata/DataDiscoveryAttConvention.html

cdm_data_type

The "cdm_data_type" attribute gives the THREDDS data type appropriate for this dataset. E.g., "Grid", "Image", "Station", "Trajectory", "Radial". Its use is recommended.

trajectoryProfile isn't a valid value for that attribute... does ERDDAP require it? I believe I'm doing the correct thing in the raw files.

kwilcox commented 7 years ago

> Ok, bass and modena still have offending files.

Meaning there are no compliance-checker-like tests run by the DAC to filter out bad real-time NetCDF files and I should implement something like this on my end?

kerfoot commented 7 years ago

@kwilcox the cdm_data_type global attribute is added in the datasets.xml file when the deployment is registered. It helps ERDDAP take the entire series of profiles and aggregate them into a single trajectoryProfile data set. Do the files you're submitting contain this attribute? If so, remove it. If not, we need to keep looking. Can you email me a couple of the files you're writing?

kwilcox commented 7 years ago

Yes, they contain :cdm_data_type = "Trajectory" per ACDD standards.

kerfoot commented 7 years ago

These aren't Trajectories, they are trajectoryProfiles, but that doesn't matter anyway since we add cdm_data_type, and other global attributes, on the ERDDAP side. The goal of the file format we defined was to get files that weren't necessarily fully compliant and aggregate them to create fully-compliant aggregations. The files that are submitted by the data providers never see the light of day. All necessary global attributes to make the files compliant are added on the ERDDAP side. I would remove the global:cdm_data_type attribute from your files and resubmit.
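For anyone who needs to strip the attribute from files that are already written, a minimal sketch follows. It assumes the third-party netCDF4 Python package is available and uses its `Dataset.ncattrs()` and `Dataset.delncattr()` methods. The function names are hypothetical, and this is not necessarily how any provider in this thread produced their files.

```python
def strip_global_attribute(dataset, name="cdm_data_type"):
    """Delete a global attribute if present; return True when one was removed.

    `dataset` is anything exposing ncattrs() and delncattr(), such as an
    open netCDF4.Dataset.
    """
    if name in dataset.ncattrs():
        dataset.delncattr(name)
        return True
    return False


def strip_from_file(path, name="cdm_data_type"):
    """Open a NetCDF file in append mode and strip the attribute in place."""
    # Imported lazily so the helper above has no hard dependency on netCDF4.
    from netCDF4 import Dataset

    with Dataset(path, "a") as ds:
        return strip_global_attribute(ds, name)
```

Calling `strip_from_file` on each file before re-uploading leaves every other global attribute untouched.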

kwilcox commented 7 years ago

I will remove the cdm_data_type global attribute when I re-upload all of the NetCDF files after filtering out "offending files" but the strife over this attribute will continue inside of my head!

kwilcox commented 7 years ago

OK, I'm now filtering out bad profiles via the compliance checker and all four deployments are now showing up. Thanks for the help, guys!
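A rough sketch of that kind of pre-upload filter is below. It rests on two assumptions that should be verified against the installed version: that the `compliance-checker` command-line tool is on the PATH with a `gliderdac` test available, and that it exits with status 0 only when the file passes. The function names and file names are hypothetical, and this is not the exact filter used for these deployments.

```python
import subprocess


def passes_gliderdac_check(path):
    """Run the compliance-checker CLI on one file.

    Assumption: the tool exits with 0 only when the file passes the
    gliderdac test.
    """
    result = subprocess.run(
        ["compliance-checker", "--test=gliderdac", path],
        capture_output=True,
    )
    return result.returncode == 0


def partition_files(paths, is_compliant=passes_gliderdac_check):
    """Split paths into (upload, rejected) lists using the given predicate."""
    upload, rejected = [], []
    for path in paths:
        (upload if is_compliant(path) else rejected).append(path)
    return upload, rejected
```

Passing the predicate as a parameter makes it possible to swap in a stricter or looser check, for example a score threshold, without changing the upload loop.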

lukecampbell commented 7 years ago

So it was cdm_data_type. That's unfortunate. I'm glad we found the issue, but it's unfortunate that we didn't realize this could cause issues until now. We'll be sure to add a blurb in the docs about it.

Other global attributes should work fine. We try to take the approach, "give us all you got, and we'll publish what we're capable of publishing".