ioos / ioosngdac

IOOS National Glider Data Assembly Center (V2)
https://ioos.github.io/ioosngdac/

Bad `valid_min` and `valid_max` on longitude #115

Closed kwilcox closed 2 years ago

kwilcox commented 7 years ago

I've come across a few datasets with bad valid_min and valid_max attributes on the longitude variable. One example is UCSC260-20150520T0000. Under CF, anything outside of valid_min and valid_max should be masked, which causes chaos on this dataset.

Probably worth mentioning that the precise_lon variable is correct, with valid_min: -180 and valid_max: 180.

lukecampbell commented 7 years ago

I don't believe that CF says anything about masking the data using valid_min or valid_max, but it does say that values outside that range should be treated as missing values (CF §2.5.1).

This is wrong in all cases, and we'll bring it up at our weekly discussion this week. Thanks!
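To make the practical effect concrete, here is a minimal sketch of what a CF-aware reader effectively does with the valid range. The function name `apply_valid_range` and the sample longitudes are hypothetical and are not taken from any of the datasets mentioned in this issue.

```python
from typing import List, Optional


def apply_valid_range(
    values: List[float], valid_min: float, valid_max: float
) -> List[Optional[float]]:
    """Treat values outside [valid_min, valid_max] as missing (None)."""
    return [v if valid_min <= v <= valid_max else None for v in values]


# Hypothetical longitudes; two of them lie west of -90 degrees.
lons = [-122.3, -121.9, -89.5, 45.0]

# With latitude-style bounds on longitude, the western points are lost.
with_bad_bounds = apply_valid_range(lons, -90.0, 90.0)

# With the expected longitude bounds, every point survives.
with_good_bounds = apply_valid_range(lons, -180.0, 180.0)
```

Whether a client calls this "masking" or "treating as missing", any longitude beyond ±90 disappears when the attributes are set to the latitude range.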

kwilcox commented 6 years ago

Any progress on this? There are multiple offending datasets and my list of workarounds is ever-growing. UW130-20160523T1828 is incorrect as well.

benjwadams commented 6 years ago

@kwilcox, the dataset folders in question have valid_min set to -90.0 and valid_max set to 90.0.

In the folders for UCSC260-20150520T0000 and UW130-20160523T1828, this is the case for all of the NetCDF files submitted. The output below is for UW130-20160523T1828, but the same holds for UCSC260-20150520T0000:

(gliderdac)➜  UW130-20160523T1828  sort -u <(for f in *.nc; do
    ncdump -h "$f" | grep -E '\<profile_lon:valid_(min|max)'
done)
        profile_lon:valid_max = 90. ;
        profile_lon:valid_min = -90. ;

Note here that profile_lon and profile_lat get renamed to longitude and latitude respectively.
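The same check could be automated over `ncdump -h` output. The following is only a sketch: the helper names `valid_range` and `looks_like_latitude_bounds` are hypothetical, the sample header reproduces just the two attribute lines shown above, and the regular expression handles plain decimal values only (no exponents).

```python
import re

# Matches lines such as "profile_lon:valid_max = 90. ;" in ncdump -h output.
ATTR_RE = re.compile(
    r"^\s*(\w+):valid_(min|max)\s*=\s*(-?\d+(?:\.\d*)?)", re.MULTILINE
)

SAMPLE_HEADER = """
        profile_lon:valid_max = 90. ;
        profile_lon:valid_min = -90. ;
"""


def valid_range(header: str, var: str):
    """Return (valid_min, valid_max) for var, or None where an attribute is absent."""
    bounds = {}
    for name, which, value in ATTR_RE.findall(header):
        if name == var:
            bounds[which] = float(value)
    return bounds.get("min"), bounds.get("max")


def looks_like_latitude_bounds(header: str, var: str = "profile_lon") -> bool:
    """Flag a longitude variable whose valid range is the latitude range."""
    return valid_range(header, var) == (-90.0, 90.0)


flagged = looks_like_latitude_bounds(SAMPLE_HEADER)
```

In practice the header text would come from running `ncdump -h` on each submitted file, as in the shell loop above.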

benjwadams commented 6 years ago

We could hardcode the longitude valid_min/valid_max to [-180, 180] in the generated ERDDAP datasets.xml. This may help, but I don't know if some providers are intentionally using different bounds. There are tradeoffs to any method. @kerfoot, any input here?
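As a rough illustration of that option, the sketch below builds a `dataVariable` fragment whose `addAttributes` block overrides the two attributes. This is an assumption about how the override might be expressed, not the DAC's actual generator: the function name `longitude_override` is hypothetical, and the source and destination names are inferred from the renaming described earlier in this thread.

```python
import xml.etree.ElementTree as ET


def longitude_override() -> ET.Element:
    """Build a dataVariable element that pins longitude bounds to [-180, 180]."""
    var = ET.Element("dataVariable")
    # Assumed names: profile_lon is renamed to longitude, per the note above.
    ET.SubElement(var, "sourceName").text = "profile_lon"
    ET.SubElement(var, "destinationName").text = "longitude"
    attrs = ET.SubElement(var, "addAttributes")
    for name, value in (("valid_min", "-180.0"), ("valid_max", "180.0")):
        att = ET.SubElement(attrs, "att", {"name": name, "type": "double"})
        att.text = value
    return var


xml_text = ET.tostring(longitude_override(), encoding="unicode")
```

An override like this would apply to every dataset that uses it, so it would also overwrite bounds that a provider set deliberately, which is the tradeoff raised above.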

benjwadams commented 6 years ago

@kwilcox, feel free to weigh in here.

kwilcox commented 6 years ago

I'm positive that nobody intended to have valid_min set to -90.0 and valid_max set to 90.0 for longitude. It is an error in the original data submitted. I'd also like the DAC to expose the raw submitted files in the future, so fixing the issue there would be preferable.

BeckyBaltes commented 6 years ago

@kwilcox, at this point, we are not enforcing the standards we have set up. We are providing them as an opportunity for standardization. However, to encourage participation in the DAC and enable providers, we do not require compliance for data to be accepted. We provide some QC testing if it is not done by the provider, and we offer the compliance checker to providers to identify issues. However, the issues you identified here and in issue #117 do not prohibit the provider from submitting. Eventually, when enough users are interested in the data and we have enough providers to enforce compliance, we will, but we are not there yet. The data is available in its aggregated format, and you are welcome to flag and not use data with errors, or to speak directly to providers if you would like to correct their data for your own use.

kerfoot commented 2 years ago

Offending datasets appear to be fixed. If not, please notify us and we will take a look at this again. Closing.