Closed kwilcox closed 2 years ago
I don't believe that CF says anything about masking the data using valid_min or valid_max, but it does say that values outside that range should be treated like missing values (CF §2.5.1).
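To make the effect concrete, here is a minimal sketch of what a reader that honors valid_min/valid_max effectively does. The function name apply_valid_range and the sample longitudes are hypothetical and only for illustration; real client libraries implement this in their own way.

```python
from typing import List, Optional, Sequence


def apply_valid_range(
    values: Sequence[float], valid_min: float, valid_max: float
) -> List[Optional[float]]:
    """Replace values outside [valid_min, valid_max] with None (treated as missing)."""
    return [v if valid_min <= v <= valid_max else None for v in values]


# Hypothetical glider longitudes in degrees east (not taken from the datasets).
lons = [-122.5, -123.1, -89.9]

# With bounds of [-90, 90], every longitude west of 90W is treated as missing.
masked_bad = apply_valid_range(lons, -90.0, 90.0)

# With the full longitude range, nothing is dropped.
masked_ok = apply_valid_range(lons, -180.0, 180.0)
```

With latitude-style bounds on a longitude variable, a compliant reader would discard most of a West Coast deployment.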
This is wrong in all cases, and we'll bring it up at our weekly discussion this week. Thanks!
Any progress on this? There are multiple offending datasets and my list of workarounds is ever-growing. UW130-20160523T1828 is incorrect as well.
@kwilcox, the dataset folders in question have valid_min set to -90.0 and valid_max set to 90.0.
In the folders for UCSC260-20150520T0000 and UW130-20160523T1828, this is the case for all of the NetCDF files submitted. The results below are for UW130-20160523T1828, but the same is true for UCSC260-20150520T0000:
(gliderdac)➜ UW130-20160523T1828 sort -u <(for f in *.nc; do
ncdump -h "$f" | grep -E '\<profile_lon:valid_(min|max)'
done)
profile_lon:valid_max = 90. ;
profile_lon:valid_min = -90. ;
Note here that profile_lon and profile_lat get renamed to longitude and latitude respectively.
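The same check could be scripted rather than eyeballed. The following is a rough sketch that parses ncdump -h style header text and flags a longitude variable whose bounds match the latitude range. The helper names parse_valid_range and looks_like_latitude_bounds are hypothetical, and the sample header simply repeats the two lines shown above.

```python
import re
from typing import Dict

# Matches header lines such as "profile_lon:valid_max = 90. ;"
ATTR_RE = re.compile(r"^\s*(\w+):valid_(min|max)\s*=\s*([-+0-9.eE]+)")


def parse_valid_range(header: str, variable: str) -> Dict[str, float]:
    """Collect valid_min/valid_max for one variable from `ncdump -h` text."""
    found: Dict[str, float] = {}
    for line in header.splitlines():
        m = ATTR_RE.match(line)
        if m and m.group(1) == variable:
            found[m.group(2)] = float(m.group(3))
    return found


def looks_like_latitude_bounds(bounds: Dict[str, float]) -> bool:
    """True when a variable carries exactly the [-90, 90] latitude range."""
    return bounds.get("min") == -90.0 and bounds.get("max") == 90.0


# The two attribute lines reported for UW130-20160523T1828.
header = """
    profile_lon:valid_max = 90. ;
    profile_lon:valid_min = -90. ;
"""

bounds = parse_valid_range(header, "profile_lon")
suspicious = looks_like_latitude_bounds(bounds)
```

This only flags the exact [-90, 90] case seen here; other bad bounds would need their own rule.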
We could hardcode the longitude valid_min/valid_max to [-180, 180] in the generated ERDDAP datasets.xml. This may help, but I don't know if some providers are intentionally using different bounds. There are tradeoffs to any method. @kerfoot, any input here?
@kwilcox, feel free to weigh in here.
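One possible shape for the hardcoding option, as a sketch only: build an addAttributes override for the longitude variable, and apply it only when the submitted bounds are exactly the latitude range, so that providers with other intentional bounds are left alone. The function names are hypothetical, and the addAttributes/att layout is my assumption about how datasets.xml expresses attribute overrides; it should be checked against the ERDDAP documentation and the DAC's actual generator.

```python
import xml.etree.ElementTree as ET


def needs_override(src_min: float, src_max: float) -> bool:
    """Only override when the submitted longitude bounds are the latitude range."""
    return (src_min, src_max) == (-90.0, 90.0)


def longitude_override(valid_min: float = -180.0, valid_max: float = 180.0) -> ET.Element:
    """Build an <addAttributes> element that replaces the longitude valid range.

    The element and attribute names are assumed, not taken from this thread.
    """
    add_attrs = ET.Element("addAttributes")
    for name, value in (("valid_min", valid_min), ("valid_max", valid_max)):
        att = ET.SubElement(add_attrs, "att", {"name": name, "type": "double"})
        att.text = str(value)
    return add_attrs


# Bounds as submitted for profile_lon in the affected deployments.
if needs_override(-90.0, 90.0):
    snippet = ET.tostring(longitude_override(), encoding="unicode")
```

Restricting the override to the known-bad case is one way to handle the tradeoff mentioned above, at the cost of missing other kinds of incorrect bounds.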
I'm positive that nobody intended to have a valid_min set to -90.0 and valid_max set to 90.0 for longitude. It is an error in the original data submitted. I'd also like the DAC to expose the raw submitted files in the future, so fixing the issue there would be most preferable.
@kwilcox, at this point we are not enforcing the standards we have set up. We are providing them as an opportunity for standardization. However, to encourage participation in the DAC and enable providers, we do not require compliance for data to be accepted. We provide some QC testing if it is not done by the provider, and we offer the compliance checker to providers to identify issues. However, the issues you identified here and in issue #117 do not prohibit the provider from submitting. Eventually, when enough users are interested in the data and we have enough providers to enforce compliance, we will, but we are not there yet. The data is available in its aggregated format, and you are welcome to flag and not use data with errors, or to speak directly to providers if you would like to correct their data for your own use.
The offending datasets appear to be fixed. If not, please notify us and we will take another look. Closing.
I've come across a few datasets with bad valid_min and valid_max attributes on the longitude variable. One such example is UCSC260-20150520T0000. If following CF, anything outside of the valid_min and valid_max should be masked... causing chaos on this dataset, which has valid_min: -90 and valid_max: 90.
Probably worth mentioning that the precise_lon variable is correct, with valid_min: -180 and valid_max: 180.