Closed mortenwh closed 8 years ago
The metadata key name was changed from 'start_date' to 'start_time' in mapper_asar.py in 2c7097a. 'start_date' and 'stop_date' are used in sadcat, so if we change to 'start_time' and 'stop_time', we also need to change the sadcat files. Is this change necessary?
I think we agreed on 'time' instead of 'date'. It shouldn't be a problem to change sadcat, but Anton can probably answer that.
We should probably use existing conventions instead. E.g., UNIDATA suggests (http://www.unidata.ucar.edu/software/thredds/current/netcdf-java/metadata/DataDiscoveryAttConvention.html) using time_coverage_start, time_coverage_end, time_coverage_duration, and time_coverage_resolution.
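A minimal sketch of those four attributes as a plain dict (the values below are illustrative, not taken from a real dataset):

```python
# Sketch of the four UNIDATA Data Discovery time attributes.
# Values are illustrative only.
metadata = {
    'time_coverage_start': '2010-08-01T00:00:00Z',
    'time_coverage_end': '2010-08-31T23:59:59Z',
    'time_coverage_duration': 'P31D',    # ISO 8601 duration: 31 days
    'time_coverage_resolution': 'P1D',   # one field per day
}
```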
The time_coverage_start attribute in many netCDF files produced for NORMAP is on the form "YYYY-MM-DDZ". The "Z" should not really be there, since the timezone is irrelevant when we only have a date and not a time. And dateutil.parser.parse (in mapper_generic) fails because this is not an allowed date string.
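As a hedged illustration (this is not the actual mapper_generic code), a defensive parser could strip the stray "Z" before parsing the date-only value:

```python
from datetime import datetime

def parse_coverage_date(value):
    # Date-only values like "2010-08-01Z" carry a meaningless trailing
    # "Z"; strip it so parsing a plain date succeeds.
    return datetime.strptime(value.rstrip('Zz'), '%Y-%m-%d')
```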
I don't think creation of the Nansat object should fail because of this. This particular example file contains a time dimension; if setting the time from the metadata fails, one could read it from the time dimension instead.
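One hypothetical way to express that fallback (read_time_dimension stands in for whatever function reads the file's time variable; none of these names are from the nansat code):

```python
from datetime import datetime

def get_start_time(metadata, read_time_dimension):
    # Prefer the time_coverage_start metadata attribute; if it is
    # missing or malformed, fall back to the caller-supplied function
    # that reads the file's time dimension instead of failing outright.
    try:
        return datetime.strptime(metadata['time_coverage_start'], '%Y-%m-%d')
    except (KeyError, ValueError):
        return read_time_dimension()
```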
Also, the VRT._set_time method sets the 'time' variable in the metadata for the bands, but it does not say anywhere what this time is supposed to be: start time? end time? average time? all times? From the code I can see it fetches it from start-time-ish variables. But it is, as mentioned in #143, "a bit awkward", and the whole time-metadata handling should be reconsidered imo.
n = Nansat("/WebData/normap.nersc.no/arctic12km_seaice/arctic12km_seaice_20100801_20100831.nc")
=>Arctic Sea Ice Concentration<=
Traceback (most recent call last):
File "
We need to take a second look at what metadata is required. We should define which standards to follow and implement them similarly to the wkv.xml file. Perhaps the well-known variables and additional standards could be stored in a thesaurus module, or should we use XML files?
I agree. We definitely need standardization. @mortenwh, what do you mean by thesaurus module?
Basically a dictionary that defines the standards used, but let's discuss when you're back :)
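A minimal sketch of what such a thesaurus module could look like (the keys and validation rules here are hypothetical, not the actual nersc-metadata implementation):

```python
import re

# Hypothetical thesaurus: each standard metadata key maps to a
# validator callable that returns True for acceptable values.
THESAURUS = {
    'time_coverage_start':
        lambda v: re.match(r'\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}Z$', v) is not None,
    'platform':
        lambda v: isinstance(v, str) and len(v) > 0,
}

def validate(name, value):
    # Unknown keys pass through; known keys must satisfy their rule.
    if name in THESAURUS and not THESAURUS[name](value):
        raise ValueError('Invalid value for %s: %r' % (name, value))
```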
We could validate the metadata before it is added to the VRT dataset by subclassing the GDAL Dataset (self.dataset) and overriding the SetMetadataItem method to also validate against the thesaurus.
Something like this:
class NansatDataset(gdal.Dataset):

    def __init__(self):
        super(NansatDataset, self).__init__()

    def SetMetadataItem(self, name, value, domain=''):
        # Validate against the thesaurus before setting the item
        nansenmetadata.thesaurus.validate(name, value)
        return super(NansatDataset, self).SetMetadataItem(name, value, domain)
Most of the mappers set time_coverage_start, time_coverage_end, platform and instrument according to the UNIDATA and GCMD standards. The end2endtests check that these attributes are present in the metadata and correspond to the nersc-metadata controlled vocabulary. I consider this ticket closed in 48348d3. For more specific issues a new ticket should be created.
Nansat should have some standard methods that return the required metadata. Following that, generic tests should be made to ensure this metadata is actually added by the mappers.
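A hedged sketch of such a generic check (REQUIRED_KEYS is an assumption based on the attributes mentioned above, not a list defined in nansat):

```python
# Assumed required keys, based on what the mappers already set.
REQUIRED_KEYS = ('time_coverage_start', 'time_coverage_end',
                 'platform', 'instrument')

def missing_required_metadata(metadata):
    # Return the required keys absent from a mapper's metadata dict;
    # a generic test would assert this list is empty for every mapper.
    return [key for key in REQUIRED_KEYS if key not in metadata]
```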