Closed lsetiawan closed 7 years ago
(Updated 8/29/2017 for clarity and compactness)
I am currently using the mapping in https://github.com/ODM2/WOFpy/issues/152#issuecomment-313770769.
Here's a waterml:odm2 CV mapping dictionary, with items sorted by keys in the order used in the previous comment (above). Another resource I found is here.
wmlodm2_cvmap = {
    'censorcode': 'censorcode',
    'datatype': 'aggregationstatistic',
    'samplemedium': 'medium',
    'unitstype': 'unitstype',
    'generalcategory': 'variabletype',
    'valuetype': 'actiontype'
}
Inventory of ODM2 CV terms that don't currently match WaterML 1.1 CV terms, listed by ODM2 CV. The set of CV's listed here is the same as in the preceding comment. The numbers following the ODM2 CV name are as follows: (Number of terms:Number of terms not matching with WaterML 1.1)
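As a rough illustration of how such an inventory could be computed, here is a minimal sketch that counts the ODM2 terms with no exact (case-sensitive) match in the corresponding WaterML 1.1 CV. The function name `inventory` and the short term lists are assumptions for illustration only; they are not the full vocabularies and not WOFpy code.

```python
def inventory(odm2_terms, waterml_terms):
    """Return (total, unmatched_count, unmatched_terms) for one CV.

    Matching is exact and case-sensitive, so 'Constant over interval'
    does not match 'Constant Over Interval'.
    """
    odm2 = set(odm2_terms)
    waterml = set(waterml_terms)
    unmatched = sorted(term for term in odm2 if term not in waterml)
    return len(odm2), len(unmatched), unmatched


# Illustrative sample only, NOT the complete vocabularies.
odm2_aggregationstatistic = ["Average", "Standard deviation", "Constant over interval"]
waterml_datatype = ["Average", "StandardDeviation", "Constant Over Interval"]

total, n_unmatched, unmatched = inventory(odm2_aggregationstatistic, waterml_datatype)
# Same "(Number of terms:Number of terms not matching)" format as above.
print("aggregationstatistic ({}:{})".format(total, n_unmatched))
```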
@lsetiawan, thanks for the additional work/research, and for starting a new, narrower issue (and for being diligent and deleting the last comment you posted in #152, to put it here instead). This looks like good progress towards mapping non-matching ODM 2 terms.
I'll be seeing Anthony and Jeff tomorrow, so we can discuss these CV 1.1 vs 2 issues and mappings in person!
@lsetiawan can you remind me where we are with this CV mapping work? I see that we merged PR #163, which lists method, source and qualitycontrollevel in its title. I also see PR #159 (merged), which was broader; and for my own reference, I'll paste here the comments I made there:
These are great steps in the right direction. I really like the caching of the latest, relevant ODM/WaterML 1.1 at the time the wofpy server is started.
What your PR doesn't address yet is the need to "curate" a mapping between ODM 2 and ODM 1.1 vocabulary terms in a way that's not just 1:1. I'm not sure how we reconcile your dynamic, automatic approach (ie, just pull in the latest vocabularies) with the need to develop mappings that are manually maintained.
Also, personally I think I would put all this vocabulary related code in a more focused module (eg, "vocabularies.py") rather than the new generic "util.py" you've created.
And of course, this discussion started at issue #152
We need to work together to fill out the dropdowns on https://github.com/ODM2/WOFpy/issues/160#issue-245239137 above to match ODM2 CV to WATERML CV.
Some comments:
censorcode and datatype in the first comment on this issue are clear examples of the former. When the differences for a CV are drastic (the second case), it may be best to just let the ODM2 terms "pass through" w/o evaluation ...
(... vocabularies.py) It's not a generic mapping that would apply across all DAO's in WOFpy. So, it should probably be implemented at the DAO level.
Yea this makes sense. Though your comment:
ODM2 terms "pass through" w/o evaluation ...
This is a little difficult at the DAO level. In core_1_1.py, checks are performed like L641. My thought is to either add another variable to the check functions that specifies the data model somehow, or just let any CV term pass through by default if it is not matched in the DAO.
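The second option (let unmatched terms pass through by default) could look roughly like the sketch below. This is only an illustration of the idea, not the check code in core_1_1.py; the function `match_term`, its parameters, and the `"Unknown"` fallback are hypothetical names chosen for the example.

```python
def match_term(odm2_term, curated_map, passthrough=True, fallback="Unknown"):
    """Translate an ODM2 CV term to a WaterML 1.1 term (hypothetical helper).

    curated_map: dict of ODM2 term -> WaterML 1.1 term, maintained by hand.
    passthrough: if True, an unmapped term is returned unchanged;
                 if False, the fallback value is returned instead.
    """
    if odm2_term in curated_map:
        return curated_map[odm2_term]
    return odm2_term if passthrough else fallback


# Small curated sample taken from the datatype mapping in this issue.
datatype_map = {"Standard deviation": "StandardDeviation"}

print(match_term("Standard deviation", datatype_map))        # curated match
print(match_term("Average", datatype_map))                   # passes through
print(match_term("Average", datatype_map, passthrough=False))  # fallback
```

Keeping the pass-through switch as a per-call argument would let each DAO decide, CV by CV, whether to curate or pass through.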
Let's discuss this in an hour or so, when I'm in. FYI I've done some refactoring of vocabularies.py
@lsetiawan, will you be able to submit a PR with the remaining changes for CV handling, before you leave? We're so close!
@emiliom The latest changes are now up on WOFpy dev server. Thanks.
Whoa, that was very fast! I've started looking at it. FYI, many of the valueType values look odd (in postgresql/EnviroDIY test DB), but don't worry about it. I'll look into it.
I'll be up for a few more hours. If you issue the release I can prepare the packages today.
Thanks, @ocefpaf. But I have some questions for @lsetiawan first, and it looks like I may need one more PR.
@lsetiawan (after testing REST 1.1 dev endpoints):
[...] GetVariable* and GetValues* requests, but GetSite* responses have glitches:
- GetSites has an empty seriesCatalog element. I don't remember if it's supposed to be there at all, or if it should be populated.
- GetSiteInfo responses have "Unknown" values for valueType, dataType and sampleMedium, in both databases/endpoints.
[...] state. Did you? I'm not seeing any responses that include the state. Not a problem either way, but I need to know so that the release notes are accurate.
GetSiteInfo responses have "Unknown" values for valueType, dataType and sampleMedium, in both databases/endpoints.
Looks to me like the Variables instantiation at odm2/timeseries/sqlalch_odm2_models.py#L111 needs to look more like the one at odm2/timeseries/odm2_timeseries_dao.py#L205 -- see the extra arguments that are passed.
If you can confirm that that looks right, I'll tackle it.
GetSites has an empty seriesCatalog element. I don't remember if it's supposed to be there at all, or if it should be populated.
I don't know about this one yet. Any thoughts?
It also looks like a datatype match (self.get_match('datatype' ...) needs to be added in the get_series_by_sitecode* functions in odm2_timeseries_dao.py, just as it is in get_variables_from_results.
GetSites has an empty seriesCatalog element. I don't remember if it's supposed to be there at all, or if it should be populated.
This empty seriesCatalog element was already present before we made all the recent changes. It's present in the "stable" (non dev) AWS endpoints.
Final DAO configuration for the upcoming release:
- censorcode, datatype, samplemedium
- unitstype, generalcategory, valuetype
New release is out. Closing.
#152 is getting long. Opening this issue for the mapping of CVs. Following up from https://github.com/ODM2/WOFpy/issues/160#issuecomment-317583449. Please edit as you see fit.
The format is: WaterML CV term as key, then the list of matching ODM2 CV terms as values. I am only mapping the ODM2 CV terms that don't match WaterML 1.1 terms.
## Censorcode [(WaterML 1.1 CV)](http://his.cuahsi.org/mastercvreg/edit_cv11.aspx?tbl=CensorCodeCV&id=773577794)

```yaml
censorcode:
  gt:
    - Greater than
  lt:
    - Less than
  nc:
    - Not censored
  nd:
    - Non-detect
  pnq:
    - Present but not quantified
```

## Datatype [(WaterML 1.1 CV)](http://his.cuahsi.org/mastercvreg/edit_cv11.aspx?tbl=DataTypeCV&id=789577851)

```yaml
datatype:
  Best Easy Systematic Estimator:
    - Best easy systematic estimator
  Constant Over Interval:
    - Constant over interval
  StandardDeviation:
    - Standard deviation
```

## Samplemedium [(WaterML 1.1 CV)](http://his.cuahsi.org/mastercvreg/edit_cv11.aspx?tbl=SampleMediumCV&id=821577965)

```yaml
samplemedium:
  Not Relevant:
    - Not applicable
  Other:
    - Rock
    - Regolith
    - Mineral
    - Ice
    - Habitat
  Surface water:
    - Liquid aqueous
    - Liquid organic
  Suspended particulate matter:
    - Particulate
  Tissue:
    - Organism
  Tree:
    - Vegetation
  Wellhead Gas:
    - Gas
```

## Unitstype [(WaterML 1.1 CV)](http://his.cuahsi.org/mastercvreg/edit_cv11.aspx?tbl=Units&id=1125579048)

## Generalcategory [(WaterML 1.1 CV)](http://his.cuahsi.org/mastercvreg/edit_cv11.aspx?tbl=GeneralCategoryCV&id=805577908)

## Valuetype [(WaterML 1.1 CV)](http://his.cuahsi.org/mastercvreg/edit_cv11.aspx?tbl=ValueTypeCV&id=1141579105)
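
Since the mappings are keyed by WaterML term with a list of ODM2 terms as values, a lookup going the other way (ODM2 term to WaterML term) may be handy when serving responses. Below is a minimal sketch of that inversion; `invert_cv_mapping` is a hypothetical helper, and the dictionary is a hand-copied subset of the samplemedium mapping (the shape a YAML parser would be expected to produce), not the complete mapping.

```python
def invert_cv_mapping(mapping):
    """Turn {waterml_term: [odm2_term, ...]} into {odm2_term: waterml_term}."""
    lookup = {}
    for waterml_term, odm2_terms in mapping.items():
        for odm2_term in odm2_terms:
            lookup[odm2_term] = waterml_term
    return lookup


# Subset of the samplemedium mapping from this issue, for illustration.
samplemedium = {
    "Not Relevant": ["Not applicable"],
    "Other": ["Rock", "Regolith", "Mineral", "Ice", "Habitat"],
    "Tissue": ["Organism"],
}

odm2_to_waterml = invert_cv_mapping(samplemedium)
print(odm2_to_waterml["Ice"])       # Other
print(odm2_to_waterml["Organism"])  # Tissue
```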