VEuPathDB / EdaLoadingIssues


splitUnitFromValue annotation for variables #38

Closed asizemore closed 1 year ago

asizemore commented 2 years ago

In the Preterm Infant Resistome II (WGS) study, the "Age at discharge", "Maternal age at birth", and "Postmenstrual age at sample collection" should be numbers with days/months/years as the units. For example, see this Age variable.

wbazant commented 2 years ago

This is an issue for all terms where splitUnitFromValue annotation was applied: see the list in https://github.com/VEuPathDB/ApiCommonMetadataRepository/blob/master/ISA/config/ontologyMappingsMicrobiome.xml.

In overview: the sample detail file has values like "2 hours", and with a "splitUnitFromValue" annotation we remove the " hours" part and record it as a unit. Remarkably, we don't show that the variable is in hours on the live site.
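To make the behaviour above concrete, here is a minimal Python sketch of that kind of split. It is not the workflow's actual splitUnitFromValue code; the function name `split_unit_from_value`, the regular expression, and the choice to pass non-matching values through unchanged are all assumptions made for illustration.

```python
import re
from typing import Optional, Tuple

# Hypothetical pattern: a number, whitespace, then the unit text.
# The real loader may accept different value shapes.
_VALUE_WITH_UNIT = re.compile(r"^\s*([-+]?\d+(?:\.\d+)?)\s+(\S.*?)\s*$")


def split_unit_from_value(raw: str) -> Tuple[str, Optional[str]]:
    """Split a string like "2 hours" into ("2", "hours").

    Values that do not look like "<number> <unit>" are returned
    stripped, with None as the unit (an assumption of this sketch).
    """
    match = _VALUE_WITH_UNIT.match(raw)
    if match is None:
        return raw.strip(), None
    return match.group(1), match.group(2)


if __name__ == "__main__":
    for raw in ["2 hours", "123 days", "unknown"]:
        print(repr(raw), "->", split_unit_from_value(raw))
```

The point of returning the unit separately is that it can then be recorded once per variable rather than repeated in every value.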

Even more remarkably, it also doesn't seem to show up in ClinEpi. For example, in this screenshot (from 2022-05-09) a "Study timepoint" variable is in months, but I can only guess or infer that from the variable description.

So I think the following plan is needed for this one:

Showing up correctly would, in my opinion, mean that we're going from a variable name like "Delivery duration", and values like "2 hours" in a multipick, to "Delivery duration (hours)" showing nicely, and a numeric variable.
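As a rough illustration of that end state, the sketch below turns a variable name plus unit-bearing string values into a unit-labelled name and a list of numbers. The helper `to_numeric_variable` is hypothetical and not part of any VEuPathDB code. Leaving the variable untouched when a value fails to parse, or when units are mixed, is an assumption of this sketch rather than agreed behaviour.

```python
import re
from typing import List, Tuple, Union

# Hypothetical pattern: "<number> <single-word unit>".
_VALUE_WITH_UNIT = re.compile(r"^\s*([-+]?\d+(?:\.\d+)?)\s+(\S+)\s*$")


def to_numeric_variable(
    name: str, raw_values: List[str]
) -> Tuple[str, Union[List[float], List[str]]]:
    """Return ("<name> (<unit>)", numbers) when every value shares one unit.

    Otherwise return the name and values unchanged, so the variable
    stays categorical instead of being silently mis-converted.
    """
    numbers: List[float] = []
    units = set()
    for raw in raw_values:
        match = _VALUE_WITH_UNIT.match(raw)
        if match is None:
            return name, list(raw_values)
        numbers.append(float(match.group(1)))
        units.add(match.group(2))
    if len(units) != 1:
        # No values at all, or mixed units such as hours and days.
        return name, list(raw_values)
    return f"{name} ({units.pop()})", numbers


if __name__ == "__main__":
    print(to_numeric_variable("Delivery duration", ["2 hours", "3.5 hours"]))
```

Refusing to convert mixed units avoids presenting, say, hours and days on the same numeric axis.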

wbazant commented 2 years ago

Bonus: do something appropriate with the other functions, like enforceYesNoForBoolean. Copy the ClinEpi solution.

wbazant commented 2 years ago

Next todo on this is probably: find out what ClinEpi is doing for units, and what plans there are to support them. I think they were a "phase 2" kind of thing.

asizemore commented 2 years ago

@jaycolin does clinepi have variables like this that show up as "123 days" instead of the number 123? If yes, does the workflow have a way to split the number from the units?

jaycolin commented 2 years ago

No, we don't load values mixed with units (or if we do, they are handled as strings/categorical variables and have no stats).

asizemore commented 2 years ago

ah okay cool. So to make sure i understand - if we wanted these variables to show up as numbers, we'd need to go clean the data (so remove all the "days") before sending it over to you for loading?

jaycolin commented 2 years ago

I see https://github.com/VEuPathDB/ApiCommonMetadataRepository/blob/master/ISA/config/ontologyMappingsMicrobiome.xml has the function to strip units; so that works for the value and you don't have to clean it. But I think the units should be defined in the ontology term (microbiome_human_only.owl) unitsLabel, unitsIRI. Though I don't think it is strictly required.

danicahelb commented 1 year ago

Annotation properties unitLabel & unitIRI need to be filled out for all terms requiring units, on a per-study basis

danicahelb commented 1 year ago

This will be fixed by https://github.com/VEuPathDB/EdaNewIssues/issues/525, so closing this ticket.