information-artifact-ontology / IAO

information artifact ontology
Creative Commons Attribution 4.0 International

Would the IAO be the appropriate place to insert a "datum status" concept? #192

Open Public-Health-Bioinformatics opened 7 years ago

Public-Health-Bioinformatics commented 7 years ago

This is from the European Nucleotide Archive website: "The International Nucleotide Database Collaboration (INSDC) have developed a standardised missing/null value reporting language to be used where a value of an expected format for sample metadata reporting can not be provided."

datum_status

Would IAO be a good home for these terms? I would actually propose a few changes: add one more category, 'in process', for cases where there is some commitment to deliver the information; add a "recorded" status, which would in theory complete these rough metadata states; and add a "datum status" concept to position them all under. Right now the GenEpiO ontology has these terms, but they would be better positioned in a more general ontology.

alanruttenberg commented 7 years ago

Yes. See the term status instances already in IAO (ontology metadata). We might want to talk about the label for the class - datum status is rather broad. From an IAO point of view, status labels are information about what they are the status of.

Public-Health-Bioinformatics commented 7 years ago

Perhaps "datum retrieval state" then? I realize now that reporting that a datum instance is in some state like this would depend on a storage or intermediary system attempting to fetch a datum instance and finding the instance is in one of these metadata states instead.

Public-Health-Bioinformatics commented 7 years ago

I've put up a pull request for your consideration

Public-Health-Bioinformatics commented 7 years ago

About term status as instances: I get how one connects a class via "has curation status" to an instance like "requires discussion". But I like the idea of keeping data request state a class, since it is likely to be further subclassed in the future, e.g. "data obfuscated" could have subclasses "data obfuscated by randomization" and "data obfuscated by scaling". But we could pun these to have owl:NamedIndividual IDs for those who want them.

Either way, I think we can use the "has quality" relation between "data item" instances and a "data request state"? Or must we introduce a "has data request state" property (domain: data item, range: data request state)?
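To make the second option concrete, here is a minimal sketch in plain Python (no OWL library) of the class hierarchy and the proposed "has data request state" property with its domain and range. The labels are taken from the comment above; the dictionaries and the `is_subclass_of` and `conforms` helpers are hypothetical, and none of this reflects actual IAO term IDs. Note that an OWL reasoner would infer types from domain and range rather than reject an assertion, so the check is only a lint-style illustration.

```python
# Hypothetical labels for illustration; these are not real IAO term IDs.
SUBCLASS_OF = {
    "data obfuscated": "data request state",
    "data obfuscated by randomization": "data obfuscated",
    "data obfuscated by scaling": "data obfuscated",
}

# Proposed property with the domain and range stated in the comment above.
PROPERTIES = {
    "has data request state": {
        "domain": "data item",
        "range": "data request state",
    },
}


def is_subclass_of(cls, ancestor):
    """Return True if cls equals ancestor or reaches it via SUBCLASS_OF."""
    while cls is not None:
        if cls == ancestor:
            return True
        cls = SUBCLASS_OF.get(cls)
    return False


def conforms(subject_type, prop, object_type):
    """Check an assertion's subject and object types against domain and range.

    An OWL reasoner would infer types instead of rejecting the assertion;
    this function only flags assertions that look mismatched.
    """
    spec = PROPERTIES[prop]
    return is_subclass_of(subject_type, spec["domain"]) and is_subclass_of(
        object_type, spec["range"]
    )
```

For example, `conforms("data item", "has data request state", "data obfuscated by scaling")` passes because the object type reaches "data request state" through two subclass links, which is the benefit of keeping the states as classes rather than instances.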

alanruttenberg commented 7 years ago

We can't use 'has quality' at the moment, since only independent continuants can have qualities, so a different property is needed. There is an ongoing discussion advocating that generically dependent continuants can have dependents themselves, but consensus has not been reached.

I see the desire to be able to subdivide later. I'm not sure whether I think it's a good idea, since it is an easy path to adding things here that might be more properly represented elsewhere. For example, a more transparent way to encode obfuscation would be to have a data transformation type and represent that the data is the output of such a process.
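The process-based alternative could be sketched roughly as follows, again in plain Python with triples as tuples. The instance names (`process_1`, `raw_age_values`, `obfuscated_age_values`), the "randomization data transformation" type, and the relation labels are illustrative assumptions, not terms confirmed to exist in IAO. The point is that the obfuscation method is recovered from the process that produced the data, not from a status subclass attached to the data item.

```python
# Triples as (subject, predicate, object); all names are illustrative labels.
TRIPLES = [
    ("process_1", "rdf:type", "randomization data transformation"),
    ("process_1", "has specified input", "raw_age_values"),
    ("process_1", "has specified output", "obfuscated_age_values"),
]


def producing_transformations(data_item, triples):
    """Return the sorted types of processes that have data_item as an output."""
    processes = {
        s for s, p, o in triples
        if p == "has specified output" and o == data_item
    }
    return sorted(
        o for s, p, o in triples
        if p == "rdf:type" and s in processes
    )
```

Calling `producing_transformations("obfuscated_age_values", TRIPLES)` yields the transformation type, while the raw input has no producing process recorded and yields an empty list.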