phenopackets / phenopacket-format

Rename 'type' #78

Open mcourtot opened 7 years ago

mcourtot commented 7 years ago

While I can see why 'type' was chosen as label based on owl:Type, when looking at phenopackets (for example here) it seems very generic. Would you consider renaming it? In GA4GH, we use OntologyTerm which I believe covers the same object.

balhoff commented 7 years ago

In addition to types there is negated_types (example here), where the subject would have the type not X for any X in negated_types. I think there is also the implication that the subject is in the intersection of all the listed types, and not just somehow associated with those terms. I do agree that something like OntologyTerm might be more meaningful for people who aren't really into OWL. Do you have anything comparable for negated types?
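A minimal sketch of one way to read these semantics: the subject must be in every listed type and in none of the negated ones. The function name `satisfies_profile` and the flat set of asserted term IDs are assumptions for illustration only. The check is closed-world and ignores ontology subsumption, which real OWL reasoning would not.

```python
def satisfies_profile(subject_terms, types, negated_types):
    """Return True if the subject is in every listed type and in no negated type.

    subject_terms is a hypothetical set of term IDs asserted for the subject.
    Subclass reasoning over the ontology is deliberately left out.
    """
    in_all_types = all(t in subject_terms for t in types)
    in_no_negated = not any(t in subject_terms for t in negated_types)
    return in_all_types and in_no_negated


types = ["HP:0001711", "HP:0001707"]
negated_types = ["HP:0001714"]

# Both ventricle abnormalities, no hypertrophy.
both_ventricles = satisfies_profile({"HP:0001711", "HP:0001707"}, types, negated_types)

# Same subject, but hypertrophy is also asserted.
with_hypertrophy = satisfies_profile(
    {"HP:0001711", "HP:0001707", "HP:0001714"}, types, negated_types
)
```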

mbaudis commented 7 years ago

@balhoff At first glance, negated may be better served as an attribute of the OntologyTerm object. Also, this is not the first request for it. It would also be in line with a concept where you can have quanti/qualifiers as part of the object describing the feature.

So IMO it is semantically worse to have

versus only a list of

cmungall commented 7 years ago

@mbaudis - sorry, I don't understand your notation

mbaudis commented 7 years ago

@cmungall Sorry, trying comments on the phone...

So, to go with the example linked by @balhoff : I'd prefer

      types:
        - id: HP:0001711
          label: Abnormality of the left ventricle
        - id: HP:0001707 
          label: Abnormality of the right ventricle
        - id: HP:0001714
          label: Ventricular hypertrophy
          quality: excluded

... to the separate negated_types: []. Expression could be different (FALSE | ∃ | ∄ | quantity==0); that is a matter of documentation, including the default treatment of negated objects.

But then: this does not address the original request by @mcourtot to relabel type(s) (which is +1 from my side).

cmungall commented 7 years ago

OK, I see now, thanks.

Making it a property of the term object doesn't work in the context of phenopackets. If we had an object

id: HP:0001714
label: Ventricular hypertrophy
quality: excluded

Then quality: excluded is a property of HP:0001714 in all contexts.

Another option would be to make negation a property of the association
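As a rough sketch of that option, assuming hypothetical class names `OntologyTerm` and `PhenotypeAssociation` and a hypothetical `negated` flag (none of these come from the phenopacket schema), the term object stays context-free and the negation sits on the association that uses it. The association for '#2' is invented purely to show the same term reused without negation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OntologyTerm:
    # Context-free term object: identity and label only.
    id: str
    label: str


@dataclass(frozen=True)
class PhenotypeAssociation:
    # Hypothetical association record; `negated` lives here, not on the term.
    entity: str
    term: OntologyTerm
    negated: bool = False


hypertrophy = OntologyTerm("HP:0001714", "Ventricular hypertrophy")

# The same term object is reused: excluded for '#1', asserted for '#2'.
associations = [
    PhenotypeAssociation("#1", hypertrophy, negated=True),
    PhenotypeAssociation("#2", hypertrophy),
]
```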

mbaudis commented 7 years ago

> Another option would be to make negation a property of the association

@cmungall Yes; I had this thought, too; seems natural. But I haven't really worked myself into thinking through this topic yet (evidence, association...).

mcourtot commented 7 years ago

Sorry I don't quite get this - in the example linked by @balhoff here, isn't the negation a property of the association phenotype_profile between entity 1 and its phenotype?

cmungall commented 7 years ago

My comment was on this example:

        - id: HP:0001714
          label: Ventricular hypertrophy
          quality: excluded

mcourtot commented 7 years ago

Yes, I think we are all talking about the same example - @mbaudis had omitted the beginning. The full example is:

persons:
  - id: '#1'
    date_of_birth: 1999-01-01
    sex: M
  - id: '#2'
    sex: M
  - id: '#3'
    sex: M
phenotype_profile:
  - entity: '#1'
    phenotype:
      description: Bilateral ventricle anomalies (but not hypertrophy)
      types:
        - id: HP:0001711
          label: Abnormality of the left ventricle
        - id: HP:0001707 
          label: Abnormality of the right ventricle
      negated_types:
        - id: HP:0001714
          label: Ventricular hypertrophy

and @mbaudis was suggesting to instead have

persons:
  - id: '#1'
    date_of_birth: 1999-01-01
    sex: M
  - id: '#2'
    sex: M
  - id: '#3'
    sex: M
phenotype_profile:
  - entity: '#1'
    phenotype:
      description: Bilateral ventricle anomalies (but not hypertrophy)
      types:
        - id: HP:0001711
          label: Abnormality of the left ventricle
        - id: HP:0001707 
          label: Abnormality of the right ventricle
        - id: HP:0001714
          label: Ventricular hypertrophy
          quality: excluded

It seems that both are pretty much equivalent, and the excluded would pertain to the association in this case. I thought the choice to keep negated type separated was to allow for faster search when looking for them, as you wouldn't need to browse the full ontology terms list, then look up their quality/qualifier. But if this is not an issue then maybe having the one list would make sense?

mbaudis commented 7 years ago

> It seems that both are pretty much equivalent, and the excluded would pertain to the association in this case. I thought the choice to keep negated type separated was to allow for faster search when looking for them, as you wouldn't need to browse the full ontology terms list, then look up their quality/qualifier. But if this is not an issue then maybe having the one list would make sense?

I don't know how the search of a separate attribute would speed things up (definitely not query wise). A qualifier would be more flexible (could be the same as for evidence or quantity).
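To make the comparison concrete, here is a small sketch of what looking up the excluded terms involves in each layout. The helper names `excluded_separate` and `excluded_qualified` are made up for this example, and the phenotype records are plain dicts mirroring the YAML in this thread.

```python
separate_layout = {
    "types": [
        {"id": "HP:0001711", "label": "Abnormality of the left ventricle"},
        {"id": "HP:0001707", "label": "Abnormality of the right ventricle"},
    ],
    "negated_types": [
        {"id": "HP:0001714", "label": "Ventricular hypertrophy"},
    ],
}

single_list_layout = {
    "types": [
        {"id": "HP:0001711", "label": "Abnormality of the left ventricle"},
        {"id": "HP:0001707", "label": "Abnormality of the right ventricle"},
        {"id": "HP:0001714", "label": "Ventricular hypertrophy", "quality": "excluded"},
    ],
}


def excluded_separate(phenotype):
    # Separate-attribute layout: read the negated_types list directly.
    return [term["id"] for term in phenotype.get("negated_types", [])]


def excluded_qualified(phenotype):
    # Single-list layout: filter the types list on the qualifier.
    return [
        term["id"]
        for term in phenotype.get("types", [])
        if term.get("quality") == "excluded"
    ]
```

Both helpers return the same IDs for the example record; the difference is only whether the filter is encoded in the attribute name or in a qualifier value.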

cmungall commented 7 years ago

> It seems that both are pretty much equivalent

Only with a very awkward mapping, as in the latter the quality is part of the term object.
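A sketch of what such a mapping could look like, assuming the single-list entries are plain dicts and using a hypothetical helper name `split_by_quality`: the qualifier has to be stripped out of each term object before the remainder is a plain term again.

```python
def split_by_quality(entries):
    """Map single-list entries onto the types / negated_types layout."""
    result = {"types": [], "negated_types": []}
    for entry in entries:
        # Pull the qualifier out of the term object; what is left is the term.
        term = {key: value for key, value in entry.items() if key != "quality"}
        target = "negated_types" if entry.get("quality") == "excluded" else "types"
        result[target].append(term)
    return result


single_list = [
    {"id": "HP:0001711", "label": "Abnormality of the left ventricle"},
    {"id": "HP:0001714", "label": "Ventricular hypertrophy", "quality": "excluded"},
]

split = split_by_quality(single_list)
```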

mcourtot commented 7 years ago

Ok this is helpful, as I don't see the 'awkward mapping' step. Do you mean that because you have an ontologyTerm/type object which doesn't include a qualifier attribute you would have to split the result out into

id: HP:0001714
label: Ventricular hypertrophy

(mapping directly to an ontologyTerm/type) and quality: excluded ?

Wouldn't it be possible to extend the ontologyTerm/type object to include the qualifier attribute? Not trying to split hairs, just trying to fully understand what the issues are, as those may also pertain to the GA4GH schema.

mbaudis commented 7 years ago

I think @cmungall means the modification of the object structure (which may introduce its own problems). There is in fact another option - you can have a wrapper for each term object:

  - entity: '#1'
    phenotype:
      description: Bilateral ventricle anomalies (but not hypertrophy)
      types:
        - evidence: observed
          term:
            id: HP:0001711
            label: Abnormality of the left ventricle
        - evidence: observed
          term:
            id: HP:0001707
            label: Abnormality of the right ventricle
        - evidence: excluded
          term:
            id: HP:0001714
            label: Ventricular hypertrophy

I use the attribute names here only as examples.
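For illustration, a minimal sketch of how wrapper entries could be mapped back onto the types / negated_types layout. It assumes the wrapper keys `evidence` and `term` from the example, plain dicts as records, and a made-up helper name `unwrap`. Because the term is nested, it is passed through untouched.

```python
def unwrap(annotations):
    """Map wrapper entries onto the types / negated_types layout."""
    result = {"types": [], "negated_types": []}
    for entry in annotations:
        target = "negated_types" if entry["evidence"] == "excluded" else "types"
        # The nested term object is reused as-is; nothing is stripped from it.
        result[target].append(entry["term"])
    return result


wrapped = [
    {
        "evidence": "observed",
        "term": {"id": "HP:0001711", "label": "Abnormality of the left ventricle"},
    },
    {
        "evidence": "observed",
        "term": {"id": "HP:0001707", "label": "Abnormality of the right ventricle"},
    },
    {
        "evidence": "excluded",
        "term": {"id": "HP:0001714", "label": "Ventricular hypertrophy"},
    },
]

unwrapped = unwrap(wrapped)
```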

mbaudis commented 7 years ago

The structure above would be well suited to, e.g., wrapping a single disease/phenotype instance, for example when describing a biosample. Also, additional quantifiers/qualifiers could be used (quality: severe ...?), which may relate to the single terms/types rather than to the whole phenotype (e.g. the hypertrophy could be severe/mild/..., not the overall phenotype).

But this may be solved elsewhere in the schema? Here mostly talking with my GA4GH/metadata hat on...