Phenomics / ontolib

A modern Java library for working with (biological) ontologies.
https://ontolib.readthedocs.org
Other
9 stars 2 forks source link

HpoDisease design #22

Open pnrobinson opened 7 years ago

pnrobinson commented 7 years ago

The HpoDisease class now has a TermWithFrequency object. However,there are many other modifiers that a term can have including onset and any of the terms from the Clinical Modifier hierarchy. It would be more useful to have TermWithAttributeList Given that not all of the frequency data is an HPO term (i.e., sometimes we have 65% or 13/28), it might make sense to allow a term to have a frequency object as well as a list of modifiers. It would be great to have a discussion about this before actually implementing code since there area number of desiderata and we do not want to have too much duplicate functionality

holtgrewe commented 7 years ago

I agree that this requires discussion.

The idea behind the current implementation is to classify the frequencies in other formats into the HPO term, thus the very rich API of enum HpoFrequency.

What would the attribute list then be?

What about the following?

Some open points:

I think that we can here have an interface TermWithAnnotation and then an implementation HpoTermWithAnnotation that is limited to what the HPO + official annotations provided. Everytime this is extended upstream in the HPO, we can extend the API. In this case, free text annotations are probably not needed here.

Anyone who wants to have their favourite additional annotations can just implement the interface and in the implementation then use their additional favourite fields. For the core stuff, they can just delegate to HpoTermWithAnnotation.

holtgrewe commented 7 years ago

@drseb ping

pnrobinson commented 7 years ago

The Clinical Modifer section of the HPO has lotsof terms that can be used to modifer the terms of the Phenotypic abnormality section. In principle, any cross product modifier x Term can be used for an annotation. For instance, TermX -- triggered by exposure to cold TermY -- severe TermZ --acute TermW --childhood onset

The frequencies are the only such attribute that can be replaced by a numerical value and so they need to have special treatment (I think the current implementation is OK). So we might have these fields in a Term object (in addition to the other items) Term --Freuency<TermId or Double or n/m where n and m are INTs> (Frequency term number, n=0,1) --AttributeList(Any other term in the Clinical Modifier section, n=0,...)

holtgrewe commented 7 years ago

OK, then I would propose:

And HpoTermWithAnnotation is an implementation used in the HpoDisease class.

Then, we can see how far we get with this in our projects and possibly refine in the beginning of next year?

Edit: refine -> possibly refine.

drseb commented 7 years ago

Sounds good. In my implementation I made sure that whatever you use for the frequencies can be mapped to a numerical value. So I would always keep the original String value (e.g. "12 of 30") and the resulting double (i.e. 12/30). Keep in mind that the annotation file (in 90% of the annotations) will have a frequency modifier from the frequency sub ontology. I think I had the mappings to numerical values in my original code. ( @holtgrewe )