Open TomConlin opened 8 years ago
Thanks for pointing these out.
I am going to change the data in the coming months. I would like to make sure you have some mechanism to detect when, e.g., 'hallmark' is no longer called that.
Also, where do these come from:
37 <http://purl.obolibrary.org/obo/HP_0003581>
32 <http://purl.obolibrary.org/obo/HP_0003584>
7 <http://purl.obolibrary.org/obo/HP_0011462>
1 <http://purl.obolibrary.org/obo/HP_0003596>
If you are changing "hallmark" to a different word that we pass along as a literal, it should not matter. If you are changing the column hallmark is currently in to a persistent resolvable identifier, it would be good to get a heads up and samples sooner rather than later.
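One possible way to detect a renamed value such as 'hallmark' is to compare what shows up in a column against the set of values the ingest expects. This is only a minimal sketch: the function name, the column position, and the sample rows (including the value "very frequent") are hypothetical and not taken from dipper or the annotation file.

```python
import csv
from collections import Counter


def unexpected_values(lines, column, expected):
    """Count values in a tab-separated column that are not in `expected`."""
    seen = Counter()
    # QUOTE_NONE keeps quote characters in disease names as plain text.
    reader = csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if column < len(row) and row[column] not in expected:
            seen[row[column]] += 1
    return dict(seen)


# Hypothetical sample rows; the column position (2) is an assumption.
sample = [
    "OMIM\t103200\thallmark",
    "OMIM\t175800\tvery frequent",
    "OMIM\t605543\t",
]
surprises = unexpected_values(sample, column=2, expected={"hallmark", ""})
print(surprises)
```

Anything reported here would be a value that the ingest has not seen before, which is a cue to check whether the source vocabulary changed.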
where do these (objects) come from...
taking the last one, the identifier comes from phenotype_annotation.tab
file at:
http://compbio.charite.de/jenkins/job/hpo.annotations/lastStableBuild/artifact/misc/
grep 0003596 phenotype_annotation.tab
OMIM 103200 %103200 ADIPOSIS DOLOROSA;;DERCUM DISEASE HP:0003596 OMIM:103200 IEA C ADIPOSALGIA|ADIPOSE TISSUE RHEUMATISM|ADIPOSIS DOLOROSA|DERCUM'S DISEASE|LIPOMATOSIS DOLOROSA|NEUROLIPOMATOSIS|http://www.orpha.net/consor/cgi-bin/OC_Exp.php?lng=en&Expert=36397 2009.02.17 HPO
OMIM 175800 POROKERATOSIS OF MIBELLI HP:0003596 OMIM:175800 TAS C 2009.02.17 HPO:probinson
OMIM 605543 PARKINSON DISEASE 4, AUTOSOMAL DOMINANT LEWY BODY HP:0003596 OMIM:605543 IEA C 2009.02.17 HPO:probinson
OMIM 606798 BLEPHAROSPASM, BENIGN ESSENTIAL HP:0003596 OMIM:606798 TAS C 2009.02.17 HPO:probinson
OMIM 606889 ALZHEIMER DISEASE 4 HP:0003596 OMIM:606889 TAS C 2012.07.16 HPO:probinson
OMIM 615780 #615780 RETINITIS PIGMENTOSA 69; RP69 HP:0000550 OMIM:615780 TAS HP:0003596 O 2015.07.19 HPO:probinson
the prefix HP:
maps to http://purl.obolibrary.org/obo/HP_
in dipper/curie_map.yaml
https://github.com/monarch-initiative/dipper/blob/master/dipper/curie_map.yaml#L37
dipper assembles a triple asserting that a subject g2p association has predicate "onset" to a HP: term.
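The prefix expansion and triple assembly described above can be sketched roughly as follows. This is not dipper's actual code: the function names and the subject IRI are hypothetical, and only the HP: prefix mapping and the onset predicate IRI come from this thread.

```python
# Prefix mapping as quoted from dipper/curie_map.yaml in this thread.
CURIE_MAP = {"HP": "http://purl.obolibrary.org/obo/HP_"}

# Predicate IRI as it appears in the generated output discussed in this issue.
ONSET_PREDICATE = "https://monarchinitiative.org/onset"


def expand_curie(curie, curie_map):
    """Expand a CURIE such as 'HP:0003596' into a full IRI."""
    prefix, sep, local_id = curie.partition(":")
    if not sep or prefix not in curie_map:
        raise ValueError(f"cannot expand CURIE: {curie!r}")
    return curie_map[prefix] + local_id


def onset_triple(association_iri, hp_curie):
    """Render one N-Triples line linking an association to its onset term."""
    obj = expand_curie(hp_curie, CURIE_MAP)
    return f"<{association_iri}> <{ONSET_PREDICATE}> <{obj}> ."


# Hypothetical subject IRI, used for illustration only.
line = onset_triple("https://example.org/assoc/1", "HP:0003596")
print(line)
```

Expanding `HP:0003596` this way yields `http://purl.obolibrary.org/obo/HP_0003596`, which is the form of the object IRIs listed at the top of the thread.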
I do not see a cmap for hpoa in the docs folder https://github.com/monarch-initiative/dipper/tree/master/docs but there is a generated rendering of the model the ingest produces in http://data.monarchinitiative.org/dot/hpoa.dot
(the dot file is from the previous release as i am still working on the ones for the current release which is what surfaces these sorts of things)
where do these (objects) come from...
I was wondering why they don't have labels as the others have. Do you know why?
I do not see a cmap for hpoa in the docs folder https://github.com/monarch-initiative/dipper/tree/master/docs
Can you give me some background on these cmap files? Is there documentation about cmap and the usage in monarch?
TC> where do these (objects) come from... SK> I was wondering why they don't have labels as the other have. Do you know that?
likely red herring.
they would, I just did not look them up and add them beyond the top few in the list
cmaps files are how @mbrush introduced me to the monarch semantic models and (currently) serve as my primary reference when I am writing or editing an ingest script.
Although cmaps appear to be adequate for ontologist communication, they have some deficiencies from a development perspective and alternatives are welcome. One suggestion from @cmungall is SHACL https://github.com/monarch-initiative/dipper/issues/265
in general
cmaps are Concept Maps https://en.wikipedia.org/wiki/Concept_map
in specific
cmap is the software needed to view our files http://cmap.ihmc.us/
@TomConlin can we close?
No. The only change is that (cough) someone (cough) added yet another pseudo predicate that requires a proper ontological term that no one seems ready to make.
79454 <https://monarchinitiative.org/frequencyOfPhenotype>
135 <https://monarchinitiative.org/has_sex_specificity>
729 <https://monarchinitiative.org/onset>
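Counts like the ones above can be reproduced with a short tally over the predicate position. A minimal sketch, assuming the output is serialized as N-Triples (one triple per line, no whitespace inside the subject or predicate); a Turtle file would need a real RDF parser instead. The function name and the sample subjects are hypothetical.

```python
from collections import Counter


def count_predicates(lines):
    """Tally the predicate (second term) of each N-Triples line."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = line.split(None, 2)  # subject, predicate, rest of the line
        if len(parts) == 3:
            counts[parts[1]] += 1
    return counts


# Hypothetical triples; only the predicate IRIs come from this thread.
sample = [
    "<https://example.org/a1> <https://monarchinitiative.org/onset> <http://purl.obolibrary.org/obo/HP_0003581> .",
    "<https://example.org/a2> <https://monarchinitiative.org/onset> <http://purl.obolibrary.org/obo/HP_0003596> .",
    '<https://example.org/a3> <https://monarchinitiative.org/frequencyOfPhenotype> "hallmark" .',
]
counts = count_predicates(sample)
for predicate, n in counts.most_common():
    print(n, predicate)
```

Filtering the tally for IRIs that start with `https://monarchinitiative.org/` would surface pseudo predicates like these on each release.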
At least for the original ticket we could consider: frequency of phenotype: http://semanticscience.org/resource/SIO_000900 age of onset: http://purl.obolibrary.org/obo/mondo#has_onset
@mbrush
hpoa.ttl (exclusively) uses this unresolvable @base iri as a predicate 49,148 times:
https://monarchinitiative.org/frequencyOfPhenotype
followed by object literals such as:
there is a term RO_0003306 "contributes to frequency of condition" which may suffice instead.
hpoa.ttl also (exclusively) uses this unresolvable @base iri as a predicate 438 times
https://monarchinitiative.org/onset
in the form:
where the HP: OBJECT terms are
There are nearly 400 terms including the word "onset" in ontobee. Perhaps one of them could work here.
The <OBJECT> in these cases seems to me to be doing a better job of describing the type of relationship than the predicate.
Maybe
would be closer to the data.