geneontology / obographs

Basic and Advanced OBO Graphs: specification and reference implementation

Missing property nodes for preds in go basic #48

Open MrCreosote opened 4 years ago

MrCreosote commented 4 years ago

Some `pred` values in go-basic don't appear to have corresponding property nodes. I believe Seth also found some `val` values that were missing property nodes (IAO_0000277, I think, though that could be wrong).

go-basic was downloaded from the OBO Foundry a few hours ago.

~/SCIENCE/ontology/GO$ ipython
Python 3.6.8 (default, Jan 14 2019, 11:02:34) 
Type 'copyright', 'credits' or 'license' for more information
IPython 7.4.0 -- An enhanced Interactive Python. Type '?' for help.

In [1]: import json                                                             

In [2]: gobasic = json.loads(open('go-basic.json').read())                      

In [3]: g = gobasic['graphs'][0]                                                

In [4]: prop_map = {n['id']: n['lbl'] for n in g['nodes'] if n['type'] == 'PROPERTY'}

In [5]: len(prop_map)                                                           
Out[5]: 24

In [6]: # checked all edge preds against prop_map, no issues. No edges have meta
   ...:                                                                         

In [7]: def recurse_obj(obj, preds): 
   ...:     for field in obj: 
   ...:         if field == 'pred': 
   ...:             preds.add(obj['pred']) 
   ...:         elif isinstance(obj[field], list): 
   ...:             for o in obj[field]: 
   ...:                 # no lists of lists in obograph schema 
   ...:                 if isinstance(o, dict): 
   ...:                     recurse_obj(o, preds) 
   ...:         elif isinstance(obj[field], dict): 
   ...:             recurse_obj(obj[field], preds) 
   ...:                                                                         

In [8]: preds = set()                                                           

In [9]: for o in g['nodes']: 
   ...:     recurse_obj(o, preds) 
   ...:                                                                         

In [10]: for p in preds: 
    ...:     print(p, ' -> ', prop_map.get(p)) 
    ...:                                                                        
http://www.geneontology.org/formats/oboInOwl#hasScope  ->  has_scope
http://www.geneontology.org/formats/oboInOwl#hasAlternativeId  ->  has_alternative_id
hasBroadSynonym  ->  None
http://www.geneontology.org/formats/oboInOwl#hasOBONamespace  ->  has_obo_namespace
http://purl.obolibrary.org/obo/IAO_0000231  ->  None
http://www.geneontology.org/formats/oboInOwl#consider  ->  consider
http://www.geneontology.org/formats/oboInOwl#shorthand  ->  shorthand
http://www.geneontology.org/formats/oboInOwl#is_metadata_tag  ->  None
http://www.geneontology.org/formats/oboInOwl#is_class_level  ->  None
http://purl.obolibrary.org/obo/IAO_0100001  ->  term replaced by
hasRelatedSynonym  ->  None
hasExactSynonym  ->  None
hasNarrowSynonym  ->  None

In [11]: # IAO_231 is missing, which seems concerning. Not sure if the missing is_* are as concerning
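
For anyone who wants to reproduce this outside of IPython, the check in the session above can be condensed into a standalone script. This is only a minimal sketch: it assumes the same JSON layout the session relies on (a top-level `graphs` list whose entries hold `nodes` with `id`, `type`, and nested `pred` fields). The function names and the tiny sample graph are made up for illustration and are not part of obographs or go-basic.

```python
import json


def load_first_graph(path):
    """Load an obographs JSON file and return its first graph (hypothetical helper)."""
    with open(path) as handle:
        return json.load(handle)["graphs"][0]


def collect_preds(obj, preds):
    """Recursively gather every string 'pred' value in nested dicts and lists."""
    if isinstance(obj, dict):
        for key, value in obj.items():
            if key == "pred" and isinstance(value, str):
                preds.add(value)
            else:
                collect_preds(value, preds)
    elif isinstance(obj, list):
        for item in obj:
            collect_preds(item, preds)
    return preds


def missing_property_nodes(graph):
    """Return, sorted, the preds used inside nodes that have no PROPERTY node."""
    nodes = graph.get("nodes", [])
    property_ids = {n["id"] for n in nodes if n.get("type") == "PROPERTY"}
    used = collect_preds(nodes, set())
    return sorted(used - property_ids)


# Made-up graph for illustration only; the IDs are not real ontology terms.
sample_graph = {
    "nodes": [
        {"id": "http://example.org/prop/hasScope", "lbl": "has_scope", "type": "PROPERTY"},
        {
            "id": "http://example.org/term/1",
            "lbl": "example term",
            "type": "CLASS",
            "meta": {
                "synonyms": [{"pred": "hasExactSynonym", "val": "sample term"}],
                "basicPropertyValues": [
                    {"pred": "http://example.org/prop/hasScope", "val": "x"},
                    {"pred": "http://example.org/prop/undeclared", "val": "y"},
                ],
            },
        },
    ]
}

print(missing_property_nodes(sample_graph))
# ['hasExactSynonym', 'http://example.org/prop/undeclared']
```

Against the real file, `missing_property_nodes(load_first_graph('go-basic.json'))` should list the same preds that print `None` in the session, assuming the file has not changed since. Unlike the comprehension in `In [4]`, this version uses `.get("type")`, so nodes without a `type` field are skipped instead of raising a `KeyError`.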