althonos / pronto

A Python frontend to (Open Biomedical) Ontologies.
https://pronto.readthedocs.io
MIT License
retrieval of part_of children? #36

Open srobb1 opened 4 years ago

srobb1 commented 4 years ago

Hello,

I am new to pronto, and am trying to work out how to retrieve all is_a and part_of relations of a term.

Here is an example term:

[Term]
id: PLANA:0002034
name: ER membrane
def: "The lipid bilayer surrounding the endoplasmic reticulum." []
xref: GO:0005789
is_a: PLANA:0000521 ! bounding membrane of organelle
relationship: part_of PLANA:0007513 ! endoplasmic reticulum

Code that I have tried:

>>> import pronto
>>> from pronto.relationship import Relationship
>>> 
>>> ont = pronto.Ontology('plana.owl')
>>> term = ont['PLANA:0002034']
>>> term.relations
{Relationship('is_a'): [<PLANA:0000521: bounding membrane of organelle>]}

I see that part_of is a known relationship.

>>> for r in Relationship.bottomup():
...   print(r)
...
Relationship('is_a')
Relationship('part_of')
Relationship('develops_from')

How do I include relationship: part_of PLANA:0007513 ! endoplasmic reticulum in my report of the relations of my term?

Thank you, Sofia

althonos commented 4 years ago

Hi! First of all, thanks for using pronto. I am currently rewriting the library from scratch and am almost done; I expect this bug to be fixed in the v1.0 release. I'll try to publish a pre-release ASAP so that you can get what you need here.

srobb1 commented 4 years ago

@althonos Great! This is exciting. Will this bug fix also list part_of related terms with the children and rchildren methods?

Thank you! Sofia

althonos commented 4 years ago

@srobb1 : v1.0.0 is out, care to give it a try?

I reverted to the OBO semantics, so now you can collect all subclasses of a term using the subclasses method on a term; for part_of, you need to use the objects method like so:

>>> plana = pronto.Ontology("path/to/plana.obo")
>>> t = plana["PLANA:0002034"]
>>> for other in t.objects(plana.get_relationship('part_of')):
...     print(other)

This will yield (some of) the terms x that satisfy the triple (PLANA:0002034 . part_of . x). I haven't yet written the subjects method, which does the opposite (x . part_of . PLANA:0002034).
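The direction of these triples can be illustrated without pronto at all. The following is a minimal sketch over a hand-written list of triples taken from this thread; the helper names objects_of and subjects_of are hypothetical and are not part of the pronto API.

```python
# Toy triple store (subject, predicate, object). The identifiers come from
# this thread; objects_of / subjects_of are hypothetical helpers, not pronto.
TRIPLES = [
    ("PLANA:0002034", "is_a", "PLANA:0000521"),
    ("PLANA:0002034", "part_of", "PLANA:0007513"),
    ("PLANA:0000524", "part_of", "PLANA:0000526"),
]


def objects_of(subject, predicate, triples=TRIPLES):
    """Return every o such that (subject, predicate, o) is asserted."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def subjects_of(obj, predicate, triples=TRIPLES):
    """Return every s such that (s, predicate, obj) is asserted."""
    return [s for s, p, o in triples if o == obj and p == predicate]


print(objects_of("PLANA:0002034", "part_of"))   # ['PLANA:0007513']
print(subjects_of("PLANA:0007513", "part_of"))  # ['PLANA:0002034']
```

The first call answers "what is this term part of?", the second answers "what is part of this term?". Only asserted triples are matched here; nothing is inferred.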

srobb1 commented 4 years ago

Hi @althonos

Let me figure out how to update my pronto and I will certainly test this out.

srobb1 commented 4 years ago

Sorry for the delay. I can focus on this again. I am having issues creating the ontology object. It looks like there is an issue with one of my terms, zygotum:

>>> plana_path
'/Users/smr/src/ontology/master-obophenotype/20190802/planaria-ontology/src/ontology/plana.obo'
>>> plana = pronto.Ontology(plana_path)
Traceback (most recent call last):
  File "/Users/smr/anaconda3/envs/py3.7/lib/python3.7/site-packages/pronto/parser/obo.py", line 234, in _classify
    s = _cached_synonyms[obo_header]
KeyError: '"zygotum" RELATED LATIN [http://en.wikipedia.org/wiki/Zygote]'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/smr/anaconda3/envs/py3.7/lib/python3.7/site-packages/pronto/synonym.py", line 115, in __init__
    self.syn_type = SynonymType._instances[syn_type]
KeyError: 'LATIN'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/smr/anaconda3/envs/py3.7/lib/python3.7/site-packages/pronto/ontology.py", line 109, in __init__
    self.parse(handle, parser)
  File "/Users/smr/anaconda3/envs/py3.7/lib/python3.7/site-packages/pronto/ontology.py", line 224, in parse
    self.meta, self.terms, self.imports, self.typedefs = p.parse(stream)
  File "/Users/smr/anaconda3/envs/py3.7/lib/python3.7/site-packages/pronto/parser/obo.py", line 70, in parse
    terms, typedefs = cls._classify(_rawtypedef, _rawterms)
  File "/Users/smr/anaconda3/envs/py3.7/lib/python3.7/site-packages/pronto/parser/obo.py", line 236, in _classify
    s = Synonym.from_obo(obo_header, scope)
  File "/Users/smr/anaconda3/envs/py3.7/lib/python3.7/site-packages/pronto/synonym.py", line 139, in from_obo
    return cls(**groupdict)
  File "/Users/smr/anaconda3/envs/py3.7/lib/python3.7/site-packages/pronto/synonym.py", line 118, in __init__
    raise ValueError("Undefined synonym type: {}".format(syn_type))
ValueError: Undefined synonym type: LATIN

srobb1 commented 4 years ago

It turns out zygotum is a synonym of one of the terms that I import from Uberon:

'zygote stage' (UBERON:0000106)

[Screenshot: Screen Shot 2019-11-01 at 4.12.15 PM]

althonos commented 4 years ago

@srobb1 : if you installed the library with conda, you're still using the old version! I have to update the bioconda recipe to get it to build the new rewrite. Use pip to install v1.1.2!

srobb1 commented 4 years ago

Great! It is working. Now I have to figure out how to navigate the terms generated by subclasses to create a tree in this format. Wish me luck!

[Screenshot: Screen Shot 2019-11-05 at 3.27.29 PM]

srobb1 commented 4 years ago

Ok. I have a question. I think I must be doing something wrong because I am not getting the results I expect.

Here is my OBO

Term('PLANA:0000524', name='organelle membrane')

Edited this code block: I had a copy-and-paste error in the previous version.

>>> import pronto
>>> obo = 'plana.obo'
>>> ont = pronto.Ontology(obo)
>>> term = ont['PLANA:0000524']
>>> c = term.subclasses(1)
>>> for i in c:
...   print(i.name)
...
organelle membrane

I only get the organelle membrane back. Why don't I get back 'part of' some 'membrane-bounded organelle' and is_a membrane?

[Screenshot: Screen Shot 2019-11-08 at 12.00.03 PM]

althonos commented 4 years ago

The relationship is "organelle membrane" SubClassOf "membrane", so you may want to use the superclasses method, since you're actually looking for the superclasses of organelle membrane!

Furthermore, the part_of subclassing is an artifact created when translating an OBO file to OWL: you can find the other terms linked with a part_of relationship to organelle membrane using simply:

>>> ont['PLANA:0000524'].relationships[ont['part_of']]
frozenset({Term('PLANA:0000526', name='membrane-bounded organelle')})

Hope this helps!
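The difference in direction between the two traversals can be sketched in plain Python. This is only an illustration, not pronto's implementation: it assumes a single is_a edge (the one named above) stored in a dict, and the functions are stand-ins that merely borrow the names subclasses and superclasses.

```python
# Asserted is_a edges, child -> parents. Only the edge named in this thread is
# listed; the real ontology contains many more. These functions are
# illustrative stand-ins, not pronto's own methods.
IS_A = {"organelle membrane": {"membrane"}}


def _walk(start, edges):
    """Yield start, then everything reachable from it through edges."""
    seen, stack = set(), [start]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        yield current
        stack.extend(sorted(edges.get(current, ())))


def superclasses(name, is_a=IS_A):
    """Follow is_a edges upward (child -> parent), starting with the term itself."""
    return _walk(name, is_a)


def subclasses(name, is_a=IS_A):
    """Follow is_a edges downward (parent -> child), starting with the term itself."""
    inverted = {}
    for child, parents in is_a.items():
        for parent in parents:
            inverted.setdefault(parent, set()).add(child)
    return _walk(name, inverted)


print(list(subclasses("organelle membrane")))    # ['organelle membrane']
print(list(superclasses("organelle membrane")))  # ['organelle membrane', 'membrane']
```

With only this edge, walking downward from organelle membrane returns the term alone, which matches the output reported earlier; walking upward reaches membrane.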

srobb1 commented 4 years ago

I was confusing myself with the superclass and subclass one. Thanks!

I am still confused about the part of:

In your example, you know that 'PLANA:0000524' has a part_of relationship to something, and you want to find out what.

What if you have a term and you want to find everything that is part of it?

How would I find out that Term('PLANA:0000524', name='organelle membrane') is part_of Term('PLANA:0000526', name='membrane-bounded organelle') if I don't have the ID of 'organelle membrane'?
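One possible approach to this reverse lookup, offered only as a sketch: scan every term and keep those whose relationships mapping points at the term of interest. The code below assumes term objects expose a relationships mapping from a relationship to a frozenset of terms, as in the example earlier in this thread. FakeTerm and terms_related_to are hypothetical names, and the stand-in objects replace a real ontology so that the snippet runs without pronto.

```python
class FakeTerm:
    """Hypothetical stand-in for an ontology term (not a pronto class)."""

    def __init__(self, id, name):
        self.id = id
        self.name = name
        self.relationships = {}  # relationship -> frozenset of target terms


def terms_related_to(target, relationship, terms):
    """Return every term t with target in t.relationships[relationship]."""
    return [
        t for t in terms
        if target in t.relationships.get(relationship, frozenset())
    ]


# Rebuild the part_of links mentioned in this thread.
organelle = FakeTerm("PLANA:0000526", "membrane-bounded organelle")
organelle_membrane = FakeTerm("PLANA:0000524", "organelle membrane")
organelle_membrane.relationships = {"part_of": frozenset({organelle})}
er = FakeTerm("PLANA:0007513", "endoplasmic reticulum")
er_membrane = FakeTerm("PLANA:0002034", "ER membrane")
er_membrane.relationships = {"part_of": frozenset({er})}

all_terms = [organelle, organelle_membrane, er, er_membrane]
parts = terms_related_to(organelle, "part_of", all_terms)
print([t.name for t in parts])  # ['organelle membrane']
```

With a real ontology, the terms argument would be an iterable over all of its terms and the relationship argument would be the part_of relationship object. The scan is linear in the number of terms and matches only directly asserted links.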