tanghaibao / goatools

Python library to handle Gene Ontology (GO) terms
BSD 2-Clause "Simplified" License

Not able to access all the paths from a term to the root #184

Open DhananjayKimothi opened 4 years ago

DhananjayKimothi commented 4 years ago

Hi,

I am using the paths_to_top method to get all paths from a given GO term to the root, but it returns only the paths in which the terms are linked by "is_a" relations.

Is there another way to get the paths while specifying which relationships to follow?

What I actually want is the shortest path from the term to the root, where consecutive terms in the path may be linked by either "is_a" or "part_of".

example:


term = "GO:0000932"
paths = godag.paths_to_top(term)

As the figure shows, the term has more than 3 paths to the root, but len(paths) is currently 3.

dvklopfenstein commented 3 years ago

Thank you for your interest in GOATOOLS and taking your time to write us.

The paths_to_top function was created in 2013, before the obo_parser could handle optional relationships. I believe that newer path code is available elsewhere in GOATOOLS which will give you the paths to the top. Let me look into it.

The GODag class reads either no optional relationships or all optional relationships. The researcher requests them using:

from goatools.base import get_godag
godag = get_godag("go-basic.obo", optional_attrs={'relationship'})

Optional relationships are read only on request because doing so slightly slows down the reading of go-basic.obo, so we did not want to make it the default.

The GoSubDag class can then use specific optional relationships (such as part_of), all optional relationships, or no optional relationships.
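Until the path code is located, here is a minimal sketch of the search asked about above. It is not the GOATOOLS implementation. It assumes each term object exposes `item_id`, a `parents` set holding its is_a parents, and, when the DAG was loaded with optional_attrs={'relationship'}, a `relationship` dict mapping a relationship name such as "part_of" to a set of terms. The `Term` class and the function names are hypothetical stand-ins so the example runs without an OBO file.

```python
from collections import deque


class Term:
    """Toy stand-in for a GO term (hypothetical; mirrors the attributes assumed above)."""

    def __init__(self, item_id):
        self.item_id = item_id
        self.parents = set()       # is_a parents
        self.relationship = {}     # e.g. {"part_of": {Term, ...}}


def upward_neighbors(term, relationships):
    """Return is_a parents plus terms reached through the selected relationships."""
    found = set(term.parents)
    for rel in relationships:
        found |= getattr(term, "relationship", {}).get(rel, set())
    return found


def all_paths_to_root(term, relationships=("part_of",)):
    """Enumerate every upward path as a list of IDs, starting at `term`."""
    ups = upward_neighbors(term, relationships)
    if not ups:  # nothing above this term, so it is a root
        return [[term.item_id]]
    return [[term.item_id] + rest
            for up in sorted(ups, key=lambda t: t.item_id)
            for rest in all_paths_to_root(up, relationships)]


def shortest_path_to_root(term, relationships=("part_of",)):
    """Breadth-first search upward; the first root reached is the nearest one."""
    queue = deque([[term]])
    seen = {term.item_id}
    while queue:
        path = queue.popleft()
        ups = upward_neighbors(path[-1], relationships)
        if not ups:
            return [t.item_id for t in path]
        for up in sorted(ups, key=lambda t: t.item_id):
            if up.item_id not in seen:
                seen.add(up.item_id)
                queue.append(path + [up])
    return []


# Toy DAG: c is_a b is_a a is_a root, plus a shortcut c part_of a.
root, a, b, c = Term("root"), Term("a"), Term("b"), Term("c")
a.parents, b.parents, c.parents = {root}, {a}, {b}
c.relationship = {"part_of": {a}}

print(all_paths_to_root(c, relationships=()))   # is_a only
print(all_paths_to_root(c))                     # is_a and part_of
print(shortest_path_to_root(c))
```

With real data, the same functions could be called on a term taken from a GODag loaded with optional relationships, provided the attribute names assumed here match the installed GOATOOLS version. Enumerating all paths can grow quickly on deep terms, which is why the shortest path uses a breadth-first search instead of taking the minimum over every path.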

I will look into this. Thank you for your patience while I get back to you. I was working on a publication, which is now out, and there are still deadlines that I must meet for other projects.

And thanks for the great question.

I am looking into it.