MoseleyBioinformaticsLab / GOcats

A tool for categorizing Gene Ontology into subgraphs of user-defined emergent concepts

can't parse current go.obo file #13

Closed: rmflight closed this issue 3 years ago

rmflight commented 3 years ago

I've tried both the version on PyPI and this GitHub version, and I see the same error each time.

import gocats.gocats as gc
graph = gc.build_graph_interpreter("go.obo")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/rmflight/Projects/work/macleod_lab/cartilage_analysis/py/gocats/gocats/gocats.py", line 73, in build_graph_interpreter
    go_parser.parse()
  File "/home/rmflight/Projects/work/macleod_lab/cartilage_analysis/py/gocats/gocats/ontologyparser.py", line 147, in parse
    properties = self.relationship_mapping[relationship_obj.id]
KeyError: 'term_tracker_item'

I checked: a new property_value: term_tracker_item has been introduced for tracking GitHub issues around GO.

However, when I use sed to remove the lines containing this, I get a new error:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/rmflight/Projects/work/macleod_lab/cartilage_analysis/py/gocats/gocats/gocats.py", line 73, in build_graph_interpreter
    go_parser.parse()
  File "/home/rmflight/Projects/work/macleod_lab/cartilage_analysis/py/gocats/gocats/ontologyparser.py", line 147, in parse
    properties = self.relationship_mapping[relationship_obj.id]
KeyError: ''

This is all using the most current version of go.obo downloaded from http://geneontology.org/docs/download-ontology/
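
To see which relationship types a given go.obo declares, one option is to list the ids of its [Typedef] stanzas and compare them against what the parser expects. The following is a minimal sketch, not part of GOcats: the helper name typedef_ids and the sample stanzas are illustrative assumptions, and the sample is a hand-written fragment rather than an excerpt of the real file.

```python
def typedef_ids(obo_lines):
    """Return the id of every [Typedef] stanza in an iterable of OBO lines."""
    ids = []
    in_typedef = False
    for raw in obo_lines:
        line = raw.strip()
        if line.startswith("[") and line.endswith("]"):
            # A new stanza header starts; remember whether it is a Typedef.
            in_typedef = (line == "[Typedef]")
        elif in_typedef and line.startswith("id:"):
            ids.append(line[len("id:"):].strip())
    return ids


# Hand-written sample for illustration only (not copied from go.obo).
sample = """\
[Term]
id: GO:0000001
name: mitochondrion inheritance

[Typedef]
id: part_of
name: part of

[Typedef]
id: term_tracker_item
name: term tracker item
""".splitlines()

print(typedef_ids(sample))  # ['part_of', 'term_tracker_item']

# With a real file, an open file handle can be passed directly:
# with open("go.obo") as fh:
#     print(typedef_ids(fh))
```

Any id in that list that the parser has no mapping for is a candidate for the KeyError shown above.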

ehinderer commented 3 years ago

I think I see what's happening here. It seems as though they've added some additional information to each stanza, including these tracker URLs and other properties like "created_by" and "creation_date." The problem is that they've added this "term_tracker_item" as a typedef, so that's making its way into the graph interpreter as a bona fide relation, but I don't have a mapping for that in the relationship_mapping dict.

For now, I can add it into the mapping dict and treat it like the other relations that are not fully supported (like "never_in_taxon") until I can figure out what this could even be used for. It looks like metadata.
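
As a rough illustration of that approach, the sketch below registers the new id as unsupported and also falls back to "unsupported" for any id that is missing from the mapping, including an empty string. This is an assumption-laden sketch, not the actual fix: the real relationship_mapping in GOcats may be structured differently, and UNSUPPORTED and lookup_properties are hypothetical names.

```python
# Hypothetical stand-in for "not fully supported" relationship properties.
UNSUPPORTED = {"supported": False}

# Hypothetical mapping; the real GOcats dict may hold different values.
relationship_mapping = {
    "part_of": {"supported": True},
    "never_in_taxon": UNSUPPORTED,
    "term_tracker_item": UNSUPPORTED,  # newly added typedef, treated as metadata
}


def lookup_properties(relationship_id, mapping=relationship_mapping):
    """Look up a relationship id, treating unknown ids as unsupported
    instead of raising KeyError."""
    return mapping.get(relationship_id, UNSUPPORTED)


print(lookup_properties("part_of"))            # {'supported': True}
print(lookup_properties("term_tracker_item"))  # {'supported': False}
print(lookup_properties(""))                   # {'supported': False}
```

Using dict.get with a default means a future typedef added to go.obo would be skipped as unsupported rather than stopping the parse, at the cost of silently ignoring relations that might matter.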

rmflight commented 3 years ago

Yes, I think it is more like metadata that could be used to give more information about a term or to point to where discussion has occurred on GitHub, so eventually it might be nice to have.

But short term, ignoring it as unsupported is a nice solution.


ehinderer commented 3 years ago

@rmflight please check to see if the latest commit fixed the issue for you. If everything looks good, I'll close this and update the code on PyPI.

rmflight commented 3 years ago

I will check it out sometime Monday and get back to you.

Thanks @ehinderer !!

ehinderer commented 3 years ago

Any time!

rmflight commented 3 years ago

After downloading the latest commit and testing the graph reader functionality, I think it works. I got no errors when I parsed the latest version of go.obo.

ehinderer commented 3 years ago

I will update PyPI this week.