biolink / ontobio

python library for working with ontologies and ontology associations
https://ontobio.readthedocs.io/en/latest/
BSD 3-Clause "New" or "Revised" License

GPI Parsing fails on zfin.gpi #85

Open dougli1sqrd opened 7 years ago

dougli1sqrd commented 7 years ago

If one creates a GpiParser and attempts to parse the uncompressed attached zfin.gpi file below, the following error occurs:

Traceback (most recent call last):
  File "tests/test_gpiparser.py", line 43, in <module>
    run_the_zfin_thing()
  File "tests/test_gpiparser.py", line 36, in run_the_zfin_thing
    results = p.parse(open("zfin.gpi", "r"))
  File "/Users/edouglass/lbl/biolink/ontobio/ontobio/io/entityparser.py", line 41, in parse
    parsed_line, new_ents  = self.parse_line(line)
  File "/Users/edouglass/lbl/biolink/ontobio/ontobio/io/entityparser.py", line 107, in parse_line
    properties] = vals

I used this code to run it:

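from ontobio.ontol_factory import OntologyFactory
from ontobio.io.entityparser import GpiParser

def run_the_zfin_thing():
    # Load the GO ontology (not passed to the parser in this snippet)
    ont = OntologyFactory().create("go-ontology.json")
    p = GpiParser()
    p.config.remove_double_prefixes = True
    results = p.parse(open("zfin.gpi", "r"))
    for r in results:
        print(r)

    # Print the report the parser accumulated while reading the file
    print(p.report.to_markdown())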

It's not clear yet if the file is at fault or if it's the parser. But in any case the parser should handle the wrong number of columns more gracefully.
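To illustrate what "more gracefully" could mean here, below is a minimal standalone sketch, not ontobio's actual code. It checks the column count before unpacking and records an error per bad row instead of raising. The names `split_gpi_line`, `check_lines`, and `LineError` are hypothetical, and the expected count of 10 tab-separated columns is an assumption about the GPI version in use; the traceback above is cut off before the exception message, so a column mismatch is only the suspected cause.

```python
from typing import List, NamedTuple, Optional, Tuple

# Assumption: GPI 1.x rows carry 10 tab-separated columns; adjust this if
# the target format version differs.
EXPECTED_COLUMNS = 10


class LineError(NamedTuple):
    line_number: int
    message: str


def split_gpi_line(line: str, line_number: int,
                   expected: int = EXPECTED_COLUMNS
                   ) -> Tuple[Optional[List[str]], Optional[LineError]]:
    """Return (columns, None) for a well-formed row, or (None, error)."""
    vals = line.rstrip("\n").split("\t")
    if len(vals) != expected:
        msg = f"expected {expected} columns, found {len(vals)}"
        return None, LineError(line_number, msg)
    return vals, None


def check_lines(lines):
    """Collect rows and errors instead of raising on the first bad row."""
    rows, errors = [], []
    for number, line in enumerate(lines, start=1):
        # Skip header/comment lines (they start with "!") and blank lines.
        if line.startswith("!") or not line.strip():
            continue
        vals, err = split_gpi_line(line, number)
        if err is not None:
            errors.append(err)
        else:
            rows.append(vals)
    return rows, errors
```

Calling `check_lines(open("zfin.gpi"))` would then return the well-formed rows alongside a list of line numbers and messages for the malformed ones, which could be fed into a report rather than aborting the whole parse.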

Attached: zfin.gpi.zip

dougli1sqrd commented 7 years ago

So I ran goa_chicken_complex.gpi.zip using the same test, and it completes successfully. This makes me think the zfin.gpi file is off.

kltm commented 7 years ago

I'll pop over and we can take a look at them.

sierra-moxon commented 7 years ago

Is there a way I can run the validator myself? Thanks, Sierra (@ZFIN).

kltm commented 7 years ago

re #86

kltm commented 7 years ago

@sierra-moxon It should just be the python biolink/ontobio package: https://github.com/biolink/ontobio

dougli1sqrd commented 7 years ago

Not an easy way as of yet. I can try and make that happen through the command line next.

kltm commented 7 years ago

@dougli1sqrd your code above already does this, right? https://github.com/biolink/ontobio/issues/85#issue-245260839

dougli1sqrd commented 7 years ago

@kltm @sierra-moxon Yeah, it's true. If you copy that code into a file in ontobio, making sure to import those classes, you could run it. But we will want a real command-line entry point for this, and I'm working on one now.
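As a rough idea of what such an entry point might look like, here is a hedged sketch that wraps the snippet from the top of this issue in `argparse`. The program name `validate-gpi`, the `--remove-double-prefixes` flag, and the functions `build_arg_parser` and `main` are all hypothetical and not part of ontobio; only `GpiParser`, `parse`, `config.remove_double_prefixes`, and `report.to_markdown()` are taken from the code above.

```python
import argparse


def build_arg_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(
        prog="validate-gpi",  # hypothetical name, not an ontobio script
        description="Parse a GPI file and print the parser report.")
    parser.add_argument("gpi_file", help="path to an uncompressed GPI file")
    parser.add_argument("--remove-double-prefixes", action="store_true",
                        help="set config.remove_double_prefixes on the parser")
    return parser


def main(argv=None) -> int:
    args = build_arg_parser().parse_args(argv)
    # Imported lazily so the argument handling can be exercised without
    # ontobio installed.
    from ontobio.io.entityparser import GpiParser

    p = GpiParser()
    p.config.remove_double_prefixes = args.remove_double_prefixes
    with open(args.gpi_file, "r") as handle:
        for entity in p.parse(handle):
            print(entity)
    # Print the report the parser accumulated while reading the file.
    print(p.report.to_markdown())
    return 0
```

A script would call `sys.exit(main())` at the bottom, so that something like `validate-gpi zfin.gpi` prints the parsed entities followed by the markdown report.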

cmungall commented 7 years ago

Additionally, once registered we run a 'snapshot' dry-run release every day, so you'd get fast feedback that way.

cmungall commented 7 years ago

@dougli1sqrd any reason @sierra-moxon cannot just run ontobio-parse-assocs.py?