galaxy-genome-annotation / python-apollo

Python library for talking to Apollo API
MIT License

arrow load_gff3 error #55

Open pwilx666 opened 2 years ago

pwilx666 commented 2 years ago

Hi, when I try to load a gff3 file using arrow:

arrow annotations load_gff3 --test "Uncinocarpus reesii 1704 [Aug 06, 2014]" "Uncinocarpis_reesei.gff"

I get the following error message:

Traceback (most recent call last):
  File "/home/pwilk/apollo_env/lib/python3.7/site-packages/arrow/decorators.py", line 16, in custom_exception
    return wrapped(*args, **kwargs)
  File "/home/pwilk/apollo_env/lib/python3.7/site-packages/arrow/decorators.py", line 46, in str_output
    print(wrapped(*args, **kwargs))
  File "/home/pwilk/apollo_env/lib/python3.7/site-packages/arrow/commands/annotations/load_gff3.py", line 51, in cli
    return ctx.gi.annotations.load_gff3(organism, gff3, source=source, batch_size=batch_size, test=test, use_name=use_name, disable_cds_recalculation=disable_cds_recalculation, timing=timing)
  File "/home/pwilk/apollo_env/lib/python3.7/site-packages/apollo/annotations/__init__.py", line 1345, in load_gff3
    raise Exception("Organism name or id not found [" + organism + "]")
Exception: Organism name or id not found [Uncinocarpus reesii 1704 [Aug 06, 2014]]

Organism name or id not found [Uncinocarpus reesii 1704 [Aug 06, 2014]]

The organism is in Apollo and the name is correct as I can use arrow's show_organism command to display information on this organism:

arrow organisms show_organism "Uncinocarpus reesii 1704 [Aug 06, 2014]"
{
    "commonName": "Uncinocarpus reesii 1704 [Aug 06, 2014]",
    "blatdb": "/data/apollo_data/twoBit/uree1704.2bit",
    "metadata": "{\"creator\":\"32\"}",
    "annotationCount": 0,
    "currentOrganism": false,
    "obsolete": false,
    "sequences": 44,
    "directory": "/data/apollo_data/uree1704",
    "publicMode": true,
    "valid": true,
    "genomeFastaIndex": "seq/uree1704.fa.fai",
    "genus": null,
    "species": null,
    "id": 2498934,
    "nonDefaultTranslationTable": null,
    "genomeFasta": "seq/uree1704.fa"
}

I've tried this with other organisms in my local Apollo instance and they also fail. Does anyone have any idea why this is happening? I'm using Apollo v2.6.5.

hexylena commented 2 years ago

Wow! Someone else is using this, very cool :) Have you tried using the internal number of the organism? You can find this from one of the other arrow commands.
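For reference, the internal number is the "id" field in the JSON that arrow organisms show_organism prints. A minimal sketch of pulling it out, assuming that output has already been captured as a string; the organism_id helper is hypothetical and not part of arrow:

```python
import json


def organism_id(show_organism_output: str) -> int:
    """Return the internal id from the JSON printed by show_organism.

    Hypothetical helper for illustration; it only parses text and does not
    talk to Apollo.
    """
    record = json.loads(show_organism_output)
    return record["id"]


# Trimmed record in the same shape as the show_organism output in this thread.
sample = '{"commonName": "Uncinocarpus reesii 1704 [Aug 06, 2014]", "id": 2498934}'
print(organism_id(sample))  # prints 2498934
```

The value is passed to load_gff3 as a string on the command line, so it may need quoting like any other argument.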

pwilx666 commented 2 years ago

Yes, I already tried using the internal id and get the same error unfortunately:

arrow annotations load_gff3 --test "2498934" "Uncinocarpis_reesei.gff"
Traceback (most recent call last):
  File "/home/pwilk/apollo_env/lib/python3.7/site-packages/arrow/decorators.py", line 16, in custom_exception
    return wrapped(*args, **kwargs)
  File "/home/pwilk/apollo_env/lib/python3.7/site-packages/arrow/decorators.py", line 46, in str_output
    print(wrapped(*args, **kwargs))
  File "/home/pwilk/apollo_env/lib/python3.7/site-packages/arrow/commands/annotations/load_gff3.py", line 51, in cli
    return ctx.gi.annotations.load_gff3(organism, gff3, source=source, batch_size=batch_size, test=test, use_name=use_name, disable_cds_recalculation=disable_cds_recalculation, timing=timing)
  File "/home/pwilk/apollo_env/lib/python3.7/site-packages/apollo/annotations/__init__.py", line 1345, in load_gff3
    raise Exception("Organism name or id not found [" + organism + "]")
Exception: Organism name or id not found [2498934]

Organism name or id not found [2498934]

It's weird as arrow works using the same commonName and internal id for other webservices.
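One way to reason about getting the same error for both the name and the id: a check that accepts either one fails identically for both when the organism is missing from the list being searched. The sketch below is hypothetical and is not taken from the python-apollo source; find_organism_ids and the sample records (including the made-up id 1) are assumptions for illustration only.

```python
def find_organism_ids(organism, organisms):
    """Return the ids of records whose commonName or id matches `organism`.

    `organism` is the string given on the command line; `organisms` is a
    list of dicts shaped like the show_organism output. Hypothetical helper,
    not the library's implementation.
    """
    return [
        org["id"]
        for org in organisms
        if organism == org["commonName"] or organism == str(org["id"])
    ]


# Made-up list that does NOT contain the organism from this thread.
visible = [{"commonName": "E. coli K12 (gx580)", "id": 1}]

# Both forms miss, so a caller would report "not found" either way.
print(find_organism_ids("Uncinocarpus reesii 1704 [Aug 06, 2014]", visible))  # prints []
print(find_organism_ids("2498934", visible))  # prints []

# Once the record is in the list, both forms match.
visible.append({"commonName": "Uncinocarpus reesii 1704 [Aug 06, 2014]", "id": 2498934})
print(find_organism_ids("2498934", visible))  # prints [2498934]
```

If the real check works along these lines, comparing the list of organisms that arrow retrieves under each version would be a reasonable next step.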

hexylena commented 2 years ago

Very odd! You said this was a local Apollo. Do the logs from Apollo give any useful indication? (You may need to set the logging configuration in apollo-config.groovy.)

pwilx666 commented 2 years ago

Hexylena, which version of arrow are you using? I've been trying and failing to upload the gff3 using arrow version 4.2.13. I found an old virtual environment on my machine that had arrow version 4.2.7 installed, and when I tried the upload there, it worked!

hexylena commented 2 years ago

Hey @pwilx666 I tried all versions between 4.2.7 and 4.2.13 and can't reproduce locally.

(.venv) 11:47:33|[hxr@cosima:/tmp/tmp.wPYnfmZJn1]130$ pip install apollo==4.2.13
Collecting apollo==4.2.13
...
Installing collected packages: apollo
  Attempting uninstall: apollo
    Found existing installation: apollo 4.2.12
    Uninstalling apollo-4.2.12:
      Successfully uninstalled apollo-4.2.12
Successfully installed apollo-4.2.13
(.venv) 11:47:35|[hxr@cosima:/tmp/tmp.wPYnfmZJn1]$ arrow annotations load_gff3 --test 'E. coli K12 (gx580)' ~/Downloads/augustus.gff3-NC_000913.3-1..100000.gff3
test success 79 features would have been loaded
test success 79 features would have been loaded
{}

I'm not sure what's going on there for you. You could either try the versions between 4.2.7 and 4.2.13 to figure out where it breaks (pip install apollo==4.2.X), or check the Apollo logs. Maybe they would indicate the difference, because I'm looking through the code and nothing stands out :(
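Trying every release by hand gets tedious, so here is a minimal sketch of narrowing it down with a binary search. It assumes the versions are listed oldest to newest and that once a release fails, every later one fails too. first_bad_version and the stand-in check are hypothetical; a real check would install apollo==<version> into a fresh virtual environment and run the failing load_gff3 --test command there. Not every patch number in the range is guaranteed to exist on PyPI.

```python
from typing import Callable, Optional, Sequence


def first_bad_version(
    versions: Sequence[str], works: Callable[[str], bool]
) -> Optional[str]:
    """Binary search for the earliest version where works(version) is False.

    Returns None if every version works.
    """
    lo, hi = 0, len(versions)
    while lo < hi:
        mid = (lo + hi) // 2
        if works(versions[mid]):
            lo = mid + 1  # the break is somewhere after mid
        else:
            hi = mid  # mid is bad, so the first bad one is mid or earlier
    return versions[lo] if lo < len(versions) else None


candidates = [f"4.2.{patch}" for patch in range(7, 14)]  # 4.2.7 .. 4.2.13


def fake_check(version: str) -> bool:
    """Stand-in for demonstration only: pretends 4.2.10 introduced the break."""
    return int(version.rsplit(".", 1)[1]) < 10


print(first_bad_version(candidates, fake_check))  # prints 4.2.10
```

With seven candidates this needs at most three install-and-run cycles instead of seven.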

hexylena commented 2 years ago

I wondered if it was due to the [Aug ...] in the name, but no:

$ arrow annotations load_gff3 --test 'E. coli Uncinocarpus reesii 1704 [Aug 06, 2014] (gx580)' ~/Downloads/augustus.gff3-NC_000913.3-1..100000.gff3
test success 79 features would have been loaded
test success 79 features would have been loaded
{}

abretaud commented 2 years ago

Odd. Are you sure you're running arrow against the right server?