eggnogdb / eggnog-mapper

Fast genome-wide functional annotation through orthology assignment
http://eggnog-mapper.embl.de
GNU Affero General Public License v3.0

--predict_ortho acting up and writing orthologs to file not working + No GO ids appear in .annotations file #213

Closed: selveyad closed this issue 4 years ago

selveyad commented 4 years ago

Howdy,

I am receiving the error below:

Traceback (most recent call last):
  File "/home/aselvey/miniconda3/envs/eggnog/bin/emapper.py", line 1213, in <module>
    main(args)
  File "/home/aselvey/miniconda3/envs/eggnog/bin/emapper.py", line 279, in main
    dump_orthologs(seed_orthologs_file, orthologs_file, args)
  File "/home/aselvey/miniconda3/envs/eggnog/bin/emapper.py", line 812, in dump_orthologs
    for result in pool.imap(find_orthologs_per_hit, iter_hit_lines(seed_orthologs_file, args)):
  File "/home/aselvey/miniconda3/envs/eggnog/lib/python2.7/multiprocessing/pool.py", line 673, in next
    raise value
TypeError: 'NoneType' object has no attribute '__getitem__'

It occurs right after the functional annotation of refined hits finishes.
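
For readers unfamiliar with this kind of traceback: the final frame sits in pool.py because an exception raised inside a pool worker is re-raised in the parent when the next result is pulled from pool.imap. The message itself is what Python 2.7 prints when code subscripts a value that is None (Python 3 words it as "'NoneType' object is not subscriptable"). The traceback does not show which value was None, so the following is only a minimal sketch of the general pattern, not emapper's code. The names lookup, first_field, safe_first_field and run are hypothetical, and a thread-backed pool is used to keep the example self-contained.

```python
from multiprocessing.dummy import Pool  # thread-backed pool, same API as multiprocessing.Pool


def lookup(hit):
    # Hypothetical stand-in for a per-hit lookup; returns None when nothing is found.
    table = {"q1": ("q1", "ortholog_A")}
    return table.get(hit)


def first_field(hit):
    # Subscripting the result without a None check raises TypeError for unknown hits.
    return lookup(hit)[0]


def safe_first_field(hit):
    # Same lookup, but a missing record is passed through as None instead of failing.
    record = lookup(hit)
    return record[0] if record is not None else None


def run(func, hits):
    # The worker's exception surfaces here, while iterating over pool.imap.
    pool = Pool(2)
    try:
        return list(pool.imap(func, hits))
    finally:
        pool.close()
        pool.join()
```

Calling run(first_field, ["q1", "missing"]) raises the TypeError in the caller, while run(safe_first_field, ["q1", "missing"]) completes and leaves a None in place of the missing record.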

Here is my original command:

emapper.py -i ../trinity_assembly/Trinity.fasta.transdecoder_predict_all/Trinity.fasta.transdecoder.predict.all.pep --output T_td_pa_eggNOG -m diamond --cpu 16 --keep_mapping_files --target_taxa 'Lepidoptera' --target_orthologs all --predict_ortho --output_dir eggnog/

I was having trouble with the --report_orthologs option earlier. I removed it and then got the error above. I will remove --predict_ortho next and see what happens.

Also, if I want to resume a run, can I just put in the DIAMOND search results as my input file? I feel like I have been re-running the program for no reason haha.

For anyone out there who has had trouble with ete3 and the --target_taxa argument: you have to put the target taxon in single quotes with no spaces (e.g. 'Lepidoptera') for it to work.

All help is greatly appreciated!

Best Regards,

Alex

** Update: emapper runs without error if --report_orthologs and --predict_ortho are not utilized. Whenever the patch for these options comes out I would really like to utilize them.

selveyad commented 4 years ago

New issue!

There are no GO ids in the GO column of the .annotations output. The same command as above was used. I am rerunning the exact same command into a fresh output directory and will update once it is finished.

** Update: emapper did output GO ids, but only 78 of the 28266 transcripts were annotated. My BUSCO score is 95.4%, with over 80% single-copy, so this is definitely not right. At the very least, emapper should have picked up the 5000 or so BUSCO genes from the Lepidoptera set that are present in my transcriptome. Not sure what is going on here. I also ran EnTAP for annotation with the v5.0.0 proteins and the v4.5.1 database and got GO hits for 21938/28266 transcripts, which is far more reasonable.
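
A count like "78 out of 28266" can be rechecked directly from the .annotations file. The sketch below is one way to do that, under several assumptions: the file is tab-separated, the header is a comment line starting with "#", the GO column is named "GOs" (the name differs between emapper versions, so it is a parameter here), and an empty cell or "-" means no GO annotation. The function name count_go_annotated is hypothetical.

```python
import csv


def count_go_annotated(lines, go_column="GOs", empty_values=("", "-")):
    """Return (annotated, total) over the data rows of a tab-separated annotation table."""
    go_index = None
    annotated = 0
    total = 0
    for row in csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE):
        if not row:
            continue
        if row[0].startswith("#"):
            # Comment line; the one that contains go_column is treated as the header.
            if go_column in row:
                go_index = row.index(go_column)
            continue
        if go_index is None:
            raise ValueError("no header line containing %r before the data rows" % go_column)
        total += 1
        value = row[go_index].strip() if go_index < len(row) else ""
        if value not in empty_values:
            annotated += 1
    return annotated, total
```

With a real file this would be called as count_go_annotated(open(path)), with go_column set to whatever the header of that file actually uses.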

Cantalapiedra commented 4 years ago

Hi,

Could it be that you are not getting any orthologs with GO annotations for the target taxa you are using?

Cantalapiedra commented 4 years ago

> ** Update: emapper runs without error if --report_orthologs and --predict_ortho are not utilized. Whenever the patch for these options comes out I would really like to utilize them.

'--report_orthologs' should work in the "refactor" branch.