Closed limin321 closed 4 years ago
hi @limin321 , what's the emapper version you're using? Also, I think it would help the developers if you could share the problematic FFN file
Hi Ali,
Thank you so much for replying. I attached the file as Bac.txt because the .ffn extension is not supported for attachments, so I changed it to .txt. When you run a test, you can change it back to .ffn or .fasta. Here are the versions I used: diamond v0.9.24.125 and emapper-2.0.1b-2-g816e190.
I ran a file containing more than 200,000 CDS, and only around 2,000 were annotated.
Thank you so much for trying to help. Best,
Hi @limin321, sorry if there was a misunderstanding. Unfortunately I cannot help you hands-on with this, but I thought the developers could if the right input and parameters were provided (which you did). So let's wait and see if they can.
Thanks. I hope the developers see my message soon.
Hi @limin321 ,
and thank you very much @alimayy
Sorry for the late response.
If you really want to retrieve the list of orthologs I would recommend cloning the version in the "refactor" branch, which is anyway the version we are going to merge with the master one very soon.
emapper.py --version
emapper-2.0.2-rf1-87-gfcc6955 / Expected eggNOG DB version: 5.0.1 / Installed eggNOG DB version: 5.0.1 / Local diamond version: diamond version 2.0.4 / Local MMseqs2 version: 113e3212c137d026e297c7540e1fcd039f6812b1
I tested your input file with that version (using --itype CDS instead of --translate, and diamond in its default sensitivity mode) and it worked fine.
emapper.py -i Bac.txt --itype CDS -m diamond --report_orthologs --output tmp_limin --output_dir tmp_limin --cpu 10
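For a batch of input files, the same command could be assembled once per file. The following is only a sketch: `build_emapper_cmd` is a hypothetical helper, the flags are copied from the command above, and naming the output after the file stem is an assumption rather than something emapper requires. Whether emapper.py creates the output directory itself is not covered in this thread.

```python
from pathlib import Path


def build_emapper_cmd(fasta_path, cpu=10):
    """Return the argument list for one input file, mirroring the command above.

    Hypothetical helper: only the flags quoted in this thread are used.
    """
    name = Path(fasta_path).stem  # e.g. "Bac" for "Bac.txt" (naming is an assumption)
    return [
        "emapper.py",
        "-i", str(fasta_path),
        "--itype", "CDS",
        "-m", "diamond",
        "--report_orthologs",
        "--output", name,
        "--output_dir", name,
        "--cpu", str(cpu),
    ]


if __name__ == "__main__":
    # Print the command instead of running it, so nothing is executed by accident.
    print(" ".join(build_emapper_cmd("Bac.txt")))
```

Each returned list can be passed to `subprocess.run` from a loop over the input files.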
I hope this helps.
Best, Carlos
Please re-open this issue or file a new one if you need further help.
Hi Carlos,
Thank you so much. I have been trying the version in the "refactor" branch. I downloaded the source code from this link: https://github.com/eggnogdb/eggnog-mapper/releases/tag/2.0.2-rf1
Then I uploaded it to the server, because I run my data on a server. However, when I tried to check the version of emapper.py, I got the following error message. Do you have any suggestions about what I did wrong?
[limin.chen@ceres eggnog-mapper-2.0.2-rf1]$ python emapper.py --version
  File "emapper.py", line 42
    help=f'Input FASTA file containing query sequences (proteins by default; see --translate). Required unless -m {SEARCH_MODE_NO_SEARCH}')
    ^
SyntaxError: invalid syntax
Why did it not return any version information?
Best, Limin
Hi @limin321 ,
It is raising an error, and that is why you don't see the version info. The error could be due to using a python version below 3.6, and therefore not recognizing the syntax for f-strings. Which python version are you using?
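A quick way to check the interpreter before running emapper.py is sketched below. This is an illustration, not part of eggnog-mapper: `python_is_new_enough` is a hypothetical helper, and the 3.6 minimum is taken only from the f-string remark above. The check itself avoids f-strings so it also runs on older interpreters.

```python
import sys

# Assumption from the reply above: f-strings need Python 3.6 or newer.
MIN_VERSION = (3, 6)


def python_is_new_enough(version_info=None, minimum=MIN_VERSION):
    """Return True if the (major, minor) version is at least `minimum`."""
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) >= tuple(minimum)


if __name__ == "__main__":
    # %-formatting on purpose: an f-string here would fail on the old interpreters
    # this check is meant to detect.
    current = "%d.%d" % (sys.version_info[0], sys.version_info[1])
    if python_is_new_enough():
        print("Python %s is recent enough for f-strings" % current)
    else:
        print("Python %s is older than %d.%d" % ((current,) + MIN_VERSION))
```

Running it with the same `python` command used for emapper.py shows which interpreter that command actually resolves to on the server.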
Carlos
Hi Carlos,
Even though my Python version is 3.7, I still get error messages. Here are the details.
(base) KluepfelLabMBP01:eggnog-mapper-2.0.2-rf1 dklabuser$ python3.7 emapper.py --version
Traceback (most recent call last):
File "emapper.py", line 412, in
Any suggestions?
Best, Limin
Hi @limin321 ,
Did you run the script to download the eggnog-mapper databases (download_eggnog_data.py)? Maybe the error you have is because of that. The refactor version uses a new version of the database.
Best, Carlos
Hi Carlos
I also ran into issues when downloading the database. Here are my command and the error message.
[limin.chen@ceres eggnog-mapper-2.0.2-rf1]$ python3 download_eggnog_data.py
Download main annotation database? [y,n] y
Traceback (most recent call last):
File "download_eggnog_data.py", line 93, in
I don't understand: 'y' is a string, so why does it need to be defined?
Best, Limin
Hi @limin321 ,
The next line from your output is from a previous version of the refactor branch: v = eval(input("%s [%s] " % (string,','.join(valid_values) )))
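To illustrate why typing y at that prompt can complain about an undefined name: in Python 3, `input()` already returns a string, and `eval()` then tries to look that text up as a variable. The sketch below shows the effect and one possible prompt that never evaluates the reply. `ask` and `demo_eval_failure` are hypothetical names for illustration and are not the code in the refactor branch.

```python
def demo_eval_failure():
    """Show what eval() does with the bare text a user types at the prompt."""
    try:
        eval("y", {})  # same effect as eval(input(...)) when the user types y
    except NameError as err:
        return type(err).__name__
    return None


def ask(prompt, valid_values, read=input):
    """Prompt until the reply is one of valid_values; the reply is never evaluated."""
    while True:
        reply = read("%s [%s] " % (prompt, ",".join(valid_values))).strip().lower()
        if reply in valid_values:
            return reply


if __name__ == "__main__":
    print(demo_eval_failure())
```

The `read` parameter exists only so the prompt can be exercised without a terminal.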
Unfortunately, the refactor branch is under development, and the tag you downloaded is just a tag to track some changes; it should not be considered a proper release. Sorry for the inconvenience. I recommend downloading the current version of the branch with git clone:
git clone -b refactor https://github.com/eggnogdb/eggnog-mapper.git
or
git clone --single-branch --branch refactor https://github.com/eggnogdb/eggnog-mapper.git
I hope that with that version you will be able to download the databases and give it a try.
Thank you.
Best, Carlos
Hi Carlos, thank you so much for the commands. With the two commands you provided, I was able to download the database.
However, when I check the version, there is still an error:
[limin.chen@ceres eggnog-mapper]$ python3 emapper.py --version
There was an error retrieving eggnog-mapper DB data: not a valid file "/KEEP/cpgru_targetedseq/eggnog-mapper/data/eggnog.db"
Maybe you need to run download_eggnog_data.py
Traceback (most recent call last):
File "emapper.py", line 552, in
I also tried the --help argument, and it seems to work. I am testing now.
[limin.chen@ceres eggnog-mapper]$ python3 emapper.py --help
usage: emapper.py [-h] [-v] [--list_taxa] [--cpu NUM_CPU] [-i FASTA_FILE] [--itype {CDS,proteins,genome,metagenome}] [--translate] [--annotate_hits_table SEED_ORTHOLOGS_FILE] [-c FILE] [--data_dir DIR] [--genepred {search,prodigal}] [-m {diamond,mmseqs,hmmer,no_search,cache}] [--pident PIDENT] [--query_cover QUERY_COVER] [--subject_cover SUBJECT_COVER] [--evalue EVALUE] [--score SCORE] [--dmnd_db DMND_DB_FILE] [--sensmode {fast,mid-sensitive,sensitive,more-sensitive,very-sensitive,ultra-sensitive}] [--matrix {BLOSUM62,BLOSUM90,BLOSUM80,BLOSUM50,BLOSUM45,PAM250,PAM70,PAM30}]
I will let you know if it works on my data. Best, Limin
Hi Carlos,
I tested my own data using the version downloaded with your recommended command:
git clone --single-branch --branch refactor https://github.com/eggnogdb/eggnog-mapper.git
It fails with an error similar to the one I got when running the --version command.
Traceback (most recent call last):
  File "/KEEP/cpgru_targetedseq/limin/eggnog-mapper/emapper.py", line 563, in
    emapper.run(args, args.input, args.annotate_hits_table, args.cache_file)
  File "/KEEP/cpgru_targetedseq/limin/eggnog-mapper/eggnogmapper/emapper.py", line 240, in run
    self.searcher = self.search(args, queries_file)
  File "/KEEP/cpgru_targetedseq/limin/eggnog-mapper/eggnogmapper/emapper.py", line 123, in search
    pjoin(self._current_dir, self.hmm_hits_file))
  File "/KEEP/cpgru_targetedseq/limin/eggnog-mapper/eggnogmapper/search/diamond/diamond.py", line 76, in search
    return self._search(in_file, seed_orthologs_file)
  File "/KEEP/cpgru_targetedseq/limin/eggnog-mapper/eggnogmapper/search/diamond/diamond.py", line 92, in _search
    raise e
  File "/KEEP/cpgru_targetedseq/limin/eggnog-mapper/eggnogmapper/search/diamond/diamond.py", line 87, in _search
    cmd = self.run_diamond(in_file, output_file)
  File "/KEEP/cpgru_targetedseq/limin/eggnog-mapper/eggnogmapper/search/diamond/diamond.py", line 147, in run_diamond
    completed_process = subprocess.run(cmd, capture_output=True, check=True, shell=True)
  File "/software/7/apps/python_3/3.6.6/lib/python3.6/subprocess.py", line 403, in run
    with Popen(*popenargs, **kwargs) as process:
TypeError: __init__() got an unexpected keyword argument 'capture_output'
Any thoughts on this error?
Best, Limin
Hi @limin321 ,
sorry, my bad. I did some tests and it seems that current version requires at least python 3.7 ("capture_output" was added in python 3.7). You may need to install python 3.7 or greater, for example using conda:
conda create -n py370 python=3.7.0
conda activate py370
conda install biopython=1.76 psutil=5.7.0
or using pip install:
conda create -n py370 python=3.7.0
conda activate py370
pip install -r requirements.txt
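For anyone who cannot upgrade Python right away, the same capture behaviour can be had on older interpreters by passing the pipes explicitly. This is a sketch of the general `subprocess` technique, not a patch to eggnog-mapper: `run_captured` is a hypothetical helper, and unlike the call in the traceback it takes an argument list instead of using `shell=True`.

```python
import subprocess
import sys


def run_captured(cmd):
    """Run cmd (a list of arguments) and capture stdout and stderr.

    stdout=PIPE plus stderr=PIPE is what capture_output=True stands for on
    Python 3.7+, so this form also works on Python 3.5 and 3.6.
    """
    return subprocess.run(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )


# Use the current interpreter as a harmless stand-in for an external tool.
result = run_captured([sys.executable, "-c", "print('ok')"])
print(result.stdout.decode().strip())
```

Upgrading to Python 3.7 or later, as suggested above, remains the simpler route, since other parts of the code may rely on it as well.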
Best, Carlos
Hi Carlos,
When I tried the web version of eggnog, it was able to annotate the genome I submitted as AA sequences. When I try to annotate on the command line with AA sequences, I still run into the same error as before. It failed after running for 40 minutes; please see below:
[limin.chen@ceres Yub001]$ tail eggnog_error.txt
Reported 21037 pairwise alignments, 21046 HSPs.
5664 queries aligned.
Traceback (most recent call last):
  File "/KEEP/cpgru_targetedseq/limin/eggnog-mapper1/emapper.py", line 1216, in
    main(args)
  File "/KEEP/cpgru_targetedseq/limin/eggnog-mapper1/emapper.py", line 275, in main
    annotate_hits_file(seed_orthologs_file, annot_file, hmm_hits_file, args)
  File "/KEEP/cpgru_targetedseq/limin/eggnog-mapper1/emapper.py", line 753, in annotate_hits_file
    print >>ORTHOLOGS, '\t'.join(map(str, (query_name, ','.join(orthologs))))
TypeError
Which version does the online resource use: http://eggnog-mapper.embl.de/
Thank you so much. Best, Limin
The web version is using: emapper-1.0.3-35-g63c274b I would recommend you keep trying with the refactor version and python 3.7 or above.
Best, Carlos
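As background on the TypeError in the traceback quoted above: the failing line uses the Python 2 `print >>file` form. A Python 3 interpreter parses that as a right shift applied to the `print` function and raises a TypeError. The sketch below reproduces the effect and shows the Python 3 spelling. `old_style_print_fails` and `write_orthologs` are hypothetical names, and the query and ortholog values are made up for illustration.

```python
import io


def old_style_print_fails(stream):
    """Evaluate the Python 2 idiom `print >>stream` the way Python 3 parses it."""
    try:
        print >> stream  # Python 3 reads this as a shift on the print function
    except TypeError:
        return True
    return False


def write_orthologs(stream, query_name, orthologs):
    """Python 3 form of the failing line: query name, tab, comma-joined orthologs."""
    print("\t".join(map(str, (query_name, ",".join(orthologs)))), file=stream)


buf = io.StringIO()
write_orthologs(buf, "query_1", ["orth_a", "orth_b"])  # made-up example values
```

This explains the symptom only. The advice above, to keep using the refactor version with Python 3.7 or later, is still the route that worked in this thread.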
Hi Carlos,
You are right. I just tested one sample and it works with Python 3.7.4.
When I used the master branch version, there was a column called "COG Functional cat.". Now that I am using the refactor version, the output doesn't have that column. My command is below; is that because I didn't include --itype proteins? I want the output to have that column.
python ./eggnog-mapper/emapper.py -i ./Yub001.faa --data_dir /KEEP/cpgru_targetedseq/limin/eggnog-mapper/data -m diamond --report_orthologs --output Yub001 --output_dir Yub001_AA --cpu 20 --override
Thank you so much.
Hi Limin,
Not sure if columns "narr_og_cat" and "best_og_cat" (columns 7 and 10) are what you are looking for.
Best, Carlos
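To pull just those two columns out of an annotations file, something like the following could work. It is a sketch under assumptions not confirmed in this thread: that the file is tab-separated, that header or comment lines start with "#", and that the first column is the query name. `og_categories` is a hypothetical helper, and the sample row is invented placeholder data, not real emapper output.

```python
import csv
import io

# 1-based column positions quoted in the reply above.
NARR_OG_CAT_COL = 7
BEST_OG_CAT_COL = 10


def og_categories(handle):
    """Yield (query, narr_og_cat, best_og_cat) for each non-comment row."""
    for row in csv.reader(handle, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and assumed "#" comment lines
        if len(row) < BEST_OG_CAT_COL:
            continue  # skip rows too short to hold both columns
        yield row[0], row[NARR_OG_CAT_COL - 1], row[BEST_OG_CAT_COL - 1]


# Invented placeholder rows with ten tab-separated fields each.
sample = io.StringIO(
    "# comment line\n"
    + "\t".join(["q1", "c2", "c3", "c4", "c5", "c6", "K", "c8", "c9", "J"]) + "\n"
)
rows = list(og_categories(sample))
```

For a real file, replace `sample` with `open(path, newline="")`.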
Thank you so much, Carlos. Yes, those are the two I want. Just one more quick question. If the annotation in "best_og_cat" is different from "narr_og_cat", does that mean the one in "best_og_cat" is more reliable than "narr_og_cat", considering it is called "best"?
Thank you so much for all the help. Really appreciate it. Best, Limin
Hi @limin321 ,
I guess "best_og" is not a very good name. Maybe it should be called "Annotation OG" or similar.
The difference between "best_og" and "narr_og" is:
For example, if one of your queries hits a protein called "COG0012", this would be now the seed ortholog. If you search the "COG0012" protein in http://eggnog5.embl.de/ you will find that it belongs to 5 OGs. The "narr_og" would be the one from "Rhizobiaceae", whereas the "best_og" could be at the "Root", "Bacteria", "Proteobacteria", "Alphaproteobacteria" or "Rhizobiaceae" levels, depending on the --tax_scope parameter.
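The example can be written out as a toy model. The level names come from the COG0012 example above, but the selection rule in `annotation_level` is purely an assumption for illustration: this thread does not describe how --tax_scope actually picks the OG, and both function names are hypothetical.

```python
# Levels from the COG0012 example, ordered from broadest to narrowest.
LEVELS = ["Root", "Bacteria", "Proteobacteria", "Alphaproteobacteria", "Rhizobiaceae"]


def narrowest_level(levels):
    """The "narr_og" level: the most specific level in the list."""
    return levels[-1]


def annotation_level(levels, tax_scope=None):
    """Toy stand-in for the "best_og" level.

    Assumed rule, not emapper's documented behaviour: use the requested
    level if the seed ortholog has an OG there, else fall back to the narrowest.
    """
    if tax_scope in levels:
        return tax_scope
    return levels[-1]


print(narrowest_level(LEVELS))
print(annotation_level(LEVELS, "Bacteria"))
```

The point is that "narr_og" is fixed by the seed ortholog, while "best_og" moves with the chosen scope, so neither is inherently more reliable than the other.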
I hope this makes sense.
And thanks to you for your patience. I am glad to try to help.
Best, Carlos
Hi Carlos,
Thank you for this excellent example, making it easy to understand clearly. Thank you for developing these tools, making annotation life much easier.
Best Regards! Limin
Glad to help. And thanks to you. Best, Carlos
Hi, Thank you for developing this nice annotation tool.
I first tested one XX.ffn file, which contains 5618 CDS nucleotide sequences. The annotation file generated contained 5305 proteins. That is, 5618 - 5305 = 313 CDS failed to be annotated by emapper.py. Is it correct that not all CDS can be annotated?
I assumed that is the case, so I set up a batch run of 35 bacterial XX.ffn files using emapper.py. The script I used was the following.
Each genome had three output files: agro.emapper.annotations.csv, agro.emapper.annotations.orthologs, and agro.emapper.seed_orthologs
When I looked carefully, 20 of the 35 annotations.csv files were larger than 1 MB, while the other 15 were smaller than 500 kB. Some files had just one CDS annotated before the run stopped for some reason.
So I looked at the eggnog_error.txt file. One type of error message looks like this:
"Traceback (most recent call last):
File "//eggnog-mapper/emapper.py", line 1216, in
main(args)
File "/eggnog-mapper/emapper.py", line 275, in main
annotate_hits_file(seed_orthologs_file, annot_file, hmm_hits_file, args)
File "/eggnog-mapper/emapper.py", line 753, in annotate_hits_file
print >>ORTHOLOGS, '\t'.join(map(str, (query_name, ','.join(orthologs))))
TypeError
"
Does anyone have any idea what is wrong with my code? How should I fix this problem?
Thank you so much for any help.
Best, Limin