peterthorpe5 / public_scripts

collection of bioinformatic scripts

Is it possible to use the Diamond_blast_to_taxid.py script to annotate DIAMOND tabular results without GI numbers? #4

Closed LujingF closed 7 years ago

LujingF commented 7 years ago

Hi, I cannot use Diamond_blast_to_taxid.py to annotate my DIAMOND tabular results because they contain no GI numbers. NCBI stopped supporting GI numbers last year. I am using the current version of the nr database, which identifies sequences by accession number instead, so I can no longer use this script to get taxids. Is it possible to use this script to get taxids from accession numbers?

peterthorpe5 commented 7 years ago

Dear LujingF, The latest version of the program should work with accession numbers (the current NCBI format). Is this not working for you? If not, please send me the top few lines of your BLAST output.

Please update the script you are running to the current version. Also, it requires matplotlib to be installed in order to graph the top hits.

Pete

peterthorpe5 commented 7 years ago

Also, you need to update the database it looks at by running this: https://github.com/peterthorpe5/public_scripts/blob/master/Diamond_BLAST_add_taxonomic_info/prepare_accession_to_description_db.py

LujingF commented 7 years ago

Many thanks, I will try it. Thanks again @peterthorpe5

peterthorpe5 commented 7 years ago

Did that work for you?

LujingF commented 7 years ago

Sorry, I haven't tried this. I found the annotation too slow for me, and the script uses a lot of memory: my PC keeps giving me memory limit errors. So I decided to use MySQL for the species annotation instead, as b2gpipe does. Anyway, many thanks.

peterthorpe5 commented 7 years ago

Yes, the program is very RAM hungry. Glad you got it sorted. cheers, Pete
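
For anyone hitting the same memory limit, the database-backed lookup mentioned above can be sketched roughly as follows. This is not how Diamond_blast_to_taxid.py works, and it uses Python's built-in sqlite3 rather than MySQL so that it runs without a server. It assumes the mapping file has the four tab-separated columns of NCBI's prot.accession2taxid (accession, accession.version, taxid, gi) and that the subject ID is the second column of the DIAMOND tabular output. The function names `extract_accession`, `build_index` and `annotate` and the table name `acc2taxid` are made up for illustration.

```python
import sqlite3


def extract_accession(sseqid):
    """Return the accession from a subject ID.

    Accepts both the old 'gi|123|ref|WP_000000001.1|' style and the
    current bare 'WP_000000001.1' style.
    """
    parts = [p for p in sseqid.split("|") if p]
    if len(parts) >= 4 and parts[0] == "gi":
        return parts[3]
    if len(parts) >= 2:
        return parts[1]
    return sseqid


def build_index(conn, mapping_lines):
    """Load accession.version -> taxid pairs into an SQLite table.

    Assumed columns: accession, accession.version, taxid, gi.
    The header row is skipped because its taxid field is not a number.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS acc2taxid "
        "(accession TEXT PRIMARY KEY, taxid INTEGER)"
    )
    rows = (
        (fields[1], int(fields[2]))
        for fields in (line.rstrip("\n").split("\t") for line in mapping_lines)
        if len(fields) >= 3 and fields[2].isdigit()
    )
    # executemany consumes the generator row by row, so the mapping
    # file is never held in memory as a whole.
    conn.executemany("INSERT OR REPLACE INTO acc2taxid VALUES (?, ?)", rows)
    conn.commit()


def annotate(conn, blast_lines):
    """Yield each tabular hit with its taxid (or 'NA') appended."""
    query = "SELECT taxid FROM acc2taxid WHERE accession = ?"
    for line in blast_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) < 2:
            continue
        row = conn.execute(query, (extract_accession(fields[1]),)).fetchone()
        yield line + "\t" + (str(row[0]) if row else "NA")
```

In practice `conn` would be an on-disk database such as `sqlite3.connect("acc2taxid.sqlite")` (a hypothetical file name), and `mapping_lines` and `blast_lines` would be open file handles, so both files are read one line at a time. The index only needs to be built once and can be reused for later runs.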