chrisquince / DESMAN

De novo Extraction of Strains from MetAgeNomes

Could you provide a new ClassifyContigNR.py for search results from the new version of Diamond? #2

Closed liuxianghui closed 7 years ago

liuxianghui commented 7 years ago

[xianghui@merlion Map]$ python /data/xianghui/software/DESMAN/scripts/ClassifyContigNR.py final_contigs_gt1000_c10K_nr.m8 final_contigs_gt1000_c10K.len -o final_contigs_gt1000_c10K_nr -l /data/xianghui/software/DESMAN/Example/all_taxa_lineage_notnone.tsv -g /data/xianghui/software/DESMAN/Example/gi_taxid_prot.dmp
Namespace(blast_input_file='final_contigs_gt1000_c10K_nr.m8', gid_taxaid_mapping_file='/data/xianghui/software/DESMAN/Example/gi_taxid_prot.dmp', lineage_file='/data/xianghui/software/DESMAN/Example/all_taxa_lineage_notnone.tsv', output_dir='final_contigs_gt1000_c10K_nr', query_length_file='final_contigs_gt1000_c10K.len')
final_contigs_gt1000_c10K_nr.m8 None
Traceback (most recent call last):
  File "/data/xianghui/software/DESMAN/scripts/ClassifyContigNR.py", line 269, in <module>
    main(sys.argv[1:])
  File "/data/xianghui/software/DESMAN/scripts/ClassifyContigNR.py", line 149, in main
    (matches,gids) = read_blast_input(args.blast_input_file,lengths)
  File "/data/xianghui/software/DESMAN/scripts/ClassifyContigNR.py", line 43, in read_blast_input
    gid = m.group(1)
AttributeError: 'NoneType' object has no attribute 'group'

I then found that the error is caused by the format of the diamond search result. The format your script expects is:

k191_83_2 gi|973180054|gb|KUL19018.1| 71.2 73 21 0 9 81 337 409 6.6e-24 118.2

whereas my search result, produced with the latest version of diamond, looks like this:

k99_58_1 WP_061157995.1 100.0 105 0 0 1 105 195 299 4.1e-51 209.1
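
The traceback is consistent with a regular expression that expects a `gi|<number>|` prefix in the subject column and finds no match on an accession-only ID, leaving `m` as `None`. A minimal sketch of a parser that tolerates both layouts follows. The pattern and the function name `parse_subject_id` are illustrative assumptions, not the code actually used in ClassifyContigNR.py.

```python
import re

# Assumed pattern for the older NR header style: gi|973180054|gb|KUL19018.1|
GI_PATTERN = re.compile(r"gi\|(\d+)\|")


def parse_subject_id(subject):
    """Return ("gi", number) for gi-style IDs, otherwise ("accession", id)."""
    m = GI_PATTERN.search(subject)
    if m is not None:
        return ("gi", m.group(1))
    # Newer NR builds carry only an accession such as WP_061157995.1,
    # so fall back to the raw ID instead of calling .group() on None.
    return ("accession", subject)


# The two example hits from this thread, written as tab-separated m8 lines.
old_line = "k191_83_2\tgi|973180054|gb|KUL19018.1|\t71.2\t73\t21\t0\t9\t81\t337\t409\t6.6e-24\t118.2"
new_line = "k99_58_1\tWP_061157995.1\t100.0\t105\t0\t0\t1\t105\t195\t299\t4.1e-51\t209.1"

for line in (old_line, new_line):
    fields = line.split("\t")
    print(fields[0], parse_subject_id(fields[1]))
```

Note that an accession-only hit cannot be looked up in gi_taxid_prot.dmp, which is keyed by gi number, so handling the new format fully would also need an accession-to-taxid mapping.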

chrisquince commented 7 years ago

I don't think the issue is the latest version of diamond; more likely you are using a different version of the NCBI NR database. Could you rerun diamond using my database? Download it with:

wget http://nrdatabase.s3.climb.ac.uk/nr.dmnd

liuxianghui commented 7 years ago

Thanks. Yes, it must be related to the new version of the NR database, which has dropped gi numbers. I will try your NR. Xianghui
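
A quick way to confirm which NR build produced a result file is to count how many subject IDs in the second column still carry a gi prefix. The sketch below is only an illustration: the pattern and the helper name `count_id_styles` are assumptions, and the sample lines are the two hits quoted earlier in this thread.

```python
import re
from collections import Counter

# Assumed prefix of gi-style subject IDs, e.g. gi|973180054|gb|KUL19018.1|
GI_PREFIX = re.compile(r"^gi\|\d+\|")


def count_id_styles(lines):
    """Count m8 hits whose subject ID (column 2) does or does not start with gi|."""
    counts = Counter()
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        style = "gi" if GI_PREFIX.match(fields[1]) else "no_gi"
        counts[style] += 1
    return counts


sample = [
    "k191_83_2\tgi|973180054|gb|KUL19018.1|\t71.2\t73\t21\t0\t9\t81\t337\t409\t6.6e-24\t118.2",
    "k99_58_1\tWP_061157995.1\t100.0\t105\t0\t0\t1\t105\t195\t299\t4.1e-51\t209.1",
]
counts = count_id_styles(sample)
print(dict(counts))

# For a real file, pass the open handle instead of a list:
# with open("final_contigs_gt1000_c10K_nr.m8") as fh:
#     print(dict(count_id_styles(fh)))
```

If every hit falls in the `no_gi` group, the search was run against an NR build without gi numbers, and a gi-based taxonomy lookup will not work on it.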

(Sent by email in reply to chrisquince's comment of 3 Oct 2016, 3:55 AM.)