katholt / srst2

Short Read Sequence Typing for Bacterial Pathogens

VFDB_cdhit_to_csv.py error #88

Closed: Wang-Jing-NZ closed this issue 7 years ago

Wang-Jing-NZ commented 7 years ago

When I followed the instructions (https://github.com/katholt/srst2/tree/master/database_clustering) to generate a FASTA file for Salmonella, I got a KeyError when I ran "VFDB_cdhit_to_csv.py":

Traceback (most recent call last):
  File "VFDB_cdhit_to_csv.py", line 66, in <module>
    sys.exit(main())
  File "VFDB_cdhit_to_csv.py", line 59, in main
    clusterid = seq2cluster[seqID]
KeyError: 'VFG000423gb|NP_458730'

I checked the input files. The .fsa file contains the header ">VFG000423(gb|NP_458730)", and the .clstr file contains the line: 1971nt, >VFG000423(gb|NP_458... *
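For anyone hitting a similar KeyError, a quick sanity check is to compare the sequence names in the .clstr file against the FASTA headers. The sketch below is not taken from VFDB_cdhit_to_csv.py; the function names, the regular expression, and the sample header description are all assumptions made for illustration. It only shows that a name truncated in the .clstr file (as in the line quoted above) cannot match the full FASTA ID, which may or may not be related to the error in this thread.

```python
import re

# Assumed layout of a cd-hit .clstr member line, e.g.
#   "0\t1971nt, >VFG000423(gb|NP_458... *"
# The name is taken as the text between ">" and the first "...".
CLSTR_MEMBER = re.compile(r">(?P<name>.+?)\.\.\.")


def clstr_names(clstr_lines):
    """Yield (cluster_id, name) for each member line of .clstr text."""
    cluster_id = None
    for line in clstr_lines:
        line = line.strip()
        if line.startswith(">Cluster"):
            cluster_id = line.split()[1]
        elif line:
            match = CLSTR_MEMBER.search(line)
            if match:
                yield cluster_id, match.group("name")


def fasta_ids(fasta_lines):
    """Yield the first whitespace-delimited token of each FASTA header."""
    for line in fasta_lines:
        if line.startswith(">"):
            yield line[1:].split()[0]


def unmatched_names(fasta_lines, clstr_lines):
    """Return .clstr names that do not appear as FASTA IDs."""
    ids = set(fasta_ids(fasta_lines))
    return [name for _, name in clstr_names(clstr_lines) if name not in ids]


# Illustrative data: the header ID and the .clstr line come from this issue;
# the description text and the sequence are made up.
fasta = [">VFG000423(gb|NP_458730) hypothetical description", "ATGC"]
clstr = [">Cluster 0", "0\t1971nt, >VFG000423(gb|NP_458... *"]

print(unmatched_names(fasta, clstr))
```

If this reports names, the .clstr file holds shortened names rather than full IDs. In that case it may be worth checking the `-d` option of cd-hit, which sets the description length written to the .clstr file.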

(I also noticed that the instructions on this page differ from those on the main page in whether to use cd-hit or cd-hit-est.)

Am I on the right track? Any help would be much appreciated.

Jing

Wang-Jing-NZ commented 7 years ago

I did some searching and reading last night. It turns out I was using an old version of srst2, 0.1.5. I need to get the latest version, 0.2.0, and try again. Jing

Wang-Jing-NZ commented 7 years ago

It works in version 0.2.0. Closing this issue.