When I followed the instructions (https://github.com/katholt/srst2/tree/master/database_clustering) to generate a FASTA file for Salmonella, I got a KeyError when I ran "VFDB_cdhit_to_csv.py":
Traceback (most recent call last):
  File "VFDB_cdhit_to_csv.py", line 66, in <module>
    sys.exit(main())
  File "VFDB_cdhit_to_csv.py", line 59, in main
    clusterid = seq2cluster[seqID]
KeyError: 'VFG000423gb|NP_458730'
I checked the .fsa file, which has the header ">VFG000423(gb|NP_458730)", and the .clstr file, which has this line:
1971nt, >VFG000423(gb|NP_458... *
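To show the mismatch I think I am seeing, here is a minimal sketch. It is not the code from VFDB_cdhit_to_csv.py: the two helper functions and the cluster id "0" are made up for illustration, and the parsing rules are only my assumption about how the two IDs might be derived. It just shows that an ID taken from the truncated .clstr line and an ID taken from the FASTA header with the parentheses dropped cannot match.

```python
import re

# Example lines copied from the files described above.
fasta_header = ">VFG000423(gb|NP_458730)"
clstr_line = "1971nt, >VFG000423(gb|NP_458... *"


def id_from_clstr(line):
    # Hypothetical helper: take the text between '>' and the '...' truncation marker.
    match = re.search(r">(.+?)\.\.\.", line)
    return match.group(1) if match else None


def id_without_parens(header):
    # Hypothetical helper: drop '>' and the parentheses,
    # which reproduces the key reported in the KeyError.
    return header.lstrip(">").replace("(", "").replace(")", "")


clstr_id = id_from_clstr(clstr_line)
lookup_key = id_without_parens(fasta_header)

# Assumed shape of the lookup table: sequence ID -> cluster id.
seq2cluster = {clstr_id: "0"}

print("ID from .clstr line: ", clstr_id)
print("ID from FASTA header:", lookup_key)
print("Key found in table:  ", lookup_key in seq2cluster)
```

If this is close to what happens, the lookup fails because the name in the .clstr file is cut short, so the two IDs can never be equal.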
(I also found that the instructions on this page differ from the ones on the main page: cd-hit or cd-hit-est?)
Am I on the right track? Any help would be much appreciated.
I did some searching and reading last night. It turned out that I was using an old version of srst2, 0.1.5. I need to get the latest version, 0.2.0, and try again.
Jing