katholt / srst2

Short Read Sequence Typing for Bacterial Pathogens

More duplicates in EcOH.fasta #83

Closed tseemann closed 3 years ago

tseemann commented 7 years ago

E.g. different O-type, same sequence:

0       1236nt, >8__wzx__wzx-O17-Gp9__203... *
1       1236nt, >8__wzx__wzx-O77-Gp9__293... at +/100.00%
cd-hit-est -d 0 -i EcOH.fasta -c 1 -g 1 -o cdhit && less cdhit.clstr
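
For anyone wanting to cross-check the clusters without cd-hit, here is a minimal Python sketch that groups FASTA records by exact sequence and lists the headers that share one. It is an illustration, not part of srst2: the function names `read_fasta`, `duplicate_groups` and `duplicates_in_file` are made up for this example, and it only catches same-strand, full-length exact matches, so `cd-hit-est` may report clusters that this misses.

```python
from collections import defaultdict


def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines.

    The header is the first whitespace-delimited token after '>'.
    """
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:].split()[0], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def duplicate_groups(records):
    """Return lists of headers whose sequences are identical (case-insensitive).

    Assumption: only exact, same-strand, full-length matches count as duplicates.
    """
    by_seq = defaultdict(list)
    for header, seq in records:
        by_seq[seq.upper()].append(header)
    return [headers for headers in by_seq.values() if len(headers) > 1]


def duplicates_in_file(path):
    """Convenience wrapper: read a FASTA file and return its duplicate groups."""
    with open(path) as handle:
        return duplicate_groups(read_fasta(handle))
```

Calling `duplicates_in_file("EcOH.fasta")` should return one list of headers per set of identical sequences, which makes it easy to spot groups where the allele names carry different O-types.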

What did you do last time @rrwick when you removed the duplicates?

Did you actually manage to figure out the correct "O" type for the conflicts?