alantsangmb closed this 5 years ago
Hi @alantsangmb,
I don't recall exactly what happened with those antigen sequences, but I think I may have given up on assigning them O-antigens based on the serovar of the genome they were derived from.
For serogroups E1 and E4, the sequences of the different wzx and wzy alleles tend to be >99% similar, so it is difficult to confidently assign a genome to either E1 or E4 based strictly on the wzx/wzy gene results.
We're planning on updating the antigen gene databases and adding more genomic data into the next version of SISTR, so this curation issue should be fixed in the update, and hopefully, in the meantime, this shouldn't cause any issues with serovar predictions.
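To make the ">99% similar" point about E1 and E4 concrete, here is a minimal sketch of a percent-identity calculation between two already-aligned sequences. The function name `percent_identity` and the toy sequences are made up for illustration; this is not how SISTR itself compares alleles, and the real wzx/wzy alleles would need to be aligned first.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent of positions that match between two aligned, equal-length sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    # Compare case-insensitively, position by position.
    matches = sum(x == y for x, y in zip(seq_a.upper(), seq_b.upper()))
    return 100.0 * matches / len(seq_a)


# Toy data (hypothetical, not real wzx alleles): 200 bases, one mismatch.
allele_1 = "ACGT" * 50
allele_2 = allele_1[:100] + "C" + allele_1[101:]  # position 100 is "A" in allele_1

print(percent_identity(allele_1, allele_2))  # 99.5
```

With only one or two differing bases across a whole gene, a best-hit search can easily land on the allele labelled with the other serogroup, which is why the wzx/wzy result alone is weak evidence for E1 versus E4.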
I see. Thank you.
Hi, @peterk87
In the wzx.fasta file, I found that reference sequences 407 and 408 are named "407|584|1,3,19|E1" and "408|584|1,3,19|E1". I am not sure whether this is a transcription error. Should these actually refer to E4 instead?
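For anyone who wants to check the rest of the file for similar cases, here is a rough sketch that splits headers of this shape and flags ones where the antigen factors and the serogroup label disagree. It rests on several assumptions: that the headers are four `|`-separated fields, that the third field is the O-antigen factor list and the fourth is the serogroup, and that E1 corresponds to O:3,10 and E4 to O:1,3,19 (my reading of the White-Kauffmann-Le Minor scheme, worth verifying). The meaning of the second field is not known to me, so it is kept as an opaque string. All names here (`AntigenHeader`, `parse_header`, `EXPECTED_FACTORS`, `label_matches_factors`) are hypothetical and not part of SISTR.

```python
from typing import NamedTuple, Optional, Tuple


class AntigenHeader(NamedTuple):
    seq_id: str
    second_field: str          # meaning unknown; kept as-is
    factors: Tuple[str, ...]   # assumed: O-antigen factors
    serogroup: str             # assumed: serogroup label


def parse_header(header: str) -> AntigenHeader:
    """Split a header such as '407|584|1,3,19|E1' into its four fields."""
    parts = header.strip().lstrip(">").split("|")
    if len(parts) != 4:
        raise ValueError(f"expected 4 '|'-separated fields, got {len(parts)}")
    seq_id, second_field, factors, serogroup = parts
    return AntigenHeader(seq_id, second_field, tuple(factors.split(",")), serogroup)


# Assumed mapping, to be verified against the antigenic scheme before relying on it.
EXPECTED_FACTORS = {
    "E1": ("3", "10"),
    "E4": ("1", "3", "19"),
}


def label_matches_factors(h: AntigenHeader) -> Optional[bool]:
    """True/False if the serogroup is in the table above, None if it is not covered."""
    expected = EXPECTED_FACTORS.get(h.serogroup)
    if expected is None:
        return None
    return h.factors == expected


for raw in ("407|584|1,3,19|E1", "408|584|1,3,19|E1"):
    h = parse_header(raw)
    print(h.seq_id, h.serogroup, h.factors, label_matches_factors(h))
```

Under those assumptions, both headers are reported as inconsistent, which matches the suspicion that the label should be E4.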