Closed ilyavs closed 10 months ago
We usually remove these samples. My expectation would be that a contaminated sample either joins the two clusters together or forms an entirely new cluster, depending on the nature of the contamination and the specific model fit used. I would test this with some data (real or simulated), and in particular look at the core/accessory distances estimated by sketchlib to see whether they are outlying in a detectable and repeatable way.
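One way to check whether distances are "obviously outlying" is a robust outlier test on each sample's nearest-neighbour core and accessory distances. The sketch below is only an illustration under stated assumptions: it assumes the distances have already been exported into a plain dictionary, it does not call any PopPUNK or sketchlib API, the function names and the 3.5 cutoff are hypothetical choices, and the numbers are made up for demonstration rather than taken from real data.

```python
from statistics import median


def modified_z_scores(values):
    """Modified z-scores based on the median and the median absolute deviation (MAD)."""
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        # No spread at all: anything that differs from the median is treated as extreme.
        return [0.0 if v == med else float("inf") for v in values]
    return [0.6745 * abs(v - med) / mad for v in values]


def flag_outlying_samples(nearest, threshold=3.5):
    """Return sorted names of samples outlying in core OR accessory distance.

    `nearest` maps a sample name to a (core, accessory) tuple holding that
    sample's nearest-neighbour distances (a hypothetical export format).
    """
    names = list(nearest)
    core_z = modified_z_scores([nearest[n][0] for n in names])
    acc_z = modified_z_scores([nearest[n][1] for n in names])
    return sorted(
        n for n, cz, az in zip(names, core_z, acc_z)
        if cz > threshold or az > threshold
    )


# Illustrative, made-up distances; "mixed" stands in for a contaminated sample.
example = {
    "s1": (0.0010, 0.050),
    "s2": (0.0012, 0.060),
    "s3": (0.0009, 0.055),
    "s4": (0.0011, 0.050),
    "mixed": (0.0040, 0.300),
}
print(flag_outlying_samples(example))
```

Whether real mixed-strain samples separate this cleanly is exactly what would need testing with real or simulated data, and repeating the check across runs would show whether the signal is repeatable.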
Thank you
Hi, this is more of a general question. Suppose I have a FASTQ file from a contaminated sample, one that contains two or more strains of the same species. What could I expect from the assignment pipeline? Can it be used to detect the contamination? Thanks!