Griffan / VerifyBamID

VerifyBamID2: A robust tool for DNA contamination estimation from sequence reads using an ancestry-agnostic method.
http://griffan.github.io/VerifyBamID/

Can VerifyBamID identify the source of contamination? #33

Open ManavalanG opened 2 years ago

ManavalanG commented 2 years ago

Can VerifyBamID be used to identify the source of contamination? That is, can it identify the sample that is contaminating an intended sample?

For example, in a project with 100 EUR ancestry samples including sample X, let's assume that X is the source of contamination for 5 samples. VerifyBamID can be used to estimate the amount of contamination in those 5 samples, but can it also be used to identify which sample is the source of contamination?

Griffan commented 2 years ago

In principle, this can be done by iteratively querying against each of the candidate samples. In practice, how well it works would likely depend on the absolute sequencing depth and the contamination level. This functionality has not been implemented yet.
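
To make the idea of querying each candidate in turn concrete, here is a minimal Python sketch. It is not VerifyBamID functionality, and it assumes inputs this thread does not provide: known genotypes (alt-allele counts 0, 1 or 2) for the intended sample and for every candidate at a set of biallelic sites, per-site alt-read counts and depths from the contaminated sample, and a contamination fraction `alpha` (for example, the VerifyBamID estimate). All function names, variable names and numbers are hypothetical.

```python
import math


def mixture_log_likelihood(alt_counts, depths, target_gt, candidate_gt,
                           alpha, error=0.001):
    """Binomial log-likelihood (up to a constant) of the observed reads,
    assuming the reads are a mix of the intended sample (weight 1 - alpha)
    and one candidate contaminant (weight alpha)."""
    total = 0.0
    for alt, depth, g_target, g_cand in zip(alt_counts, depths,
                                            target_gt, candidate_gt):
        # Expected alt-read fraction under the two-sample mixture.
        p = (1 - alpha) * g_target / 2 + alpha * g_cand / 2
        # Fold in a symmetric sequencing error so p is never exactly 0 or 1.
        p = p * (1 - error) + (1 - p) * error
        total += alt * math.log(p) + (depth - alt) * math.log(1 - p)
    return total


def rank_candidates(alt_counts, depths, target_gt, candidates, alpha):
    """Score every candidate and return (name, score) pairs, best first."""
    scores = {
        name: mixture_log_likelihood(alt_counts, depths, target_gt, gt, alpha)
        for name, gt in candidates.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy data (made up): six sites, depth 100 each, read counts constructed to
# match a 10% contribution from sample_A.
target_gt = [0, 0, 2, 2, 1, 0]
candidates = {
    "sample_A": [2, 0, 0, 2, 1, 1],
    "sample_B": [0, 2, 2, 0, 1, 0],
}
depths = [100] * 6
alt_counts = [10, 0, 90, 100, 50, 5]

ranking = rank_candidates(alt_counts, depths, target_gt, candidates, alpha=0.1)
best_candidate = ranking[0][0]
```

With shallow depth or a very small `alpha`, the scores of different candidates move closer together, which is the depth and contamination-level caveat mentioned above.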

yfarjoun commented 4 months ago

@ManavalanG, once you have the contamination estimate, you can extract a fingerprint for the contaminating sample using Picard's ExtractFingerprint and then use CrosscheckFingerprints (also from Picard) to identify the source of contamination.
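
A rough Python sketch of how those two Picard steps might be chained. The file names, the `picard.jar` path and the helper functions are hypothetical. The option names (`CONTAMINATION`, `EXTRACT_CONTAMINATION`, `SECOND_INPUT`, `CROSSCHECK_BY`, `CROSSCHECK_MODE`) are written as I understand them from the Picard documentation, so please check them against the Picard version you have installed before relying on this.

```python
import subprocess

# Hypothetical location of the Picard jar; adjust for your installation.
PICARD = ["java", "-jar", "picard.jar"]


def extract_contaminant_cmd(bam, haplotype_map, contamination, out_vcf):
    """Build an ExtractFingerprint command that asks for the fingerprint of
    the contaminating sample rather than the contaminated one."""
    return PICARD + [
        "ExtractFingerprint",
        f"INPUT={bam}",
        f"HAPLOTYPE_MAP={haplotype_map}",
        f"CONTAMINATION={contamination}",   # e.g. the VerifyBamID estimate
        "EXTRACT_CONTAMINATION=true",
        f"OUTPUT={out_vcf}",
    ]


def crosscheck_cmd(contaminant_vcf, candidate_files, haplotype_map, out_metrics):
    """Build a CrosscheckFingerprints command comparing the contaminant
    fingerprint against every candidate sample file."""
    cmd = PICARD + ["CrosscheckFingerprints", f"INPUT={contaminant_vcf}"]
    cmd += [f"SECOND_INPUT={path}" for path in candidate_files]
    cmd += [
        f"HAPLOTYPE_MAP={haplotype_map}",
        "CROSSCHECK_BY=SAMPLE",
        "CROSSCHECK_MODE=CHECK_ALL_OTHERS",  # compare against differently named samples
        f"OUTPUT={out_metrics}",
    ]
    return cmd


def run(cmd):
    """Execute one command and raise if Picard exits with an error."""
    subprocess.run(cmd, check=True)


# Hypothetical inputs: one contaminated BAM and two candidate sources.
extract_cmd = extract_contaminant_cmd(
    "contaminated.bam", "haplotype_map.txt", 0.05, "contaminant.vcf")
check_cmd = crosscheck_cmd(
    "contaminant.vcf", ["candidate1.bam", "candidate2.bam"],
    "haplotype_map.txt", "crosscheck_metrics.txt")
# run(extract_cmd); run(check_cmd)  # uncomment once the paths are real
```

In the resulting metrics file, the candidate with the highest LOD score against the contaminant fingerprint would be the most likely source, subject to the depth and contamination-level limits discussed earlier in this thread.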