vplagnol / ExomeDepth

ExomeDepth R package for the detection of copy number variants in exomes and gene panels using high throughput DNA sequencing data.

correlation threshold #27

Closed rajitz closed 4 years ago

rajitz commented 4 years ago

Hi,

A couple of questions regarding the ExomeDepth correlation threshold 0.97:

• How did you go about getting the threshold of 0.97 for correlation of read counts between the test and reference samples? As the vignette says, if the correlation is lower than 0.97, “consider the output of ExomeDepth as less reliable (i.e. most likely a high false positive rate)”.

• Do you have any data on how well ExomeDepth performs with correlations less than 0.97, in terms of sensitivity and specificity? In other words, at what point does the performance really start to taper off?

Thanks very much, Rajat

vplagnol commented 4 years ago

Rajat,

It is a difficult question to answer. The issue with exome and targeted sequencing data in general is that there is tremendous diversity across laboratories and assays, so it is very difficult to make a generic statement that will make all users confident about their data. I looked at many exome/targeted datasets and found that, after applying the filters for common variants (the Conrad et al. dataset), I was in the range of ~150-200 CNVs per sample, which in general felt right.

However, over time, some users reported good and convincing results with lower correlations. And some targeted assays can have great count correlations but massively reduced CNV detection power, especially PCR-based assays that create some sort of saturation, hence removing the contribution of DNA input and equalising everything (which is good for some purposes, just not CNV detection).

OK, long-winded answer to suggest that, based on the technology in use in your lab, you should establish what you think is an appropriate cutoff. That 0.97 is a rule of thumb, not a definite rule in any case. I apologise if I did not make it clear enough in the documentation.
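For illustration only, the check discussed here can be sketched in plain Python. This is not ExomeDepth code or its API: the function names `pearson_correlation` and `flag_low_correlation` and the example counts are hypothetical. The `cutoff` parameter defaults to the 0.97 rule of thumb and is meant to be replaced by whatever value a lab establishes for its own assay.

```python
import math


def pearson_correlation(test_counts, reference_counts):
    """Pearson correlation between two equal-length read-count vectors."""
    if len(test_counts) != len(reference_counts):
        raise ValueError("count vectors must have the same length")
    n = len(test_counts)
    if n < 2:
        raise ValueError("need at least two target regions")
    mean_t = sum(test_counts) / n
    mean_r = sum(reference_counts) / n
    cov = sum((t - mean_t) * (r - mean_r)
              for t, r in zip(test_counts, reference_counts))
    var_t = sum((t - mean_t) ** 2 for t in test_counts)
    var_r = sum((r - mean_r) ** 2 for r in reference_counts)
    if var_t == 0 or var_r == 0:
        raise ValueError("correlation is undefined for constant counts")
    return cov / math.sqrt(var_t * var_r)


def flag_low_correlation(test_counts, reference_counts, cutoff=0.97):
    """Return (correlation, is_below_cutoff).

    The default cutoff of 0.97 is the rule of thumb from the thread;
    it is an assumption to tune per assay, not a fixed rule.
    """
    r = pearson_correlation(test_counts, reference_counts)
    return r, r < cutoff


# Hypothetical per-region read counts, invented for this example.
test = [120, 85, 300, 42, 210]
well_matched_reference = [115, 90, 310, 40, 200]
poorly_matched_reference = [200, 40, 150, 90, 100]

r_good, low_good = flag_low_correlation(test, well_matched_reference)
r_poor, low_poor = flag_low_correlation(test, poorly_matched_reference)
print(f"well matched:   r = {r_good:.3f}, below cutoff: {low_good}")
print(f"poorly matched: r = {r_poor:.3f}, below cutoff: {low_poor}")
```

A high correlation alone does not guarantee detection power, as noted above for PCR-based assays, so a check like this is best read alongside other indicators such as the number of CNV calls per sample.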

Let me know if that makes sense!

rajitz commented 4 years ago

That makes sense - thank you!