brentp / duphold

don't get DUP'ed or DEL'ed by your putative SVs.
MIT License

calling copy number based on duphold values #31

Open lee039 opened 4 years ago

lee039 commented 4 years ago

Hi,

I used duphold to filter out false positives and it works pretty well. Thanks a lot for making this nice programme!

I was thinking that duphold could be useful for studying duplications with multiple copy-number levels (usually called multi-allelic CNVs, mCNVs). For instance, a DHBFC of 2.5 would mean 5 copies, and a DHBFC of 3.5 would mean 7 copies. However, DHBFC is sometimes intermediate, e.g. 2.22, which would imply about 4.4 copies; that does not make sense, as copy numbers should be discrete.
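To make the arithmetic concrete, here is a minimal sketch of the conversion I have in mind. It assumes DHBFC can be read as a depth fold-change against a diploid (2-copy) baseline; the function name `dhbfc_to_copy_number` and the `baseline_copies` parameter are only for illustration and are not part of duphold.

```python
def dhbfc_to_copy_number(dhbfc: float, baseline_copies: int = 2) -> float:
    """Return the raw (unrounded) copy-number estimate implied by a DHBFC value.

    Assumption: DHBFC is a fold-change relative to a region carrying
    `baseline_copies` copies (2 for a diploid autosome).
    """
    return baseline_copies * dhbfc


# The three DHBFC values mentioned in this comment.
for dhbfc in (2.5, 3.5, 2.22):
    print(dhbfc, round(dhbfc_to_copy_number(dhbfc), 2))
```

The first two values land on whole copy numbers (5 and 7), while 2.22 lands at 4.44, between 4 and 5.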

So I was wondering whether you have some ideas on this... or plan to add additional features to call (diploid) copy numbers using DHBFC... :)

Lim

brentp commented 4 years ago

Hi Lim, given the variability in sequencing coverage, it's hard to call exact copy-number. You can certainly do some rounding or estimation from the duphold values, but beyond copy-number 4 (or even 3), it will get harder.
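As a rough sketch of the rounding idea (not something duphold does itself), one could round the implied copy number to the nearest integer and refuse to call it when the estimate sits too far from any integer. The function name `call_copy_number` and the 0.25 cutoff are hypothetical choices for illustration; a sensible cutoff would depend on the coverage of the sample.

```python
from typing import Optional


def call_copy_number(
    dhbfc: float,
    baseline_copies: int = 2,    # assumption: diploid baseline
    max_distance: float = 0.25,  # hypothetical cutoff, not a duphold setting
) -> Optional[int]:
    """Round a DHBFC-derived estimate to an integer copy number.

    Returns None when the estimate is more than `max_distance` away from
    the nearest integer, i.e. when the call would be ambiguous.
    """
    estimate = baseline_copies * dhbfc
    nearest = round(estimate)
    if abs(estimate - nearest) > max_distance:
        return None
    return nearest


for value in (1.45, 2.5, 2.22):
    print(value, call_copy_number(value))
```

With this cutoff, 1.45 and 2.5 are called as 3 and 5 copies, while 2.22 (an estimate of 4.44) is left uncalled.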

lee039 commented 4 years ago

Right. I did observe in my dataset that DHBFC values from high-coverage samples could easily be rounded to discrete copy numbers. However, rounding DHBFC values from low-coverage samples (to assign discrete copy numbers) becomes difficult. Thanks anyways!