ssadedin / ximmer

Ximmer is a system for CNV calling on exome and targeted genomic sequencing
http://ssadedin.github.io/ximmer/
GNU Lesser General Public License v2.1

Parameter to take relatedness in the data into account #10

Open gybeata opened 6 years ago

gybeata commented 6 years ago

Thank you for this very useful tool! We are running a few tests using XHMM, cn.MOPS and ExomeDepth on a cohort consisting of 24 individual samples and 4 related samples. Ideally, these 4 related samples should not be used to build the reference in any of the tools. Is there any way to control this through the parameters we supply to Ximmer? Thank you in advance for your answer.

ssadedin commented 6 years ago

It's a really good point, and something I am actively thinking about how to support. I have been approaching it from the broader point of view of reference sample selection. For example, for good calling on the sex chromosomes one could match samples to references of the same sex (if you have enough of them). One could also avoid related samples being used as controls for each other, as you say. More generally, one could exclude any sample as a reference if it is not well matched overall to the reference / control set.
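To make the selection rules above concrete, here is a minimal sketch in Python. It is not Ximmer code and does not use any Ximmer option: the `Sample` class, the `family` field and the `select_references` function are all hypothetical names invented for illustration. It assumes each sample carries a sex and an optional family ID, and it only covers the sex-matching and relatedness rules, not the "poorly matched overall" rule.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Sample:
    name: str
    sex: str                      # "M" or "F"
    family: Optional[str] = None  # hypothetical family ID; None means no known relatives


def select_references(target, cohort, match_sex=True):
    """Return the cohort samples that may serve as references for `target`."""
    refs = []
    for s in cohort:
        if s.name == target.name:
            continue  # a sample is never its own reference
        if target.family is not None and s.family == target.family:
            continue  # skip relatives of the target
        if match_sex and s.sex != target.sex:
            continue  # keep only same-sex references (useful for chrX / chrY)
        refs.append(s)
    return refs


# Illustrative cohort (made-up sample names): R1 and R2 are related.
cohort = [
    Sample("S1", "F"),
    Sample("S2", "M"),
    Sample("S3", "F"),
    Sample("R1", "F", family="FAM1"),
    Sample("R2", "M", family="FAM1"),
]

refs = select_references(cohort[3], cohort)  # references for R1
print([s.name for s in refs])
```

Passing `match_sex=False` would relax the sex rule, which may be necessary when the cohort is too small to provide enough same-sex references.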

The main difficulty is that the way to specify this depends on the CNV caller, so it has to be implemented separately, and differently, for each one. In some cases it boils down to executing the whole CNV calling multiple times, which might add quite a lot to the runtime and complexity of it all.
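A rough way to see the runtime cost: if a caller has to be executed once per distinct reference set, the number of runs equals the number of distinct reference sets across the cohort. The Python sketch below groups samples that share an identical reference set. The function name, the sample names and the one-run-per-reference-set assumption are all hypothetical, and none of it reflects how Ximmer or any particular caller actually works.

```python
from collections import defaultdict


def group_by_reference_set(reference_sets):
    """Group samples that share an identical reference set.

    reference_sets: dict mapping sample name -> iterable of reference names.
    Returns a dict mapping frozenset(references) -> sorted list of samples.
    """
    runs = defaultdict(list)
    for sample, refs in reference_sets.items():
        runs[frozenset(refs)].append(sample)
    return {refs: sorted(samples) for refs, samples in runs.items()}


# Hypothetical cohort: R1 and R2 are related, so neither is a reference
# for the other; every sample also excludes itself.
reference_sets = {
    "S1": ["S2", "S3", "R1", "R2"],
    "S2": ["S1", "S3", "R1", "R2"],
    "S3": ["S1", "S2", "R1", "R2"],
    "R1": ["S1", "S2", "S3"],
    "R2": ["S1", "S2", "S3"],
}

runs = group_by_reference_set(reference_sets)
for refs, samples in runs.items():
    print(f"call {samples} against references {sorted(refs)}")
print(f"{len(runs)} separate runs needed")
```

In this toy case only R1 and R2 can share a run. Every other sample ends up with its own reference set, which is how per-sample reference selection can multiply the number of caller executions.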

So unfortunately no support yet, but I'm very interested in any ideas you have about how it can best be achieved!