BoevaLab / ONCOCNV

ONCOCNV - a package to detect copy number changes in Targeted Deep Sequencing and Exome-seq data

Usability on large sample sizes #7

Open rsteinfe1 opened 6 years ago

rsteinfe1 commented 6 years ago

Hi,

First of all, thank you for supporting ONCOCNV. I'm currently trying to use ONCOCNV 6.9 on a larger sample size (~1,700 tumor-normal pairs), and I suspect this is a bit of a stretch, since reading in the data alone took about a week.

What confuses me is that, when running processControl.R (from the provided shell script), the script printed the following after about 10 minutes:

"Warning: you have both male and female samples in the control. We will try to assign sex using read coverage on chrX". I don't understand why this is a warning, since determining sex automatically is a feature described in the paper. In case this is not the intended behaviour, we already have the sex of each sample. Would it be easy to provide a sex vector containing c(0.5, 1) to the script?
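To make concrete what I mean by a sex vector, here is a minimal sketch in Python (not ONCOCNV's R code) of how such a vector could be derived from chrX coverage. It assumes that 0.5 denotes one chrX copy (male) and 1 denotes two copies (female), which is my reading of the values above. The function name `assign_sex`, the 0.75 threshold, and the toy read counts are all hypothetical and are not taken from the package.

```python
from statistics import median

def assign_sex(chrx_cov, autosome_cov, threshold=0.75):
    """Return 0.5 (one chrX copy) or 1.0 (two copies) for one sample.

    The threshold is an illustrative assumption, not ONCOCNV's rule.
    """
    # Compare typical chrX coverage with typical autosomal coverage.
    ratio = median(chrx_cov) / median(autosome_cov)
    return 0.5 if ratio < threshold else 1.0

# Toy per-amplicon read counts for two hypothetical control samples.
samples = {
    "ctrl_A": {"chrX": [48, 52, 50], "autosomes": [100, 98, 104, 101]},
    "ctrl_B": {"chrX": [99, 103, 97], "autosomes": [100, 102, 95, 99]},
}

# One entry per control sample, in the order the samples are listed.
sex_vector = [assign_sex(s["chrX"], s["autosomes"]) for s in samples.values()]
print(sex_vector)
```

A vector like this, built from our own annotations instead of from coverage, is what I would like to hand to the script.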

After printing the warning mentioned above, the script started using 800% CPU and has now been running for 24 hours straight without printing anything else. I went through the code, but I couldn't find anything that would cause this. Does fastICA() use multiple cores when run in C mode, without this being documented?

One final question: while looking for the cause of this unexpected appetite, I ran ONCOCNV on a smaller subset and noticed that line 95 of processControl.R sets NUMBEROFPC = ncont-1;

although it is set very explicitly to 3 a few lines earlier. fastICA() is then run with this variable; in my case, NUMBEROFPC=1704. What is the rationale behind this, and could it be the reason that ONCOCNV keeps running?
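To illustrate why I suspect this setting, here is a rough sketch in Python (not the package's R code) comparing the size of the square unmixing matrix that an ICA with that many components has to estimate. The cap in `choose_n_components` is only one possible reading of the earlier assignment to 3. The function name and the cap are hypothetical, and I have not confirmed that this is the intended behaviour.

```python
def choose_n_components(n_controls, max_components=3):
    """Cap the number of components; never exceed n_controls - 1.

    Hypothetical alternative to using n_controls - 1 directly.
    """
    return max(1, min(max_components, n_controls - 1))

def unmixing_matrix_entries(n_components):
    """Entries in an n_components x n_components unmixing matrix."""
    return n_components ** 2

n_controls = 1705                 # gives NUMBEROFPC = 1704, as in my run
uncapped = n_controls - 1
capped = choose_n_components(n_controls)

print(uncapped, unmixing_matrix_entries(uncapped))  # 1704 2903616
print(capped, unmixing_matrix_entries(capped))      # 3 9
```

The matrix that has to be updated in every iteration grows quadratically with the number of components, so the difference between 3 and 1704 components could plausibly account for a large part of the run time.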

This is not exactly a bug, but an answer would help me understand ONCOCNV much better. Thank you, Robert