mskilab-org / dryclean

Irons out wrinkles in noisy coverage data using robust PCA

Issue with pon.binsize #31

Closed by hraeder41 7 months ago

hraeder41 commented 7 months ago

Hello,

I am attempting to use Dryclean with the PON provided at the bottom of the README; however, I am receiving the error below:

    (Let's dryclean the genomes!)

    Loading PON...
    PON loaded
    Loading coverage
    Loading PON a.k.a detergent
    Error in if (tumor.binsize != pon.binsize & testing == FALSE) { :
      argument is of length zero
    Calls:
    2: (function () traceback(2))()
    1: dryclean_object$clean(cov = opt$input, center = opt$center, cbs = opt$cbs, cnsignif = opt$cnsignif, mc.cores = opt$cores, verbose = TRUE, use.blacklist = opt$blacklist, blacklist_path = opt$blacklist_path, germline.filter = opt$germline.filter, field = opt$field, testing = opt$testing)

When running `dryclean_object$clean` in R directly, it appears that my coverage file has a 1000 bp bin size as expected, but `pon.binsize` comes back as NULL for the PON. Do you have any insight into what may be causing this?
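For context, "argument is of length zero" is the message base R raises when an `if` condition has length zero, which is what a comparison against NULL produces. The sketch below reproduces that with stand-in values; the variable values and the `check_binsize` helper are hypothetical illustrations, not dryclean's actual code.

```r
# Hypothetical stand-ins: these are not read from a real coverage file or PON.
tumor.binsize <- 1000
pon.binsize <- NULL   # what was observed for the PON in this report
testing <- FALSE

# Comparing a number against NULL yields logical(0), not TRUE or FALSE.
cond <- tumor.binsize != pon.binsize & testing == FALSE
print(length(cond))

# Passing a zero-length condition to if() reproduces the error message.
msg <- tryCatch(
  if (cond) "mismatch" else "ok",
  error = function(e) conditionMessage(e)
)
print(msg)

# A defensive check (an assumption, not part of dryclean) that fails with a
# clearer message when the PON carries no usable bin size.
check_binsize <- function(tumor.binsize, pon.binsize) {
  if (is.null(pon.binsize) || length(pon.binsize) != 1) {
    stop("PON does not report a bin size")
  }
  isTRUE(tumor.binsize == pon.binsize)
}
```

Calling `check_binsize(1000, NULL)` stops with the explicit message instead of the less informative length-zero error.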

sebastian-brylka commented 7 months ago

Please try this PON: https://mskilab-pipeline.s3.amazonaws.com/dryclean/pon/hg19/fixed.detergent.rds

hraeder41 commented 7 months ago

Thank you very much! It looks like I am getting past this check now; however, I am now getting the error below:

    Loading PON...
    PON loaded
    Loading coverage
    Loading PON a.k.a detergent
    Let's begin, this is whole exome/genome
    Median-centering the sample
    Initializing wash cycle
    Using the detergent provided to start washing
    lambdas calculated
    calculating A and B
    calculating v and s
    Error in if (e < 1e-06 || k > maxiter) { :
      missing value where TRUE/FALSE needed
    Calls: -> wash_cycle -> apg_project
    4: (function () traceback(2))()
    3: apg_project(m.vec, U, lambda1, lambda2)
    2: wash_cycle(m.vec = m.vec, L.burnin = L.burnin, S.burnin = S.burnin, r = r, U.hat = U.hat, V.hat = V.hat, sigma.hat = sigma.hat)
    1: dryclean_object$clean(cov = opt$input, center = opt$center, cbs = opt$cbs, cnsignif = opt$cnsignif, mc.cores = opt$cores, verbose = TRUE, use.blacklist = opt$blacklist, blacklist_path = opt$blacklist_path, germline.filter = opt$germline.filter, field = opt$field, testing = opt$testing)

Is this also related to the PON, or is it more likely an issue with my coverage input?

sebastian-brylka commented 7 months ago

It is most likely an issue with the coverage input. Are you using a mask?
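As background, "missing value where TRUE/FALSE needed" is what base R reports when an `if` condition evaluates to NA. One generic way that can happen is a non-finite value in the input propagating into a convergence test. The sketch below uses a made-up numeric vector in place of real coverage and a made-up error term `e`; it is not a reconstruction of what `apg_project` computes.

```r
# Hypothetical coverage values standing in for one field of a coverage object;
# the NA and Inf mimic bins with no usable signal.
reads <- c(0.9, 1.1, NA, 1.0, Inf)

# A norm-like quantity computed from the vector inherits the NA.
e <- sqrt(sum(reads^2))
k <- 1
maxiter <- 100

# NA < 1e-06 is NA, and NA || FALSE is NA, so if() raises the error.
msg <- tryCatch(
  if (e < 1e-06 || k > maxiter) "stop" else "continue",
  error = function(err) conditionMessage(err)
)
print(msg)

# Counting non-finite entries before cleaning is a cheap sanity check.
n_bad <- sum(!is.finite(reads))
print(n_bad)
```

If such a count is nonzero for the field passed to dryclean, the input is worth inspecting before looking elsewhere.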

hraeder41 commented 7 months ago

I am not; I am using the "cov.rds" output directly from FragCounter. Also, one thing to note is that this is whole-exome data rather than whole-genome. Would that be a potential cause of the issue?

sebastian-brylka commented 7 months ago

Yes, that is probably the cause. Can you try setting the `wgs` parameter to FALSE when you initialize the PON object?

hraeder41 commented 7 months ago

I just tried this out, and it is working on my personal computer but not on our cluster, so I believe it's likely an issue outside of Dryclean itself. Thank you very much for your help!