Closed drmrgd closed 3 years ago
Hmm, so this is failing now with all versions, all segmentation functions and all samples that previously worked? That is strange... Running it without parallelization works?
Yeah, you got it right. I haven't tried without parallelization (I'll queue that up right now), but I sort of doubt that's it. I'm betting some other function that PureCN calls has been updated and is not happy with some string that's being passed to it as part of the logging process or something similar (it looks like the standard printf kind of complaint when a string is passed to a float format directive). I showed the PSCBS example above, but I get the same error with CBS too, just with a preceding log line:
INFO [2021-09-22 15:02:01] Interval weights found, will use weighted CBS.
INFO [2021-09-22 15:02:04] Loading pre-computed boundaries for DNAcopy...
Error in (function (fmt, ...) :
invalid format '%f'; use format %s for character objects
Calls: runAbsoluteCN ... flog.info -> .log_level -> layout -> do.call -> <Anonymous>
In addition: Warning message:
In .bcfHeaderAsSimpleList(header) :
duplicate keys in header will be forced to unique rownames
Execution halted
Looks like undo.SD is parsed as character. That’s the next step with logged output. Do you see anything wrong on your side?
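For anyone hitting the same message, here is a minimal sketch of how it arises in base R. It assumes the logger ends up handing the value to `sprintf` with a `%f` directive, which is what the traceback suggests; the variable `undo_sd` and its value are hypothetical stand-ins for the mis-parsed parameter.

```r
# A numeric value formats without complaint under %f.
ok <- sprintf("undo.SD: %f", 0.75)
print(ok)

# Hypothetical stand-in for a parameter that arrived as character, not numeric.
undo_sd <- "0.75"

# The same format string now fails; capture the message instead of halting.
msg <- tryCatch(
    sprintf("undo.SD: %f", undo_sd),
    error = function(e) conditionMessage(e)
)
print(msg)
```

The captured message should match the "invalid format '%f'" line in the log above, which points at the type of the argument rather than at the segmentation function.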
Yikes! That seems to be the problem. My snakefile somehow had some non-printing chars or something in it that was inputting a bad param to undo.SD. Looks like it's past that part now and chugging along nicely! Thanks for the help! It was hard to figure out from my end what the next call was and what was choking the process up.
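A quick way to hunt for stray characters like these is to flag every line containing a byte outside printable ASCII. This is only a sketch: the helper `find_suspicious_lines` and the example lines are made up for illustration, and a real check would read the workflow file instead.

```r
# Return the indices of lines containing anything other than tab or
# printable ASCII (space through tilde). Matching is done byte-wise, so
# multi-byte characters such as a non-breaking space are caught as well.
find_suspicious_lines <- function(lines) {
    which(grepl("[^\t -~]", lines, perl = TRUE, useBytes = TRUE))
}

# Hypothetical config lines; the second ends in a non-breaking space (U+00A0).
example_lines <- c(
    "undo_sd: 0.75",
    "undo_sd: 0.75\u00a0",
    "undo_sd:\t0.75"
)

flagged <- find_suspicious_lines(example_lines)
print(flagged)

# On a real file (path is an assumption):
# find_suspicious_lines(readLines("Snakefile", warn = FALSE))
```

Note that this also flags legitimate non-ASCII text, for example accented characters in comments, so the hits need a quick manual look.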
Yeah, I will add a check for it. I thought optparse would do that for me, but it looks like it only throws a warning.
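A check of that kind could look roughly like the following. This is a sketch in base R, not the check that was added to PureCN; the helper name `parse_numeric_option` and the option name passed to it are assumptions for illustration.

```r
# Convert a command-line value to a single number, failing loudly instead of
# letting a character string or NA flow further into the pipeline.
parse_numeric_option <- function(value, name) {
    parsed <- suppressWarnings(as.numeric(value))
    if (length(parsed) != 1L || is.na(parsed)) {
        stop(
            sprintf("--%s must be a single number, got '%s'",
                    name, paste(value, collapse = ",")),
            call. = FALSE
        )
    }
    parsed
}

# A clean value passes through as numeric.
good <- parse_numeric_option("0.75", "undo-sd")
print(good)

# A malformed value stops with a readable message (captured here for display).
bad <- tryCatch(
    parse_numeric_option("abc", "undo-sd"),
    error = function(e) conditionMessage(e)
)
print(bad)
```

Failing at argument parsing keeps the error next to its cause, rather than surfacing later inside a logging call.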
Regarding oversegmentation, the GATK4 segmentation could be worth a try in your case: higher purity, lots of SNPs, not a lot off-target - pretty much the opposite of what I tuned PSCBS for with our panels and cfDNA.
Thanks Markus! Always happy when the fix is simple. And thanks for the suggestion to check out GATK4 segmentation. I'll have a look to see if it'll improve the output a bit.
Hi Markus, Over the last two days I've been getting a strange error on our cluster:
At first I thought it was related to the latest dev version 1.99.31, to which I upgraded yesterday while trying to optimize some oversegmentation issues I'm having. However, I've tried with v1.23.27, the previous version I was using, which worked OK, version 1.22.2, which is the default version our cluster maintainer has installed, and v1.20.0, which is the default version that our cluster maintainer has installed for R version 4.0.5. Speaking of which, the version of R under which I was running PureCN v1.23.27 and v1.99.31 was R v4.1.0.
Judging from the last message in the log file, I think this is failing right before the call to PSCBS in runAbsoluteCN, but I can't quite figure out the call and the offending line in the code. I did attempt to run with CBS segmentation instead, just in case PSCBS was the one sending the message to the logger and causing the crash, but that didn't seem to solve it.
My guess is that there was some collateral package that was upgraded at some point, which has changed the way string formatting is working or something, but it's really hard for me to figure out. In case this serves as some kind of breadcrumb, here is the sessionInfo for the 1.99.31 attempt:
Do you have a suggestion for what's throwing this error and how I might fix it? Thanks in advance!