AstraZeneca-NGS / VarDictJava

VarDict Java port
MIT License

JAVA ERROR RUNNING VARDICTJAVA #384

Open espelpz opened 1 year ago

espelpz commented 1 year ago

Hi,

I tried running VarDict to get somatic variants but it was too slow, so I am trying to run VarDictJava instead. I cannot get it to work: every time I run VarDictJava I get this kind of error message:

WARNING: BAM index file CASO_002_1.bai is older than BAM CASO_002_1.bam
There was Exception while processing position on 55270347 on region chr7:55270300-55270348. The processing will be continued from the next position.
java.lang.NumberFormatException: For input string: "38,0"
    at java.base/jdk.internal.math.FloatingDecimal.readJavaFormatString(FloatingDecimal.java:2054)
    at java.base/jdk.internal.math.FloatingDecimal.parseDouble(FloatingDecimal.java:110)
    at java.base/java.lang.Double.parseDouble(Double.java:543)
    at com.astrazeneca.vardict.Utils.roundHalfEven(Utils.java:102)
    at com.astrazeneca.vardict.modules.ToVarsBuilder.createVariant(ToVarsBuilder.java:442)
    at com.astrazeneca.vardict.modules.ToVarsBuilder.process(ToVarsBuilder.java:144)
    at java.base/java.util.concurrent.CompletableFuture.uniApplyNow(CompletableFuture.java:680)
    at java.base/java.util.concurrent.CompletableFuture.uniApplyStage(CompletableFuture.java:658)
    at java.base/java.util.concurrent.CompletableFuture.thenApply(CompletableFuture.java:2094)
    at com.astrazeneca.vardict.modes.AbstractMode.pipeline(AbstractMode.java:51)
    at com.astrazeneca.vardict.modes.SomaticMode.processBothBamsInPipeline(SomaticMode.java:111)
    at com.astrazeneca.vardict.modes.SomaticMode.notParallel(SomaticMode.java:49)
    at com.astrazeneca.vardict.VarDictLauncher.start(VarDict

VarDictJava fails (there were 11 continued exceptions during the run).
Critical exception occurs on region: chr7:55270300-55270348, program will be stopped.
java.util.concurrent.CompletionException: java.lang.RuntimeException: java.lang.NumberFormatException: For input string: "40,0"
    at java.base/java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:314)
    at java.base/java.util.concurrent.CompletableFuture.uniApplyNow(CompletableFuture.java:683)
    at java.base/java.util.concurrent.CompletableFuture.uniApplyStage(CompletableFuture.java:658)
    at java.base/java.util.concurrent.CompletableFuture.thenApply(CompletableFuture.java:2094)
    at com.astrazeneca.vardict.modes.AbstractMode.pipeline(AbstractMode.java:51)
    at com.astrazeneca.vardict.modes.SomaticMode.processBothBamsInPipeline(SomaticMode.java:111)
    at com.astrazeneca.vardict.modes.SomaticMode.notParallel(SomaticMode.java:49)
    at com.astrazeneca.vardict.VarDictLauncher.start(VarDictLauncher.java:65)
    at com.astrazeneca.vardict.Main.main(Main.java:15)
Caused by: java.lang.RuntimeException: java.lang.NumberFormatException: For input string: "40,0"
    at com.astrazeneca.vardict.Utils.printExceptionAndContinue(Utils.java:276)
    at com.astrazeneca.vardict.modules.ToVarsBuilder.process(ToVarsBuilder.java:164)
    at java.base/java.util.concurrent.CompletableFuture.uniApplyNow(CompletableFuture.java:680)
    ... 7 more
Caused by: java.lang.NumberFormatException: For input string: "40,0"
    at java.base/jdk.internal.math.FloatingDecimal.readJavaFormatString(FloatingDecimal.java:2054)
    at java.base/jdk.internal.math.FloatingDecimal.parseDouble(FloatingDecimal.java:110)
    at java.base/java.lang.Double.parseDouble(Double.java:543)
    at com.astrazeneca.vardict.Utils.roundHalfEven(Utils.java:102)
    at com.astrazeneca.vardict.modules.ToVarsBuilder.createVariant(ToVarsBuilder.java:442)
    at com.astrazeneca.vardict.modules.ToVarsBuilder.process(ToVarsBuilder.java:144)
    ... 8 more

How could I solve this issue?

The command I ran is pasted below:

/home/espell/Programas/VarDictJava/build/install/VarDict/bin/VarDict -G /home/espell/Proyectos/referencias/GRCh38/GRCh38_aws-igenomes/references/Homo_sapiens/GATK/GRCh38/Sequence/WholeGenomeFasta/Homo_sapiens_assembly38.fasta -f 0.01 -N 1_CASO_002 -b "CASO_002_1.bam|CASO_002_0.bam" -R chr7:55270300-55270348:EGFR

Thank you in advance!

PolinaBevad commented 1 year ago

Hi Espe, it seems that you are running VarDictJava in an environment with different regional settings, i.e. a different decimal number representation. We support en-US, where floating-point numbers are formatted as 38.0 rather than 38,0. Is that the case? If so, please switch the locale to en-US and it should work.
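
For readers hitting the same trace, here is a minimal sketch of how a locale-dependent decimal separator can lead to this NumberFormatException. It is not VarDictJava's actual code: the class and method names (LocaleRoundingDemo, format, roundSafely) are hypothetical, and the assumption is that a number is first formatted with locale-sensitive symbols and then passed to Double.parseDouble, which only accepts a dot as the decimal separator.

```java
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.util.Locale;

// Hypothetical demo class; not part of VarDictJava.
final class LocaleRoundingDemo {

    // Formats a value to one decimal place using the decimal symbols of the given locale.
    static String format(double value, Locale locale) {
        DecimalFormatSymbols symbols = DecimalFormatSymbols.getInstance(locale);
        return new DecimalFormat("0.0", symbols).format(value);
    }

    // Locale-independent variant: always format with Locale.ROOT so that
    // the resulting string can be parsed back by Double.parseDouble.
    static double roundSafely(double value) {
        return Double.parseDouble(format(value, Locale.ROOT));
    }

    public static void main(String[] args) {
        // In a Spanish locale the decimal separator is a comma.
        String spanish = format(38.0, Locale.forLanguageTag("es-ES"));
        System.out.println("Formatted with es-ES: " + spanish);

        try {
            Double.parseDouble(spanish);
        } catch (NumberFormatException e) {
            // Double.parseDouble rejects the comma-separated form.
            System.out.println("Parsing failed: " + e.getMessage());
        }

        System.out.println("Locale-independent result: " + roundSafely(38.04));
    }
}
```

If changing the system-wide regional settings is not convenient, the JVM's default locale can usually also be set for a single process with the standard system properties -Duser.language=en -Duser.country=US. How to pass them through the VarDict launcher script (for example via a JAVA_OPTS environment variable) is an assumption that should be checked against the script itself.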

espelpz commented 1 year ago

Hi Polina,

Thank you so much for your reply!

Switching the regional settings worked for me, but now, a few hours after VarDict starts running, it stops with the following error:

Critical exception occurs on region: chr1:2746290-12954384, program will be stopped.
java.util.concurrent.CompletionException: java.lang.OutOfMemoryError: GC overhead limit exceeded
    at java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:273)
    at java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:280)
    at java.util.concurrent.CompletableFuture.uniApply(CompletableFuture.java:604)
    at java.util.concurrent.CompletableFuture.uniApplyStage(CompletableFuture.java:614)
    at java.util.concurrent.CompletableFuture.thenApply(CompletableFuture.java:1983)
    at com.astrazeneca.vardict.modes.AbstractMode.pipeline(AbstractMode.java:48)
    at com.astrazeneca.vardict.modes.SomaticMode.processBothBamsInPipeline(SomaticMode.java:111)
    at com.astrazeneca.vardict.modes.SomaticMode.access$000(SomaticMode.java:31)
    at com.astrazeneca.vardict.modes.SomaticMode$SomdictWorker.call(SomaticMode.java:96)
    at com.astrazeneca.vardict.modes.SomaticMode$SomdictWorker.call(SomaticMode.java:79)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.OutOfMemoryError: GC overhead limit exceeded
    at jregex.Matcher.<init>(Matcher.java:133)
    at jregex.Pattern.matcher(Pattern.java:217)
    at com.astrazeneca.vardict.modules.CigarModifier.modifyCigar(CigarModifier.java:61)
    at com.astrazeneca.vardict.modules.CigarParser.parseCigar(CigarParser.java:276)
    at com.astrazeneca.vardict.modules.CigarParser.process(CigarParser.java:88)
    at com.astrazeneca.vardict.modes.AbstractMode$$Lambda$24/755929031.apply(Unknown Source)
    at java.util.concurrent.CompletableFuture.uniApply(CompletableFuture.java:602)
    ... 11 more

real 123m1.190s
user 1220m16.336s
sys 5m39.223s

How much memory does VarDict need to perform the complete analysis? I am only getting some variants from chr1 because the analysis stops. I am running VarDict on an HPC cluster with the following parameters:

Number of desired cpus: SBATCH --cpus-per-task=10

Amount of RAM needed for this job: SBATCH --mem=800gb

The time the job will be running: SBATCH --time=100:00:00

Thank you again for your help!

Espe

PolinaBevad commented 1 year ago

The amount of memory needed depends on the sample (BAM files can be really huge, or you may be analysing large regions), but you can increase the amount of memory available to the Java process as described here: https://github.com/AstraZeneca-NGS/VarDictJava/wiki/Error-java.lang.OutOfMemoryError:-GC-overhead-limit-exceeded
Can you please try this?
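
A note for readers on the HPC side: the memory reserved by the scheduler (--mem above) is not automatically the JVM's heap limit. The heap cap comes from the JVM's -Xmx option or, if none is given, from a default chosen by the JVM or by the launcher script. The small program below is a sketch for checking which limit a JVM actually runs with. The class name HeapReport is hypothetical and the program is independent of VarDictJava; it only uses the standard Runtime API.

```java
// Hypothetical helper; not part of VarDictJava.
final class HeapReport {

    // Converts a byte count to whole mebibytes.
    static long toMiB(long bytes) {
        return bytes / (1024L * 1024L);
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        // maxMemory() is the upper bound the heap may grow to (driven by -Xmx).
        System.out.println("Max heap (MiB):   " + toMiB(rt.maxMemory()));
        // totalMemory() is what the JVM has currently reserved for the heap.
        System.out.println("Total heap (MiB): " + toMiB(rt.totalMemory()));
        // freeMemory() is the unused part of the currently reserved heap.
        System.out.println("Free heap (MiB):  " + toMiB(rt.freeMemory()));
    }
}
```

Compiling it with javac and running it with the same JVM options intended for VarDict (for example java -Xmx64g HeapReport, where 64g is only an illustrative value) shows whether the larger heap setting is actually picked up inside the job.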