Closed: jdekanter closed this issue 1 year ago.
Caused by: java.io.IOException: Broken pipe
Looks like the CollectGridssMetricsAndExtractSVReads step got killed.
I now gave it 180GB
The memory for that particular process is controlled by the --otherjvmheap command line parameter, not --jvmheap. If you're giving your job 180g, then you should set --jvmheap to 175g and --otherjvmheap to ~140g. Both need to be less than 180g, because these parameters set only the JVM heap size, and your job also needs memory for things like the JVM stack and (for --otherjvmheap) running bwa and samtools at the same time.
--jvmheap: size of JVM heap for the high-memory component of assembly and
variant calling. (Default: 30g)
--otherjvmheap: size of JVM heap for everything else. Useful to prevent
java out of memory errors when using large (>4Gb) reference genomes.
Note that some parts of assembly and variant calling use this heap
size. (Default: 4g)
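As a rough illustration of the rule above (each heap setting must stay below the memory given to the job), here is a minimal Python sketch. The helper names `parse_size` and `heaps_exceeding_limit` are hypothetical and not part of GRIDSS, and the size parsing assumes simple values such as 180g or 180GB. It checks only the "less than the job limit" condition and makes no attempt to estimate how much headroom bwa and samtools need.

```python
import re

# Binary multipliers for the size suffixes accepted by this sketch.
_UNITS = {"k": 1024, "m": 1024**2, "g": 1024**3, "t": 1024**4}


def parse_size(text: str) -> int:
    """Convert a size such as '175g' or '180GB' to a number of bytes."""
    match = re.fullmatch(r"(\d+)([kmgt])b?", text.strip().lower())
    if match is None:
        raise ValueError(f"unrecognised size: {text!r}")
    return int(match.group(1)) * _UNITS[match.group(2)]


def heaps_exceeding_limit(job_memory: str, **heaps: str) -> list[str]:
    """Return the names of heap settings that are not below the job limit.

    Each heap is compared with the limit on its own, because the two
    settings apply to different processes rather than adding up.
    """
    limit = parse_size(job_memory)
    return [name for name, size in heaps.items() if parse_size(size) >= limit]


if __name__ == "__main__":
    # Settings suggested in the reply above: both fit under a 180g job.
    print(heaps_exceeding_limit("180g", jvmheap="175g", otherjvmheap="140g"))
    # A heap equal to the job memory leaves no room for anything else.
    print(heaps_exceeding_limit("180g", jvmheap="180g", otherjvmheap="4g"))
```

With the values suggested above the first call returns an empty list, while the second call flags jvmheap.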
Hi, thank you for the very quick answer.
I have made the adjustments that you suggested, but the job is killed within 7 seconds after the start of the CollectGridssMetricsAndExtractSVReads step.
Is it really possible that the job runs out of that much memory, and so quickly? Could something other than memory be the limiting factor? I'm just using the human reference (~3G) and BAM files of 130G and 40G.
Thanks for taking the time!
Dear all,
Thank you for this great tool! Recently, when running gridss v2.13.2 as part of hmftools (https://github.com/hartwigmedical/hmftools), gridss stops at the very beginning, in the CollectGridssMetricsAndExtractSVReads step, due to an error in samtools sort: "samtools sort: can't open "/dev/stdin": Exec format error" (see the full output below). In a previous issue you suggested that it can run out of memory, but the error occurs 4 seconds after starting this step and I have now given it 180GB, so this seems unlikely to me. You also said it could be the samtools version, but I have now tried samtools versions 1.15.1, 1.16 and 1.17, and they all give the same error.
In a previous pipeline, which ran gridss v2.9.4, we did not have this issue.
Do you have any additional ideas what the problem could be? Thank you for your input. If you need more information, please let me know.
[Thu Feb 23 16:42:46 CET 2023] CollectGridssMetricsAndExtractSVReads MIN_CLIP_LENGTH=5 READ_PAIR_CONCORDANT_PERCENT=0.995 INSERT_SIZE_METRICS=[path] UNMAPPED_READS=false INCLUDE_DUPLICATES=true SV_OUTPUT=/dev/stdout GRIDSS_PROGRAM=[CollectCigarMetrics, CollectMapqMetrics, CollectTagMetrics, CollectIdsvMetrics, ReportThresholdCoverage] THRESHOLD_COVERAGE=50000 INPUT=[to/to/bam] ASSUME_SORTED=true OUTPUT=[/path/to/working/bam] FILE_EXTENSION=null PROGRAM=[CollectInsertSizeMetrics] TMP_DIR=[/path/to/working/dir] COMPRESSION_LEVEL=0 REFERENCE_SEQUENCE=[/path/to/Homo_sapiens_assembly38.fasta] MIN_INDEL_SIZE=1 CLIPPED=true INDELS=true SPLIT=true SINGLE_MAPPED_PAIRED=true DISCORDANT_READ_PAIRS=true STOP_AFTER=0 METRIC_ACCUMULATION_LEVEL=[ALL_READS] INCLUDE_UNPAIRED=false VERBOSITY=INFO QUIET=false VALIDATION_STRINGENCY=STRICT MAX_RECORDS_IN_RAM=500000 CREATE_INDEX=false CREATE_MD5_FILE=false GA4GH_CLIENT_SECRETS=client_secrets.json USE_JDK_DEFLATER=false USE_JDK_INFLATER=false
[Thu Feb 23 16:42:46 CET 2023] Executing as [user]@n0068.compute.hpc on Linux 3.10.0-1160.81.1.el7.x86_64 amd64; Java HotSpot(TM) 64-Bit Server VM 18.0.2+9-61; Deflater: Intel; Inflater: Intel; Provider GCS is not available; Picard version: 2.13.2-gridss
INFO 2023-02-23 16:42:47 SAMFileWriterFactory Unknown file extension, assuming BAM format when writing file: file:///dev/stdout
[E::hts_hopen] Failed to open file /dev/stdin
[E::hts_open_format] Failed to open file "/dev/stdin" : Exec format error
samtools sort: can't open "/dev/stdin": Exec format error
[Thu Feb 23 16:42:51 CET 2023] gridss.CollectGridssMetricsAndExtractSVReads done. Elapsed time: 0.09 minutes.
Runtime.totalMemory()=2017460224
Exception in thread "main" java.lang.RuntimeException: java.lang.RuntimeException: Exception when running gridss.cmdline.ByReadNameSinglePassSamProgram$WrappedSinglePassSamProgram
    at picard.analysis.SinglePassSamProgram.makeItSo(SinglePassSamProgram.java:253)
    at picard.analysis.SinglePassSamProgram.makeItSo(SinglePassSamProgram.java:134)
    at picard.analysis.SinglePassSamProgram.makeItSo(SinglePassSamProgram.java:126)
    at picard.analysis.CollectMultipleMetrics.doWork(CollectMultipleMetrics.java:598)
    at gridss.analysis.CollectGridssMetrics.doWork(CollectGridssMetrics.java:78)
    at picard.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:305)
    at picard.cmdline.CommandLineProgram.instanceMainWithExit(CommandLineProgram.java:196)
    at gridss.CollectGridssMetricsAndExtractSVReads.main(CollectGridssMetricsAndExtractSVReads.java:56)
Caused by: java.lang.RuntimeException: Exception when running gridss.cmdline.ByReadNameSinglePassSamProgram$WrappedSinglePassSamProgram
    at picard.analysis.SinglePassSamProgram.raiseAsyncException(SinglePassSamProgram.java:282)
    at picard.analysis.SinglePassSamProgram.asyncAcceptRead(SinglePassSamProgram.java:273)
    at picard.analysis.SinglePassSamProgram.asyncAcceptReads(SinglePassSamProgram.java:263)
    at picard.analysis.SinglePassSamProgram.makeItSo(SinglePassSamProgram.java:216)
    ... 7 more
Caused by: htsjdk.samtools.util.RuntimeIOException: Write error; BinaryCodec in writemode; streamed file (filename not available)
    at htsjdk.samtools.util.BinaryCodec.writeBytes(BinaryCodec.java:222)
    at htsjdk.samtools.util.BlockCompressedOutputStream.writeGzipBlock(BlockCompressedOutputStream.java:451)
    at htsjdk.samtools.util.BlockCompressedOutputStream.deflateBlock(BlockCompressedOutputStream.java:415)
    at htsjdk.samtools.util.BlockCompressedOutputStream.write(BlockCompressedOutputStream.java:305)
    at htsjdk.samtools.util.BinaryCodec.writeBytes(BinaryCodec.java:220)
    at htsjdk.samtools.util.BinaryCodec.writeByteBuffer(BinaryCodec.java:188)
    at htsjdk.samtools.util.BinaryCodec.writeInt(BinaryCodec.java:234)
    at htsjdk.samtools.BAMRecordCodec.encode(BAMRecordCodec.java:162)
    at htsjdk.samtools.BAMFileWriter.writeAlignment(BAMFileWriter.java:144)
    at htsjdk.samtools.SAMFileWriterImpl.addAlignment(SAMFileWriterImpl.java:185)
    at htsjdk.samtools.AsyncSAMFileWriter.synchronouslyWrite(AsyncSAMFileWriter.java:36)
    at htsjdk.samtools.AsyncSAMFileWriter.synchronouslyWrite(AsyncSAMFileWriter.java:16)
    at htsjdk.samtools.util.AbstractAsyncWriter$WriterRunnable.run(AbstractAsyncWriter.java:123)
    at java.base/java.lang.Thread.run(Thread.java:833)
Caused by: java.io.IOException: Broken pipe
    at java.base/sun.nio.ch.FileDispatcherImpl.write0(Native Method)
    at java.base/sun.nio.ch.FileDispatcherImpl.write(FileDispatcherImpl.java:62)
    at java.base/sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:137)
    at java.base/sun.nio.ch.IOUtil.write(IOUtil.java:102)
    at java.base/sun.nio.ch.IOUtil.write(IOUtil.java:72)
    at java.base/sun.nio.ch.FileChannelImpl.write(FileChannelImpl.java:288)
    at java.base/sun.nio.ch.ChannelOutputStream.writeFullyImpl(ChannelOutputStream.java:60)
    at java.base/sun.nio.ch.ChannelOutputStream.writeFully(ChannelOutputStream.java:82)
    at java.base/sun.nio.ch.ChannelOutputStream.write(ChannelOutputStream.java:122)
    at java.base/java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:81)
    at java.base/java.io.BufferedOutputStream.write(BufferedOutputStream.java:127)
    at htsjdk.samtools.util.BinaryCodec.writeBytes(BinaryCodec.java:220)
    ... 13 more
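
For anyone reading the trace above: the Java step writes to /dev/stdout and samtools sort reads from /dev/stdin, so the two are connected by a pipe. A "Broken pipe" on the writing side generally means the reading process has already exited. It does not say why that process exited. The Python sketch below reproduces that pattern with a stand-in consumer that exits immediately. It is only an illustration under the assumption of a POSIX system, it does not call GRIDSS or samtools, and the function name `write_to_exited_consumer` is made up for this example.

```python
import subprocess
import sys


def write_to_exited_consumer() -> str:
    """Write into a pipe whose reader has already exited."""
    # Stand-in for a downstream tool that fails at startup without
    # reading its standard input.
    consumer = subprocess.Popen(
        [sys.executable, "-c", "import sys; sys.exit(1)"],
        stdin=subprocess.PIPE,
        bufsize=0,  # unbuffered, so the write below hits the pipe directly
    )
    consumer.wait()  # make sure the reading end of the pipe is closed
    try:
        consumer.stdin.write(b"x" * 65536)
    except BrokenPipeError:
        # The writer only learns that nobody is reading any more; the real
        # cause has to be found in the consumer's exit code and messages.
        return f"broken pipe; consumer exit code {consumer.returncode}"
    finally:
        consumer.stdin.close()
    return "write succeeded"


if __name__ == "__main__":
    print(write_to_exited_consumer())
```

In the log above the consumer's own message is the samtools sort line reporting the Exec format error, which is printed before the Java exception, so that is the line to investigate first.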