lindenb / jvarkit

Java utilities for Bioinformatics
https://jvarkit.readthedocs.io/

sortsamrefname fails with "java.nio.file.FileSystemException: ... Too many open files" #128

Closed Martingales closed 4 years ago

Martingales commented 5 years ago

Verify

Subject of the issue

When I run sortsamrefname, it fails while merging the tmp files at the end of the run, with the following error message:

[INFO][SortSamRefName] Count:1758750002 Elapsed: 2 hours Speed: 181.60818 record/millisec.
INFO    2019-06-11 19:17:42     SortingCollection       Creating merging iterator from 11740 files
[SEVERE][SortSamRefName]java.nio.file.FileSystemException: /homes/drews01/hps2/bams_limcov/tmp/sortingcollection.655856085552212076.tmp: Too many open files
htsjdk.samtools.util.RuntimeIOException: java.nio.file.FileSystemException: /homes/drews01/hps2/bams_limcov/tmp/sortingcollection.655856085552212076.tmp: Too many open files
        at htsjdk.samtools.util.SortingCollection$FileRecordIterator.<init>(SortingCollection.java:601)
        at htsjdk.samtools.util.SortingCollection$MergingIterator.<init>(SortingCollection.java:512)
        at htsjdk.samtools.util.SortingCollection.iterator(SortingCollection.java:297)
        at com.github.lindenb.jvarkit.tools.misc.SortSamRefName.doWork(SortSamRefName.java:164)
        at com.github.lindenb.jvarkit.util.jcommander.Launcher.instanceMain(Launcher.java:756)
        at com.github.lindenb.jvarkit.util.jcommander.Launcher.instanceMainWithExit(Launcher.java:919)
        at com.github.lindenb.jvarkit.tools.misc.SortSamRefName.main(SortSamRefName.java:190)
Caused by: java.nio.file.FileSystemException: /homes/drews01/hps2/bams_limcov/tmp/sortingcollection.655856085552212076.tmp: Too many open files
        at sun.nio.fs.UnixException.translateToIOException(UnixException.java:91)
        at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
        at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)
        at sun.nio.fs.UnixFileSystemProvider.newByteChannel(UnixFileSystemProvider.java:214)
        at java.nio.file.Files.newByteChannel(Files.java:361)
        at java.nio.file.Files.newByteChannel(Files.java:407)
        at java.nio.file.spi.FileSystemProvider.newInputStream(FileSystemProvider.java:384)
        at java.nio.file.Files.newInputStream(Files.java:152)
        at htsjdk.samtools.util.SortingCollection$FileRecordIterator.<init>(SortingCollection.java:596)
        ... 6 more
[INFO][Launcher]sortsamrefname Exited with failure (-1)
[INFO][Biostar154220]. Completed. N=0. That took:0 second

Your environment

Steps to reproduce

.

Expected behaviour

I'd like to cap the read coverage as described here: https://www.biostars.org/p/154220/

Actual behaviour

sortsamrefname fails.

lindenb commented 5 years ago

the error is here:

Too many open files

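Your file is too big for the external sorting (https://en.wikipedia.org/wiki/External_sorting): the log shows the sort spilled to 11740 temporary files, and the final merge opens all of them at once.

Try increasing the Java memory with something like java -Xmx5g ... and the number of records held in RAM with --maxRecordsInRam (see http://lindenb.github.io/jvarkit/SortSamRefName.html), so that fewer temporary files are written.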
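
To get a rough feel for how large --maxRecordsInRam would need to be, here is a minimal sketch of the arithmetic. It assumes the sort writes one temporary file per batch of records held in RAM and that the merge keeps every temporary file open at once, which is what the stack trace suggests. The class and method names are hypothetical, the budget of 1000 open files is an arbitrary illustration, and the record count is the one reported in the log (the final total may be higher). None of this is jvarkit or htsjdk code.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;

// Hypothetical helper, not part of jvarkit or htsjdk.
public class SpillFileEstimate {

    /** Temporary files written by an external sort: ceil(records / maxRecordsInRam). */
    public static long spillFiles(long records, long maxRecordsInRam) {
        return (records + maxRecordsInRam - 1) / maxRecordsInRam;
    }

    /** Smallest maxRecordsInRam that keeps the number of temporary files at or below maxFiles. */
    public static long minRecordsInRam(long records, long maxFiles) {
        return (records + maxFiles - 1) / maxFiles;
    }

    /** Open-file limit of this JVM process, or -1 if the JVM does not expose it. */
    public static long openFileLimit() {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        if (os instanceof com.sun.management.UnixOperatingSystemMXBean) {
            return ((com.sun.management.UnixOperatingSystemMXBean) os).getMaxFileDescriptorCount();
        }
        return -1L;
    }

    public static void main(String[] args) {
        long records = 1_758_750_002L; // count reported in the log above
        long budget = 1_000L;          // assumed budget of simultaneously open files

        long needed = minRecordsInRam(records, budget);
        System.out.println("open-file limit of this JVM: " + openFileLimit());
        System.out.println("maxRecordsInRam needed for <= " + budget + " files: " + needed);
        System.out.println("temporary files at that setting: " + spillFiles(records, needed));
    }
}
```

Holding more records in RAM needs a correspondingly larger heap, which is why the -Xmx setting goes up together with --maxRecordsInRam. Where that much memory is not available, raising the per-process open-file limit (for example with ulimit -n on Linux) is another knob worth checking.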

Martingales commented 5 years ago

That was quick. I'll try extending mem and reads. Keep you updated.