marbl / canu

A single molecule sequence assembler for genomes large and small.
http://canu.readthedocs.io/

Failed to open meryl.WORKING/0x110101[025].merylData for reading: Too many open files #2311

Open arslan9732 opened 2 months ago

arslan9732 commented 2 months ago

Hi, I am using canu-v2.2 to assemble a plant genome on an SGE system with the following command:

canu  -p CANU_NG -d CANU_NG genomeSize=3.5g -nanopore ONT.genome.fastq.gz

It always fails at the meryl step. I looked at the last log file created by canu, CANU_NG/correction/0-mercounts/meryl-count.2.out, and found these lines at the end of the file:

Used 3.971 GB / 15.688 GB to store      2093551 kmers; need 9.500 GB to sort     15939060 kmers
Used 4.093 GB / 15.688 GB to store     48154041 kmers; need 9.503 GB to sort     15943966 kmers
Used 4.218 GB / 15.688 GB to store    194715218 kmers; need 9.503 GB to sort     15944117 kmers
Used 4.344 GB / 15.688 GB to store    332898334 kmers; need 9.503 GB to sort     15944117 kmers
Used 4.469 GB / 15.688 GB to store    437579594 kmers; need 9.503 GB to sort     15944117 kmers
Used 4.597 GB / 15.688 GB to store    531799917 kmers; need 9.503 GB to sort     15944117 kmers
Used 4.724 GB / 15.688 GB to store    621827635 kmers; need 9.503 GB to sort     15944117 kmers

Input complete.  Writing results to './CANU_NG.02.meryl.WORKING', using 40 threads.
finishIteration()--
finishIteration()--  Merging 27 blocks.
Failed to open './CANU_NG.02.meryl.WORKING/0x110101[025].merylData' for reading: Too many open files
Failed to open './CANU_NG.02.meryl.WORKING/0x010110[026].merylData' for reading: Too many open files
Failed to open './CANU_NG.02.meryl.WORKING/0x110011[026].merylData' for reading: Too many open files
Failed to open './CANU_NG.02.meryl.WORKING/0x100000[027].merylData' for reading: Too many open files
Failed to open './CANU_NG.02.meryl.WORKING/0x110110[025].merylData' for reading: Too many open files
Failed to open './CANU_NG.02.meryl.WORKING/0x110001[026].merylData' for reading: Too many open files

There are 6 meryl directories in the CANU_NG/correction/0-mercounts/ folder:

CANU_NG.01.meryl
CANU_NG.02.meryl.WORKING
CANU_NG.03.meryl
CANU_NG.04.meryl
CANU_NG.05.meryl
CANU_NG.06.meryl

The error occurs only for CANU_NG.02.meryl.WORKING; the others completed successfully. There are 3456 files in this folder:

0x000000[001].merylData
0x000000[001].merylIndex
0x000000[002].merylData
0x000000[002].merylIndex
0x000000[003].merylData
0x000000[003].merylIndex
0x000000[004].merylData
0x000000[004].merylIndex
0x000000[005].merylData
0x000000[005].merylIndex
0x000000[006].merylData
....

Could you please let me know how to solve this error? Thanks.

brianwalenz commented 1 month ago

Decreasing the number of threads (merylThreads=16, 8, or 32 are all good values) will reduce the number of files opened at the same time.
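"Too many open files" is the operating system reporting that the process has hit its per-process limit on open file descriptors. As a rough diagnostic sketch, the limit can be compared with the number of .merylData files waiting to be merged. The path below is taken from this issue and should be adjusted to your run; the limit inside an SGE job may also differ from the one in your login shell, so it is worth running this inside a job.

```shell
# Soft limit on open file descriptors for this shell and anything started from it.
OPEN_LIMIT=$(ulimit -n)

# Path from this issue; change it to the failing meryl directory of your run.
WORKDIR="CANU_NG/correction/0-mercounts/CANU_NG.02.meryl.WORKING"

# Count the .merylData files in that directory (0 if the directory does not exist).
DATA_FILES=$(find "$WORKDIR" -name '*.merylData' 2>/dev/null | wc -l | tr -d ' ')

echo "open-file limit: $OPEN_LIMIT, merylData files: $DATA_FILES"
```

This does not say how many files meryl actually holds open at once; it only gives a sense of how close the file count is to the limit.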

Increasing the memory limit (it looks like merylMemory=16 might be used here) will reduce the number of files generated.
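The file count reported above is consistent with this. As a worked check, assuming (from the file names in this issue, not from meryl documentation) that the 0x prefix is six binary digits and the bracketed number is the block index, the 27 blocks from the log give exactly the 3456 files that were counted:

```shell
# Assumptions inferred from the listing in this issue:
#   prefixes run from 0x000000 to 0x111111, i.e. six binary digits
#   the number in brackets is the block index; the log reports 27 blocks
PREFIXES=$((1 << 6))   # 64 prefixes
BLOCKS=27
FILES_PER_PAIR=2       # one .merylData plus one .merylIndex

TOTAL=$((PREFIXES * BLOCKS * FILES_PER_PAIR))
echo "$TOTAL"          # prints 3456
```

If that reading is right, the number of files scales with the number of blocks, so fewer blocks mean fewer files to open during the merge.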

I suggest merylThreads=16 merylMemory=40.
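Applied to the command from the original post, that would look roughly like the sketch below. The command is only assembled and printed here so it can be checked first; run the printed line (or drop the echo) to launch canu.

```shell
# Original command from this issue plus the two suggested options.
CANU_CMD="canu -p CANU_NG -d CANU_NG genomeSize=3.5g merylThreads=16 merylMemory=40 -nanopore ONT.genome.fastq.gz"

# Print the command for review instead of running it.
echo "$CANU_CMD"
```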

Apologies for the delay.