tlemane / kmtricks

modular k-mer count matrix and Bloom filter construction for large read collections
GNU Affero General Public License v3.0

kmtricks crash at merge #15

Closed MorillonLab closed 10 months ago

MorillonLab commented 2 years ago

Hello,

I installed kmtricks with conda and tried to run it on about 10000 fastq files stored on an external drive.

Here is the command line used:

kmtricks pipeline --file list_fastq_kmtricks --run-dir kmtricksDir --kmer-size 31 --hard-min 5 --mode kmer:count:bin --recurrence-min 10 -t 12

And here are the messages printed by kmtricks:

[2022-04-16 20:17:08.096] [info] Run with Kmer<32> - uint64_t implementation
[2022-04-16 20:17:08.320] [info] Compute configuration...
[2022-04-16 20:17:08.320] [info] 3504 samples found (10512 read files).
[2022-04-16 20:51:29.192] [info] Use 113 partitions.
[2022-04-16 20:51:29.287] [info] Compute minimizer repartition...
Compute SuperK [==================================================] [02d:11h:28m:38s]
Count partitions [==================================================] [02d:11h:28m:38s]
Merge partitions [> ] [00:00s]
terminate called after throwing an instance of 'std::runtime_error'
terminate called after throwing an instance of 'std::runtime_error'
terminate called recursively
[2022-04-19 08:34:53.972] [error] Killed after receive Aborted:SIGABRT(6) signal. Demangled backtrace dumped at ./kmtricks_backtrace.log. If the problem persists, please open an issue with the return of 'kmtricks infos' and the content of ./kmtricks_backtrace.log
what(): Unable to open /media/ugo/Transcend3/scRNAseq_kmer/EMTAB_9067/kmtricks/counts/partition_1/ERR4147809.kmer
what(): Unable to open /media/ugo/Transcend3/scRNAseq_kmer/EMTAB_9067/kmtricks/counts/partition_10/ERR4147809.kmer
[2022-04-19 08:34:53.990] [error] Killed after receive Aborted:SIGABRT(6) signal. Demangled backtrace dumped at ./kmtricks_backtrace.log. If the problem persists, please open an issue with the return of 'kmtricks infos' and the content of ./kmtricks_backtrace.log
[2022-04-19 08:34:53.990] [error] Killed after receive Aborted:SIGABRT(6) signal. Demangled backtrace dumped at ./kmtricks_backtrace.log. If the problem persists, please open an issue with the return of 'kmtricks infos' and the content of ./kmtricks_backtrace.log

I was not able to find the ./kmtricks_backtrace.log file. I checked the file /media/ugo/Transcend3/scRNAseq_kmer/EMTAB_9067/kmtricks/counts/partition_1/ERR4147809.kmer, and it exists.

Here is the output of kmtricks infos: kmtricks v1.2.1

Thanks for your help

Ugo

tlemane commented 2 years ago

Hello,

Sorry for this issue. It seems that the maximum number of simultaneously open files has been reached on your system. Since you have a large number of samples, kmtricks needs to open a lot of files during the merge.

You can try to increase this limit (see ulimit: https://docs.oracle.com/cd/E19683-01/816-0210/6m6nb7mo3/index.html) and/or reduce the number of threads.
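To check whether the limit is a plausible cause, the current descriptor limits can be compared with a rough estimate of what the merge needs. The sketch below assumes, without confirmation from the kmtricks documentation, that each merge thread opens one count file per sample for the partition it is processing; `estimate_open_files` is a hypothetical helper, and the numbers 3504 and 12 are the sample count and thread count from the report above.

```shell
#!/usr/bin/env bash
# Rough estimate of how many files the merge may hold open at once.
# ASSUMPTION (not taken from the kmtricks docs): every merge thread opens
# one count file per sample for the partition it is working on.
estimate_open_files() {
  local samples=$1 threads=$2
  echo $(( samples * threads ))
}

soft_limit=$(ulimit -Sn)   # current soft limit on open file descriptors
hard_limit=$(ulimit -Hn)   # ceiling the soft limit can be raised to without root
needed=$(estimate_open_files 3504 12)   # 3504 samples, -t 12

echo "soft limit: $soft_limit, hard limit: $hard_limit, estimated need: $needed"
```

Under that assumption the estimate is 3504 × 12 = 42048 open files. If the soft limit is lower but the hard limit is high enough, running `ulimit -Sn "$(ulimit -Hn)"` in the same shell before launching kmtricks raises the soft limit for that session; going beyond the hard limit requires a system-level change.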

To avoid recomputing the first steps during your tests, I suggest splitting the pipeline as follows:

1/ Count

kmtricks pipeline --file list_fastq_kmtricks --run-dir kmtricksDir --kmer-size 31 --hard-min 5 --mode kmer:count:bin --until count --cpr -t 12

2/ Merge

kmtricks merge --run-dir kmtricksDir --recurrence-min 10 --mode kmer:count:bin --cpr -t <nb_threads>

This way you can make several merge attempts without recounting the partitions.
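As a starting point for choosing `<nb_threads>`, here is a minimal sketch that derives a thread count from the current descriptor limit. It rests on the same unverified assumption that each merge thread keeps one file open per sample; `max_merge_threads` and the reserve of 64 descriptors are hypothetical choices for illustration, not kmtricks behavior.

```shell
#!/usr/bin/env bash
# Hypothetical helper: largest thread count whose estimated open-file need
# stays under a descriptor limit.
# ASSUMPTION: one open file per sample per merge thread, plus a small
# reserve for logs, stdin/stdout and similar.
max_merge_threads() {
  local limit=$1 samples=$2 reserve=${3:-64}
  local t=$(( (limit - reserve) / samples ))
  if (( t < 1 )); then t=1; fi   # never go below one thread
  echo "$t"
}

limit=$(ulimit -Sn)
# "unlimited" is not a number; substitute an arbitrary large stand-in value.
if [[ $limit == unlimited ]]; then limit=1048576; fi

threads=$(max_merge_threads "$limit" 3504)   # 3504 samples in this run
if (( threads > 12 )); then threads=12; fi   # cap at the 12 threads used for counting

# Print the command rather than run it, so it can be reviewed first.
echo "kmtricks merge --run-dir kmtricksDir --recurrence-min 10 --mode kmer:count:bin --cpr -t $threads"
```

If the helper returns 1 while the limit is still below the number of samples, reducing threads alone will not be enough under this assumption, and the limit itself has to be raised.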

I hope this helps.

Téo

MorillonLab commented 2 years ago

thanks a lot, i'll try it

Ugo

