refresh-bio / KMC

Fast and frugal disk based k-mer counter

[ISSUE] KMC stopped at stage1 #158

Open Sherry520 opened 3 years ago

Sherry520 commented 3 years ago
  1. Issue: when I run kmc, stage 1 progresses to 100% and then the run stalls.
  2. The command line is kmc -k55 -ci1 -m1000 -t120 -sr120 -fbam /data/map_res/trimmed-SK/P2.sorted.bam ./00-kmc-res/P2.sorted.bam ./kmc-tmp/
     I set such a large memory limit and thread count, but in fact it only used 23 GB and 1 thread.
  3. The log message is:

    Stage 1: 100%

  4. In the temp dir, all bin files have size 0.

I don't know how to deal with this.

janawold1 commented 3 years ago

I'm having a similar issue. I was able to get it to 100% of stage 1 by increasing the memory limit to 200 GB and using 32 threads. However, I think it still crashes, as nothing else is printed to the screen and the written files are empty.

marekkokot commented 3 years ago

Hi,

thanks for reporting this issue. Wow, 1TB of RAM, I have never tested KMC in such circumstances. Nevertheless, could you please share your input data so that I can try to reproduce this issue? Have you tried with a lower number of threads?

Sherry520 commented 3 years ago

@marekkokot Why is the data stored in RAM in stage 1? And when are the temp files used? There is another problem: the first time I ran the data it ran normally, but when I re-ran the same data, it only ran stage 1 and stopped at 100%. My network connection is poor, so I can't upload my data for you.

Sherry520 commented 3 years ago

I solved the problem. I ran the command line "kmc -k55 -ci1 -m50 -r -fbam /data/map_res/trimmed-SK/P2.sorted.bam ./00-kmc-res/P2.sorted.bam ./kmc-tmp/" and it works. But I can't use "-t -sr and -m" to control computing resources, or it stops. Without the limits, it uses all the available CPU resources, which is bad news for other people using the same Linux server.

marekkokot commented 3 years ago

@Sherry520 In the first stage, RAM is used for internal buffers. Those buffers are flushed when full. Their size is determined by the memory limit specified with -m, so it is possible that the data will be flushed only once all input data has been read (in fact, in such a case flushing to temp files does not make sense, but in most cases the total amount of memory is much smaller than 1TB; furthermore, there is also the -r parameter, which replaces disk files with RAM usage).
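To make the "flushed when full" behaviour concrete, here is a minimal sketch. It is not KMC's actual code: the class name `BinBuffer`, the helper `simulate`, and the byte-based limit are assumptions made purely for illustration. It shows why a very large limit can leave the temp files empty until all input has been read.

```python
from typing import List, Sequence, Tuple


class BinBuffer:
    """Toy in-memory buffer that is written out only when it would overflow.

    Hypothetical illustration only; KMC's real buffers are more involved.
    """

    def __init__(self, limit_bytes: int, sink: List[bytes]) -> None:
        self.limit_bytes = limit_bytes
        self.sink = sink  # stands in for a temp file on disk
        self.chunks: List[bytes] = []
        self.used = 0

    def add(self, chunk: bytes) -> None:
        # Flush first if this chunk would push the buffer past its limit.
        if self.chunks and self.used + len(chunk) > self.limit_bytes:
            self.flush()
        self.chunks.append(chunk)
        self.used += len(chunk)

    def flush(self) -> None:
        # Write the buffered data to the sink and reset the buffer.
        if self.chunks:
            self.sink.append(b"".join(self.chunks))
            self.chunks.clear()
            self.used = 0


def simulate(chunks: Sequence[bytes], limit_bytes: int) -> Tuple[int, int]:
    """Return (flushes while reading input, flushes after the final flush)."""
    sink: List[bytes] = []
    buf = BinBuffer(limit_bytes, sink)
    for chunk in chunks:
        buf.add(chunk)
    flushes_while_reading = len(sink)
    buf.flush()  # end of input: whatever is left gets written out
    return flushes_while_reading, len(sink)
```

With a small limit the sink receives data while the input is still being read; with a limit larger than the whole input, nothing reaches the sink until the final flush.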

If you cannot upload your data, maybe you know of other publicly available data that causes the issue?

My suspicion is that the bug is related to high values of -m and/or -t, so you should still be able to use -t by giving it relatively small values.

Of course, I would really like to fix the issue, but it may be hard without the input data causing it.

Sherry520 commented 3 years ago

Thanks for your reply. I will try other values to get the best parameters.

shokrof commented 2 years ago

My guess is that the problem is counting from BAMs rather than FASTQ with a high number of threads. I am experiencing this problem with "-t16 -m 16 -fbam"; KMC doesn't get stuck every time. I noticed that the OS just hangs the process for an unknown reason; it is not an infinite loop. I may have worked around the problem by using "-t4 -m 16 -sm".

I was using the same command with "-t16 -m 16 -fq" with no problem.

I hope this sheds some light on the possible bug. Also, I am wondering if you can suggest the largest value to use with -t.

I faced the problem with the BAM extracted from this file: ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/data_collections/HGDP/data/Uygur/HGDP01302/alignment/HGDP01302.alt_bwamem_GRCh38DH.20181023.Uygur.cram

marekkokot commented 2 years ago

Hi, thanks! It may be helpful. What reference file did you use to extract the BAM from this CRAM? It seems I used the wrong one, because samtools reports an MD5 mismatch.

shokrof commented 2 years ago

ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/technical/reference/GRCh38_reference_genome/GRCh38_full_analysis_set_plus_decoy_hla.fa

Philip-Murzynowski commented 1 year ago

Hello! I wanted to check if there have been any fixes for this issue. I currently use a quick workaround: time out after at most a couple of minutes, rerun with -t2 (which often works), and then finally rerun with -t1, which always succeeds. I use KMC 3.2.1.
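For anyone who wants to script this kind of timeout-and-retry workaround, a minimal sketch follows. The function names `run_with_fallback` and `kmc_cmd`, the file paths, and the -k/-ci/-m values are placeholders for illustration, not the exact setup used above; only the kmc flags already mentioned in this thread are used.

```python
import subprocess
from typing import Callable, Optional, Sequence


def run_with_fallback(
    build_cmd: Callable[[int], Sequence[str]],
    thread_counts: Sequence[int],
    timeout_s: float,
) -> Optional[int]:
    """Try each thread count in order; return the first one that succeeds.

    Returns None if every attempt times out or exits with a non-zero code.
    """
    for threads in thread_counts:
        try:
            result = subprocess.run(build_cmd(threads), timeout=timeout_s)
        except subprocess.TimeoutExpired:
            # subprocess.run kills the child before raising, so just move on.
            continue
        if result.returncode == 0:
            return threads
    return None


def kmc_cmd(threads: int) -> Sequence[str]:
    # Hypothetical input/output/tmp paths; adjust to your own data.
    return [
        "kmc", "-k55", "-ci1", "-m16", f"-t{threads}",
        "-fbam", "input.bam", "kmc-out", "kmc-tmp/",
    ]
```

A call such as `run_with_fallback(kmc_cmd, [16, 2, 1], timeout_s=120)` would mirror the sequence described above. Note that a hung run may leave partial files in the temp directory; whether they need to be cleared before retrying is not something this sketch addresses.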