ablab / spades

SPAdes Genome Assembler
http://ablab.github.io/spades/

Failed to determine erroneous kmer threshold. Threshold set to: 22 #460

Closed bioramg closed 4 years ago

bioramg commented 4 years ago

Hi, Thank you for developing SPAdes for hybrid assembly.

I tried to assemble Illumina paired-end reads together with Oxford Nanopore reads for a plant mitochondrial genome. Before assembling, I used BBMap to trim adapters and normalize the Illumina data. To trim adapters, I used:

bbmap$ ./bbduk.sh -Xmx1g in=read1.fastq in2=read2.fastq out=trim_read1.fastq out2=trim_read2.fastq ktrim=r k=23 mink=11 hdist=1 ref=resources/adapters.fa tbo tpe

To normalize:

$ ./bbnorm.sh in=trim_read1.fastq in2=trim_read2.fastq out=norm_read1.fastq out2=norm_read2.fastq target=100 min=5

Then I ran SPAdes on the normalized Illumina paired-end reads together with the Oxford Nanopore reads:

$ ./spades.py -k 99 --pe1-1 norm_read1.fastq --pe1-2 norm_read2.fastq --nanopore nano_read.fastq --careful --cov-cutoff 90 -o hybrid_assembly -m 188

When I run this command, I get the following message:

======= SPAdes pipeline finished WITH WARNINGS!

=== Error correction and assembling warnings:

I have attached the spades.log and warnings.log files for your reference. Could you please tell me what the problem might be and how to solve it?

spades.log warnings.log

Thank you. Raman. G

asl commented 4 years ago

The warning indicates uneven coverage of the input data. The most likely reason for this is the "normalization". Also, using just a single k-mer length might yield suboptimal results.

bioramg commented 4 years ago

Thank you for your kind reply. So, what steps should I take? I have already tried different k-mer lengths: I got results for k-mers 21 and 33, but the assembly failed from k-mer 55 onwards. Could you please suggest how to solve this issue?

asl commented 4 years ago

You may want to get rid of the normalization and start from the default set of k-mer lengths.

bioramg commented 4 years ago

If I run SPAdes without BBMap, it runs out of memory; I used BBMap to reduce memory usage. What is the default set of k-mer lengths?

asl commented 4 years ago

Just do not specify the list of k-mer lengths.
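A minimal sketch of what this advice amounts to, written as a small Python helper that assembles the `spades.py` argument list. The helper name `build_spades_command` and its parameters are hypothetical; the flags are only those already used in the command earlier in this thread (`--cov-cutoff` is left out for brevity), and the input file names are the trimmed, non-normalized reads from the BBDuk step. When no k-mer list is passed, no `-k` option is emitted, so SPAdes picks its default k-mer lengths.

```python
from typing import List, Optional


def build_spades_command(
    pe1_1: str,
    pe1_2: str,
    nanopore: str,
    out_dir: str,
    memory_gb: int,
    kmers: Optional[List[int]] = None,
) -> List[str]:
    """Build a spades.py argument list for a hybrid Illumina + Nanopore run.

    If `kmers` is None or empty, -k is omitted entirely, which leaves the
    choice of k-mer lengths to SPAdes.
    """
    cmd = [
        "./spades.py",
        "--pe1-1", pe1_1,
        "--pe1-2", pe1_2,
        "--nanopore", nanopore,
        "--careful",
        "-o", out_dir,
        "-m", str(memory_gb),
    ]
    if kmers:
        # -k expects a comma-separated list of k-mer lengths
        cmd += ["-k", ",".join(str(k) for k in kmers)]
    return cmd


# Default k-mer set: simply do not pass `kmers`.
default_cmd = build_spades_command(
    "trim_read1.fastq", "trim_read2.fastq", "nano_read.fastq",
    out_dir="hybrid_assembly", memory_gb=188,
)
print(" ".join(default_cmd))
```

The resulting list can be passed to `subprocess.run(default_cmd, check=True)` on a machine where `./spades.py` is available.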

bioramg commented 4 years ago

Thank you. I ran the program without specifying k-mer lengths, but it failed after k-mer 33.

asl commented 4 years ago

And what was the error message?

bioramg commented 4 years ago

=== Error correction and assembling warnings:

asl commented 4 years ago

So, it's not a failure. These are warnings indicating that the assembly results might be suboptimal.
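To tell warnings apart from real failures when reading a long spades.log, the lines can be grouped by their severity field. The following is a rough sketch only: the regular expression is inferred from the log lines quoted in this thread, not from any SPAdes specification, and the names `parse_log_line` and `split_by_level` are hypothetical. The two memory figures are captured as printed, without assuming what each one means.

```python
import re
from typing import Dict, List, Optional

# Layout assumed from the lines quoted in this thread:
#   <time> <mem1> / <mem2> <LEVEL> <component> (<source file> : <line>) <message>
LOG_LINE = re.compile(
    r"^(?P<time>\d+:\d{2}:\d{2}\.\d{3})\s+"
    r"(?P<mem1>\S+)\s*/\s*(?P<mem2>\S+)\s+"
    r"(?P<level>INFO|WARN|ERROR)\s+"
    r"(?P<component>.+?)\s+"
    r"\((?P<source>[\w.]+)\s*:\s*(?P<line>\d+)\)\s+"
    r"(?P<message>.*)$"
)


def parse_log_line(line: str) -> Optional[Dict[str, str]]:
    """Parse one log line into its fields, or return None if it does not match."""
    cleaned = line.strip().lstrip("•").strip()  # tolerate bullet-prefixed lines
    match = LOG_LINE.match(cleaned)
    return match.groupdict() if match else None


def split_by_level(lines: List[str]) -> Dict[str, List[str]]:
    """Group the messages of all parseable lines by severity level."""
    grouped: Dict[str, List[str]] = {"INFO": [], "WARN": [], "ERROR": []}
    for line in lines:
        record = parse_log_line(line)
        if record is not None:
            grouped[record["level"]].append(record["message"])
    return grouped


# Two lines quoted elsewhere in this thread.
sample = [
    "1:47:47.494 82G / 113G WARN General (kmer_coverage_model.cpp : 366) "
    "Failed to determine erroneous kmer threshold. Threshold set to: 57",
    "2:16:21.969 5G / 89G ERROR K-mer Counting (kmer_data.cpp : 348) "
    "The reads contain too many k-mers to fit into available memory. "
    "You need approx. 396.971GB of free RAM to assemble your dataset",
]
grouped = split_by_level(sample)
print(len(grouped["WARN"]), "warning(s),", len(grouped["ERROR"]), "error(s)")
```

A run whose log contains only WARN lines matches the "finished WITH WARNINGS" case above; ERROR lines are the ones worth reporting in full.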

bioramg commented 4 years ago

== Error == system call for: "['/home/pmslab/Desktop/Raman/bin/bin/spades-core', '/home/pmslab/Desktop/Raman/Convallaria_mt/keiskei_hybrid_assembly/K55/configs/config.info', '/home/pmslab/Desktop/Raman/Convallaria_mt/keiskei_hybrid_assembly/K55/configs/careful_mode.inf$

======= SPAdes pipeline finished abnormally and WITH WARNINGS!

asl commented 4 years ago

Please provide the full spades.log, as there is neither an error message nor other relevant information there.

bioramg commented 4 years ago

I have got this message. I have been struggling to assemble this genome for the past three months, and I don't know how to troubleshoot it. Thank you so much for your kind cooperation.

spades.log

asl commented 4 years ago

You're likely running out of RAM. Worth giving SPAdes 3.14 a try

bioramg commented 4 years ago

OK, thank you. Shall I run it without using BBMap?

asl commented 4 years ago

You may want to skip the normalization part, yes

bioramg commented 4 years ago

Thank you so much for your kind reply.

bioramg commented 4 years ago

Dear sir, as you suggested, I ran SPAdes after running the BBMap trimming command. My server has a maximum of 188 GB of RAM. How can I run this assembly?

I received an error message like this:

2:05:04.864 5G / 89G INFO General (kmer_index_builder.hpp : 150) Merging final buckets.
2:16:20.748 5G / 89G INFO K-mer Index Building (kmer_index_builder.hpp : 336) Index built. Total 4941279872 bytes occupied (3.70963 bits per kmer).
2:16:21.969 5G / 89G ERROR K-mer Counting (kmer_data.cpp : 348) The reads contain too many k-mers to fit into available memory. You need approx. 396.971GB of free RAM to assemble your dataset

== Error == system call for: "['/home/pmslab/Desktop/Raman/bin/SPAdes-3.14.0-Linux/bin/spades-hammer', '/home/pmslab/Desktop/Raman/Convallaria_mt/keis_hy_asm_3_2_2020/corrected/configs/config.info']" finished abnormally, OS return value: 255

asl commented 4 years ago

Well, you're really out of RAM. You may want to perform heavy quality trimming / normalization with the understanding that results might be suboptimal.
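As a rough illustration of the comparison being made here, the memory estimate can be pulled out of the error message and checked against what the machine has. This is a sketch under stated assumptions: the pattern matches only the exact wording of the message quoted in this thread, and the function names `required_ram_gb` and `fits_in_memory` are hypothetical.

```python
import re
from typing import Optional

# Matches the wording of the error quoted in this thread:
#   "You need approx. 396.971GB of free RAM to assemble your dataset"
RAM_NEEDED = re.compile(r"You need approx\.\s*([\d.]+)\s*GB of free RAM")


def required_ram_gb(log_text: str) -> Optional[float]:
    """Return the RAM estimate in GB from the log text, or None if not present."""
    match = RAM_NEEDED.search(log_text)
    return float(match.group(1)) if match else None


def fits_in_memory(log_text: str, available_gb: float) -> Optional[bool]:
    """Compare the estimate with the available RAM; None if no estimate was found."""
    needed = required_ram_gb(log_text)
    if needed is None:
        return None
    return needed <= available_gb


error_line = (
    "2:16:21.969 5G / 89G ERROR K-mer Counting (kmer_data.cpp : 348) "
    "The reads contain too many k-mers to fit into available memory. "
    "You need approx. 396.971GB of free RAM to assemble your dataset"
)
print(required_ram_gb(error_line), fits_in_memory(error_line, available_gb=188))
```

With the 188 GB server mentioned above, the estimate of roughly 397 GB is about twice what is available, which is why reducing the input (quality trimming or normalization) is suggested despite its effect on assembly quality.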

asl commented 2 years ago

@AkmalRana Please do not hijack old closed issues. In your case log clearly reads:

0:14:00.470 320M / 616M ERROR K-mer Counting (kmer_data.cpp : 351) The reads contain too many k-mers to fit into available memory. You need approx. 22.7224GB of free RAM to assemble your dataset

AkmalRana commented 2 years ago

@asl Sorry for this trouble.

asmlgkj commented 2 years ago

@asl Thanks a lot. I ran into the same error. My sample is a human ctDNA sample, and the error is === Error correction and assembling warnings:

asl commented 2 years ago

@asmlgkj As it was written above, please do not hijack old issues. Open a new one.

asmlgkj commented 2 years ago

@asl Sorry, I failed to understand the word "hijack".

luu520 commented 1 week ago

So, when I encounter the following warnings, can I ignore these warnings and proceed with the next step of analysis using the assembly results? Thanks for your help.

=== Error correction and assembling warnings:

  • 1:47:47.483 82G / 113G WARN General (kmer_coverage_model.cpp : 327) Valley value was estimated improperly, reset to 57
  • 1:47:47.494 82G / 113G WARN General (kmer_coverage_model.cpp : 366) Failed to determine erroneous kmer threshold. Threshold set to: 57
  • 2:34:50.652 81G / 112G WARN General (kmer_coverage_model.cpp : 327) Valley value was estimated improperly, reset to 16
  • 2:34:50.659 81G / 112G WARN General (kmer_coverage_model.cpp : 366) Failed to determine erroneous kmer threshold. Threshold set to: 16
  • 3:43:21.155 68G / 97G WARN General (kmer_coverage_model.cpp : 218) Too many erroneous kmers, the estimates might be unreliable
  • 3:43:39.815 68G / 97G WARN General (kmer_coverage_model.cpp : 327) Valley value was estimated improperly, reset to 8

======= Warnings saved to /home/pmslab/Desktop/Raman/Convallaria_mt/keiskei_hybrid_assembly/warnings.log