adigenova / wengan

An accurate and ultra-fast hybrid genome assembler
GNU Affero General Public License v3.0

core dump when using docker #51

Closed yipukangda closed 3 years ago

yipukangda commented 3 years ago

Hi, I ran Wengan via Docker with the following command:

docker run -it -v $PWD:/data adigenova/wengan:v0.2 \
 perl /wengan/wengan-v0.2-bin-Linux/wengan.pl \
 -x pacraw \
 -a M \
 -s /data/r1.fq.gz,/data/r2.fq.gz \
 -l /data/pacbio.fastq.gz \
 -p data/asm_wengan -t 20 -g 3000

I then got the following error in the file asm_wengan.fml.err:

LOG: Mapping mode =pacraw H=1 k=20 w=5 L=2000 l=250 q=40 m=150 c=65 r=300 t=20 o=/data/asm_wengan I=500,1000,2000,3000,4000,5000,6000,7000,8000,10000,15000,20000 s=1
Building contig index
[M::mm_idx_gen::1625562708.768*0.00] collected minimizers
[M::mm_idx_gen::1625562708.815*0.00] sorted minimizers
[M::mm_mapopt_update::1625562708.842*0.00] mid_occ = 7
[M::mm_idx_stat] kmer size: 20; skip: 5; is_hpc: 1; #seq: 126
[M::mm_idx_stat::1625562708.859*0.00] distinct minimizers: 1220735 (99.52% are singletons); average occurrences: 1.008; average spacing: 4.077
Index construction time: 0.901258 seconds for 126 target sequence(s)
Segmentation fault (core dumped)

Could you tell me what is going wrong?

Thanks.

adigenova commented 3 years ago

Hi, the error message is not very informative, but I have seen core dumps at this stage with corrupted FASTQ files. Can you check whether the long reads are in valid FASTQ or FASTA format?

Best, Alex
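
For readers who hit the same crash, the format check suggested above can be sketched in a few lines of Python. This is only an illustrative sketch, not part of Wengan: the function name `check_fastq` is hypothetical, and it assumes plain four-line FASTQ records (no wrapped sequences) in a file that is either uncompressed or gzip-compressed.

```python
import gzip


def check_fastq(path, max_problems=10):
    """Return (record_count, problems) for a plain or gzip-compressed FASTQ file.

    Assumes strict four-line records: header, sequence, '+' line, quality.
    """
    opener = gzip.open if path.endswith(".gz") else open
    problems = []
    count = 0
    try:
        with opener(path, "rt") as handle:
            while len(problems) < max_problems:
                header = handle.readline()
                if not header:
                    break  # clean end of file
                seq = handle.readline().rstrip("\n")
                plus = handle.readline()
                qual = handle.readline().rstrip("\n")
                count += 1
                if not header.startswith("@"):
                    problems.append(f"record {count}: header does not start with '@'")
                if not plus.startswith("+"):
                    problems.append(f"record {count}: separator does not start with '+'")
                if len(seq) != len(qual):
                    problems.append(f"record {count}: sequence/quality length mismatch")
    except (EOFError, OSError) as exc:
        # A truncated or corrupted gzip stream surfaces here.
        problems.append(f"read error after record {count}: {exc}")
    return count, problems
```

Calling `check_fastq("pacbio.fastq.gz")` on the host copy of the long-read file returns the number of records read and a list of problems. An empty list means the file passed these basic checks, which does not rule out other causes of the segmentation fault.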

yipukangda commented 3 years ago

@adigenova Hi, I have checked the format of the long-read and NGS FASTQ files with several tools and found no problems. The sequence files are from a microbial sample. Is Wengan only suitable for large genome assembly?

Thanks.

adigenova commented 3 years ago

Hi @yipukangda, Wengan can work with bacterial genomes. Can you share your data, or a subsample of it, so that I can reproduce the error on my side? These segfault errors are hard to reproduce. Another option is to try the WenganA mode and see how it goes.

Best, Alex
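
One possible way to produce a small subsample for sharing is to copy the first N records of each gzip-compressed FASTQ file. The sketch below is an assumption about how that might be done, not a Wengan tool: the function name `head_fastq` is hypothetical, and it assumes four-line records. Taking the same number of records from both paired-end files keeps the pairs in sync as long as the two files list reads in the same order.

```python
import gzip
from itertools import islice


def head_fastq(src, dst, n_records):
    """Copy the first n_records four-line FASTQ records from src to dst.

    Both files are gzip-compressed. Returns the number of complete
    records written.
    """
    lines_written = 0
    with gzip.open(src, "rt") as fin, gzip.open(dst, "wt") as fout:
        # Each record spans exactly four lines under the stated assumption.
        for line in islice(fin, 4 * n_records):
            fout.write(line)
            lines_written += 1
    return lines_written // 4
```

For example, `head_fastq("r1.fq.gz", "r1.sub.fq.gz", 50000)` followed by the same call for `r2.fq.gz` yields a matched pair of smaller files. A subsample taken this way is not random and may not trigger the same crash, so it is worth rerunning the failing command on it before sharing.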

adigenova commented 3 years ago

Feel free to reopen this issue if you can share some data to replicate the problem. Thanks!