maickrau / GraphAligner

MIT License

Core dump when building minimizer seeder from the graph #29

Open PhoebeWangintw opened 3 years ago

PhoebeWangintw commented 3 years ago

I tried to align whole-genome reads (hg19) simulated with pbsim using GraphAligner. However, the program crashed while building the minimizer seeder from the graph. This is the error message I got:

GraphAligner Branch master commit 66008ea18f9114ac6b51990bfb38d2eb8ccb9d13 2021-01-12 15:42:17 -0500
GraphAligner Branch master commit 66008ea18f9114ac6b51990bfb38d2eb8ccb9d13 2021-01-12 15:42:17 -0500
Load graph from whole_genome_graph_aligner.vg
Build alignment graph
196072668 original nodes
196072668 split nodes
14991502 ambiguous split nodes
196072482 edges
0 nodes with in-degree >= 2
Build minimizer seeder from the graph
GraphAligner: /root/miniconda3/envs/GraphAligner/include/sdsl/int_vector.hpp:1351: sdsl::int_vector<<anonymous> >::reference sdsl::int_vector<<anonymous> >::operator[](const size_type&) [with unsigned char t_width = 0; sdsl::int_vector<<anonymous> >::reference = sdsl::int_vector_reference<sdsl::int_vector<0> >; sdsl::int_vector<<anonymous> >::size_type = long unsigned int]: Assertion `idx < this->size()' failed.

I also found that the program crashed at line 462 in MinimizerSeeder.cpp.

buckets[thread].startPos[index] += 1;

I printed the value of "index" and found that it is extremely large, far beyond the size of the vector. So I suspect something is wrong with the lookup call at line 461.

size_t index = buckets[thread].locator->lookup(kmer);
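For anyone trying to narrow this down, a range check placed between the lookup and the increment would turn the sdsl assertion into a message that names the offending index and k-mer. The following is only a minimal sketch: checkedIncrement is a hypothetical helper that does not exist in GraphAligner, std::vector stands in for the sdsl::int_vector used in the real code, and the 64-bit k-mer type is an assumption.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper, not part of GraphAligner.
// std::vector is a stand-in for sdsl::int_vector; the k-mer type is assumed.
// Increments counts[index] only when index is in range; otherwise throws
// an exception that reports the index, the k-mer, and the vector size.
inline void checkedIncrement(std::vector<std::uint64_t>& counts,
                             std::size_t index,
                             std::uint64_t kmer)
{
    if (index >= counts.size())
    {
        throw std::out_of_range(
            "lookup returned index " + std::to_string(index) +
            " for k-mer " + std::to_string(kmer) +
            ", but the vector has only " + std::to_string(counts.size()) +
            " entries");
    }
    counts[index] += 1;
}
```

In the real code the call would replace the direct increment at line 462. Knowing which k-mer produces the out-of-range index might help tell whether the lookup is being queried with a key it was never built on.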

This is how I built the graph with vg, version v1.29.0-41-g9393db95c "Sospiro".

vg construct -r hg19.fa > whole_genome_graph_aligner.vg

And also how I ran GraphAligner.

GraphAligner -g whole_genome_graph_aligner.vg -f whole_genome_reads.fq -a whole_genome_graph_aligner.gam -t 16 -x vg

Is there something wrong with the commands that I used to run the program? Any help would be appreciated. Thank you!

maickrau commented 3 years ago

Hi, the commands for vg and GraphAligner look perfectly fine. The same issue has happened before on some machines. It seems to be related to the perfect hashing used in the minimizer index. There's no resolution yet, but you can try running on a different machine.

jmonlong commented 3 years ago

Hi, I'm having the same issue on different graphs (one made by minigraph and one by cactus).

When you say different machines, do you think different docker images could work? Or is it related to the host machine that runs the GraphAligner container?

FYI, I was using the biocontainers images when I got the error.

Thanks for your help

jmonlong commented 3 years ago

Just an update: I've tried different ways of making the docker container but am still getting the same bug.

I tried building GraphAligner from source, using images based on Ubuntu 18.04 and 20.04, and building the docker container on the same machine where I run it (instead of building locally and pushing to quay.io).

Anything else I could try?

subwaystation commented 3 years ago

Are there any updates here @maickrau ?

lingliao commented 1 year ago

I ran into the same issue. Any suggestions on how to solve the problem would be appreciated!

Mirkocoggi commented 1 year ago

I have the same problem. Is there any news?

maickrau commented 1 year ago

Has this occurred only when using docker or also without docker?

jzhang-dev commented 1 month ago

I got the same issue when using GraphAligner installed with conda (within a Docker container). Detailed message below:

GraphAligner bioconda 1.0.19-
GraphAligner bioconda 1.0.19-
Load graph from data/bcalm/HG002_MGISEQ/k=51/reference.dbg.gfa
Build alignment graph
Build minimizer seeder from the graph
GraphAligner: /xxx/.snakemake/conda/fadfd70e68957df3eb7f3b74a04d8124_/include/sdsl/int_vector.hpp:1351: sdsl::int_vector<<anonymous> >::reference sdsl::int_vector<<anonymous> >::operator[](const size_type&) [with unsigned char t_width = 0; reference = sdsl::int_vector_reference<sdsl::int_vector<0> >; size_type = long unsigned int]: Assertion `idx < this->size()' failed.

The command I used is:

GraphAligner \
          -g {input.gfa} \
          --corrected-out {output.fasta} \
          --corrected-clipped-out {output.clipped_fasta} \
          -a {output.gaf} \
          -f {input.reads} \
          -t 30 \
          -x dbg \
          -C 500000

This also seems related to #27.

Any help would be appreciated.