ComparativeGenomicsToolkit / cactus

Official home of genome aligner based upon notion of Cactus graphs

Long reads mapping - tutorial #1470

Open leone93 opened 2 weeks ago

leone93 commented 2 weeks ago

Hey guys, thanks for the software. I was following the tutorial for mapping long reads and calling SVs. The first part went fine, but vg pack crashed every time I ran it on the original .gbz file. When I instead used the uncompressed .gbz.gfa (the one created for mapping with GraphAligner), vg pack ran smoothly. I then tried the vg call step, once with the original .gbz and once with the .gbz.gfa, but in both cases vg crashes with the report below. The output of several threads is interleaved, and the same assertion failure is repeated for each of them:

Crash report for vg v1.59.0 "Casatico"

Stack trace (most recent call last) in thread 0x5fa6b:

0x1a Object "", at 0xffffffffffffffff, in

vg: /public/home/anovak/build/vg/include/sdsl/vlc_vector.hpp:172: sdsl::vlc_vector<t_coder, t_dens, t_width>::value_type sdsl::vlc_vector<t_coder, t_dens, t_width>::operator[](sdsl::vlc_vector<t_coder, t_dens, t_width>::size_type) const [with t_coder = sdsl::coder::elias_delta; unsigned int t_dens = 128; unsigned char t_width = 0; sdsl::vlc_vector<t_coder, t_dens, t_width>::value_type = long unsigned int; sdsl::vlc_vector<t_coder, t_dens, t_width>::size_type = long unsigned int]: Assertion `i < m_size' failed.

The command I used for this part is:

for i in "$FQDIR"/* ; do
    name=$(basename "$i" .fastq.gz)
    [ -f "$ALIGN_DIR"/"$name".magic16pangenome.pack ] || vg pack -x "$DATADIR"/pangenome/magic16_rice_pangenome/magic16_rice_pangenome.gbz.gfa -Q 0 -g "$name".magic16pangenome.gam -o "$name".magic16pangenome.pack
    while IFS= read -r reference; do
        [ -f "$ALIGN_DIR"/"$name"."$reference".call.vcf.gz ] || vg call "$DATADIR"/pangenome/magic16_rice_pangenome/magic16_rice_pangenome.gbz -r "$DATADIR"/pangenome/magic16_rice_pangenome/magic16_rice_pangenome.snarls -k "$name".magic16pangenome.pack -s "$name" -S "$reference" -az | bgzip >  "$name"."$reference".call.vcf.gz
    done < "$ALIGN_DIR"/reference.txt
done

Do you have any idea?

leone93 commented 2 weeks ago

Update: using the .gbz.gfa and recomputing the .snarls file from that same file seems to work.
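For anyone hitting the same assertion, here is a minimal sketch of what that workaround might look like: the snarls are recomputed from the same graph file that was given to vg pack, and vg call is then run on that same file. The file names, the sample name, and the reference name are placeholders, and the two helper functions are made up for illustration. The vg call flags are the ones from the command above, except that -z is left out on the assumption that it only applies to GBZ input. The script prints the commands by default and only runs them when DRY_RUN=0 is set.

```shell
#!/usr/bin/env bash
# Sketch only: paths and names below are placeholders, not the poster's real files.
# By default the commands are printed, not executed; set DRY_RUN=0 to run them.
DRY_RUN=${DRY_RUN:-1}

# Recompute the snarls from the same graph file that vg pack was run on.
recompute_snarls() {
    local graph=$1 snarls=$2
    if [ "$DRY_RUN" = "1" ]; then
        echo "vg snarls $graph > $snarls"
    else
        vg snarls "$graph" > "$snarls"
    fi
}

# Call variants against that same graph, snarls file, and pack.
call_sample() {
    local graph=$1 snarls=$2 pack=$3 sample=$4 reference=$5 out=$6
    if [ "$DRY_RUN" = "1" ]; then
        echo "vg call $graph -r $snarls -k $pack -s $sample -S $reference -a | bgzip > $out"
    else
        vg call "$graph" -r "$snarls" -k "$pack" -s "$sample" -S "$reference" -a | bgzip > "$out"
    fi
}

GRAPH=magic16_rice_pangenome.gbz.gfa          # same file given to vg pack -x
SNARLS=magic16_rice_pangenome.gbz.gfa.snarls  # hypothetical output name

recompute_snarls "$GRAPH" "$SNARLS"
call_sample "$GRAPH" "$SNARLS" sample1.pack sample1 REF sample1.REF.call.vcf.gz
```

The point of the sketch is only that the graph, the snarls, and the pack all come from one and the same file. Whether that mismatch is the actual cause of the crash has not been confirmed in this thread.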