Open hgibling opened 8 years ago
Can you share the xg index or input graph? I'll take a look.
On Fri, May 27, 2016, 21:36 Heather Gibling notifications@github.com wrote:
vg: path/to/vg/include/sdsl/int_vector.hpp:1396: sdsl::int_vector< >::const_reference sdsl::int_vector< >::operator const [with unsigned char t_width = 0u; sdsl::int_vector< >::const_reference = const long unsigned int; sdsl::int_vector< >::size_type = long unsigned int]: Assertion `idx < this->size()' failed.
I've gotten this error a couple of times. Most recently using vg find -n 1 -c 5 -x graph.xg: the graph itself isn't huge
nodes 154 edges 151 length 3360
and I've successfully run that command on a much larger graph. The other time was using vg msga, though I don't remember the exact command I used. I'm requesting plenty of memory when I submit the command on my cluster.
Thoughts on what might be going on?
— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/vgteam/vg/issues/364, or mute the thread https://github.com/notifications/unsubscribe/AAI4EWttHXh6ivFG6K_jO101riRSfbDsks5qF1XOgaJpZM4Io2Fg .
Sure, here are both.
I just noticed that the graph has lower case DNA bases. vg and other tools aren't handling them gracefully now. Do you have a version with upper case bases? I could also write a converter.
On Mon, May 30, 2016 at 4:54 PM Heather Gibling notifications@github.com wrote:
Sure, here are both.
prdm9-ABC.zip https://github.com/vgteam/vg/files/289836/prdm9-ABC.zip
I can convert it via GFA, one second and I'll see if this is the problem.
That wasn't the problem. The issue is that the graph has no node 1. It starts from 51.
vg view prdm9-ABC.vg | sed 'y/atgc/ATGC/' | vg view -Fv - >prdm9-ABC.vg.1
vg index -x prdm9-ABC.xg prdm9-ABC.vg.1
vg find -n 51 -c 5 -x prdm9-ABC.xg | vg view -
Gives me:
H VN:Z:1.0
S 51 TGTGGACAAGGTTTCAGTGTTA
P 51 A 1 + 22M
L 51 + 52 + 0M
S 52 AATCAGATGTTATTACACACCA
P 52 A 2 + 22M
L 52 + 53 + 0M
S 53 AAGGACACATACAGGGGAGAAG
P 53 A 3 + 22M
L 53 + 54 + 0M
S 54 CTCTACGTCTGCAGGGAGTGTG
P 54 A 4 + 22M
L 54 + 55 + 0M
S 55 GGCGGGGCTTTAGCTGGAAGTC
P 55 A 5 + 22M
L 55 + 56 + 0M
S 56 ACACCTCCTCATTCACCAGAGG
P 56 A 6 + 22M
So the bug here is that we should check if we're in bounds in the node space and if not throw an error that actually describes the problem.
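Until such a check exists inside vg, a rough version can be run outside it. The following is a minimal sketch, assuming the graph has been dumped to GFA text (as vg view does above) and that node IDs sit in the second column of S lines. The function name check_node_id and the message wording are hypothetical and not part of vg.

```shell
#!/bin/sh
# check_node_id ID GFA_FILE  (hypothetical helper, not a vg command)
# Scans S (segment) lines, records the smallest and largest node ID,
# and reports whether ID falls inside that range.
check_node_id() {
    awk -v id="$1" '
        $1 == "S" {
            n = $2 + 0
            if (!seen || n < min) min = n
            if (!seen || n > max) max = n
            seen = 1
        }
        END {
            if (!seen) { print "error: no nodes found in graph"; exit 2 }
            if (id + 0 < min || id + 0 > max) {
                printf "error: node %d is outside the graph id range [%d, %d]\n", id, min, max
                exit 1
            }
            printf "ok: node %d is within [%d, %d]\n", id, min, max
        }
    ' "$2"
}
```

For the graph in this thread, something like `vg view prdm9-ABC.vg > graph.gfa; check_node_id 1 graph.gfa` (graph.gfa being a placeholder file name) would report that node 1 is out of range instead of tripping an assertion. This only checks the range, so an ID inside a gap in the ID space would still pass.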
Ahh I forgot to convert the fasta to uppercase. Converting and then rebuilding the graph with msga results in the graph starting at node 1. Thanks for the help!
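For anyone hitting the same thing, one way to do the uppercase conversion is sketched below. It assumes a plain-text FASTA where header lines start with ">"; the helper name upper_fasta is made up for illustration.

```shell
#!/bin/sh
# upper_fasta: hypothetical helper; uppercases the sequence lines of a
# FASTA stream on stdin and leaves ">" header lines untouched.
upper_fasta() {
    awk '/^>/ { print; next } { print toupper($0) }'
}
```

Usage would be along the lines of `upper_fasta < input.fa > input.upper.fa`, with both file names as placeholders. Skipping the header lines keeps sequence names unchanged, which a blanket `tr a-z A-Z` would not.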
You can also use vg mod -c to compact the id space after whatever construction or import process you use.
I've been using vg ids -s. Is one recommended over the other?
Either one of those commands will work, but vg ids -s also reorders nodes by a topological sort.
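To make "compacting the id space" concrete, here is a rough illustration of the idea on GFA text. It is not how vg implements either command. It assumes the record layout shown in the vg view output earlier in this thread (node ID in column 2 of S and P lines, and in columns 2 and 4 of L lines), and it numbers nodes in order of first appearance rather than by topological sort. The name compact_ids is hypothetical.

```shell
#!/bin/sh
# compact_ids GFA_FILE  (hypothetical helper, not a vg command)
# Pass 1 assigns new IDs 1..N to nodes in order of first appearance.
# Pass 2 rewrites S, P, and L records to use the new IDs.
compact_ids() {
    awk '
        BEGIN { OFS = "\t" }
        NR == FNR { if ($1 == "S" && !($2 in map)) map[$2] = ++n; next }
        $1 == "S" || $1 == "P" { $2 = map[$2] }
        $1 == "L" { $2 = map[$2]; $4 = map[$4] }
        { print }
    ' "$1" "$1"
}
```

Applied to the graph above, nodes 51 through 56 would become 1 through 6. Edges that refer to a node with no S line would end up with an empty ID, so this is only suitable for inspecting small, self-contained graphs.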
@ekg have we changed xg to work when your graph doesn't start at 1? Or to complain in that case with something better?
Hi,
I followed the guide at https://github.com/vgteam/vg/wiki/working-with-a-whole-genome-variation-graph to construct GCSA and xg indices on a whole-genome graph.
The GCSA index was created successfully. I used the following commands for it:
for chr in $(seq 1 22; echo X; echo Y);
do
vg mod -t 32 -pl 16 -S -t 16 -e 4 $chr.vg >$chr.prune.vg
vg mod -t 32 -N $chr.vg >$chr.ref.vg
cat $chr.ref.vg $chr.prune.vg | vg view -v -D - 2>$chr.merge.err >$chr.smooth.vg
vg kmers -gBk 16 -H 1000000000 -T 1000000001 $chr.smooth.vg >$chr.graph
done
But I got the following error when trying to build the xg index:
vg/include/sdsl/int_vector.hpp:1360: sdsl::int_vector<<anonymous> >::reference sdsl::int_vector<<anonymous> >::operator[](const size_type&) [with unsigned char t_width = 0u; sdsl::int_vector<<anonymous> >::reference = sdsl::int_vector_reference<sdsl::int_vector<0u>>; sdsl::int_vector<<anonymous> >::size_type = long unsigned int]: Assertion `idx < this->size()' failed.
For the xg index I used the following command:
vg index -x wg.xg $(for i in $(seq 22; echo X; echo Y); do echo $i.vg; done)
Could you please correct my understanding and help me figure out what's going wrong here?