philipc / gkl-rs

Rust bindings for Intel Genomics Kernel Library (GKL)
Apache License 2.0
2 stars 1 forks source link

Smith-waterman aligner bug #12

Closed rhysnewell closed 2 years ago

rhysnewell commented 2 years ago

Hi Phil,

I've been getting this set of errors in Lorikeet when using gkl-rs to perform the smith-waterman alignment. It seems that ocassionally the alignment produced by gkl-rs will produce a CIGAR string that suggests that the read aligninment ends up extending past the reference. I don't have specifics nor a reproducible test case yet, but just wanted to flag it with you.

Here is an example error produced by Lorikeet when using gkl-rs:

[2022-02-07T04:10:04Z INFO  lorikeet] lorikeet version 0.6.2
[2022-02-07T04:10:04Z INFO  lorikeet_genome] Using min-covered-fraction 0%
[2022-02-07T04:10:04Z INFO  lorikeet_genome] Using min-read-aligned-percent 0%
[2022-02-07T04:10:04Z INFO  lorikeet_genome::utils::utils] Creating cache directory results/lorikeet/cryoconite/20220207/bam_files
[2022-02-07T04:10:04Z INFO  lorikeet_genome::utils::utils] Creating cache directory results/lorikeet/cryoconite/20220207/bam_files/short/
[2022-02-07T04:10:04Z INFO  lorikeet_genome::utils::utils] Not pre-generating minimap2 index
[2022-02-07T04:10:04Z WARN  lorikeet_genome::utils::utils] Not using reference index...
[2022-02-07T04:10:04Z INFO  lorikeet_genome::utils::utils] Creating cache directory results/lorikeet/cryoconite/20220207/bam_files/long/
[2022-02-07T04:10:04Z INFO  lorikeet_genome::utils::utils] Not pre-generating minimap2 index
[2022-02-07T04:10:04Z WARN  lorikeet_genome::utils::utils] Not using reference index...
[2022-02-07T04:10:05Z INFO  lorikeet_genome::processing::lorikeet_engine] Processing long reads...
[2022-02-07T04:11:30Z INFO  lorikeet_genome::processing::lorikeet_engine] Processing short reads...
thread '<unnamed>' panicked at 'Read goes past end of reference: rstart - 0, necessary length - 295, ref len - 275, cigar - [Ins(66), Match(10), Del(134), Match(7), Ins(29), Match(4), Ins(87), Match(3), Ins(2), Match(3), Ins(201), Match(6), Ins(51), Match(5), Ins(107), Match(5), Ins(6), Match(2), Ins(93), Match(3), Ins(83), Match(3), Ins(44), Match(4), Ins(81), Match(3), Ins(208), Match(47), Ins(15), Match(6), Del(1), Match(2), Ins(16), Match(3), Ins(7), Match(10), Del(1), Match(2), Del(4), Match(3), Del(1), Match(2), Ins(3), Match(5), Ins(2), Match(4), Ins(10), Match(2), Del(1), Match(2), Ins(23), Match(3), Ins(31), Match(4), Ins(25)], indel index - 54', src/reads/alignment_utils.rs:446:9
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
thread '<unnamed>' panicked at 'Read goes past end of reference: rstart - 0, necessary length - 286, ref len - 275, cigar - [Match(1), Del(1), Match(7), Ins(6), Match(3), Del(192), Match(2), Ins(48), Match(4), Ins(59), Match(3), Ins(67), Match(3), Ins(86), Match(3), Ins(73), Match(5), Ins(11), Match(3), Ins(212), Match(4), Ins(4), Match(4), Ins(354), Match(6), Ins(36), Match(3), Ins(35), Match(5), Ins(84), Match(2), Del(1), Match(4), Ins(7), Match(4), Ins(6), Del(26)], indel index - 36', src/reads/alignment_utils.rs:446:9
thread '<unnamed>' panicked at 'Read goes past end of reference: rstart - 0, necessary length - 295, ref len - 275, cigar - [Ins(193), Match(20), Ins(19), Match(4), Ins(12), Match(4), Ins(83), Match(5), Ins(35), Match(3), Ins(50), Match(5), Ins(3), Match(3), Ins(29), Match(3), Ins(9), Match(2), Ins(1), Match(1), Ins(45), Match(4), Ins(51), Match(3), Ins(2), Match(4), Ins(58), Match(3), Ins(35), Match(3), Ins(6), Match(4), Ins(141), Match(5), Ins(109), Match(157), Del(1), Match(1), Del(1), Match(4), Ins(2), Match(3), Ins(3), Match(3), Del(1), Match(4), Del(5), Match(1), Del(3), Match(4), Ins(2), Match(1), Del(3), Match(2), Del(2), Match(5), Del(14), Match(4), Ins(1)], indel index - 58', src/reads/alignment_utils.rs:446:9
thread '<unnamed>' panicked at 'Read goes past end of reference: rstart - 0, necessary length - 295, ref len - 275, cigar - [Ins(125), Match(55), Del(64), Ins(21), Match(5), Ins(13), Match(4), Ins(10), Match(3), Ins(22), Match(4), Del(1), Match(3), Ins(4), Match(1), Ins(70), Match(5), Ins(2), Match(8), Ins(102), Match(4), Ins(8), Match(3), Ins(30), Match(3), Ins(32), Match(4), Del(1), Match(2), Ins(77), Match(7), Ins(18), Match(2), Ins(9), Match(2), Ins(5), Match(2), Ins(9), Match(3), Ins(135), Match(4), Ins(8), Match(3), Ins(7), Match(5), Ins(6), Match(5), Ins(28), Match(3), Ins(9), Match(2), Ins(6), Match(2), Ins(10), Match(4), Ins(6), Match(6), Ins(19), Match(4), Ins(18), Match(4), Ins(4), Match(1), Ins(3), Match(6), Ins(1), Match(1), Ins(1), Match(2), Ins(3), Match(3), Ins(5), Match(2), Ins(7), Match(8), Ins(4), Match(3), Ins(59), Match(4), Ins(27), Match(4), Ins(14), Match(3), Ins(2), Match(2), Ins(24), Match(4), Ins(2), Match(2), Ins(9), Match(2), Ins(1), Match(3), Ins(26), Match(3), Ins(20), Match(3), Ins(28), Match(2), Ins(23), Match(3), Ins(11), Match(3), Ins(3), Match(3), Ins(144)], indel index - 105', src/reads/alignment_utils.rs:446:9
thread '<unnamed>' panicked at 'Read goes past end of reference: rstart - 0, necessary length - 306, ref len - 286, cigar - [Ins(121), Match(54), Del(72), Ins(44), Match(2), Ins(39), Match(3), Ins(10), Match(3), Del(4), Match(3), Ins(2), Match(4), Ins(12), Match(2), Ins(4), Match(1), Ins(29), Match(4), Ins(9), Match(4), Ins(7), Match(3), Ins(8), Match(4), Ins(11), Match(4), Ins(6), Match(4), Ins(46), Match(3), Ins(4), Match(3), Ins(39), Match(2), Ins(15), Match(4), Ins(7), Match(3), Ins(10), Match(3), Ins(25), Match(5), Ins(21), Match(1), Del(1), Match(6), Ins(1), Match(2), Ins(12), Match(3), Ins(5), Match(2), Ins(2), Match(3), Ins(33), Match(5), Ins(35), Match(3), Ins(11), Match(5), Ins(2), Match(2), Ins(23), Match(4), Ins(6), Match(12), Ins(40), Match(3), Ins(1), Match(2), Ins(27), Match(6), Ins(2), Match(1), Ins(1), Match(3), Ins(1), Match(2), Ins(8), Match(5), Ins(5), Match(4), Ins(4), Match(3), Ins(26), Match(5), Ins(2), Match(5), Ins(4), Match(3), Ins(14), Match(3), Ins(16), Match(5), Ins(34), Match(4), Ins(1), Match(4), Ins(4), Match(1), Ins(17), Match(4), Ins(130)], indel index - 103', src/reads/alignment_utils.rs:446:9
thread '<unnamed>' panicked at 'Never found start Some(29) or stop None given cigar [Match(69), Ins(16), Match(2), Ins(11), Match(4), Ins(39), Match(3), Ins(13), Match(2), Ins(25), Match(2), Ins(10), Match(3), Ins(3), Match(3), Ins(26), Match(5), Ins(81), Match(5), Ins(7), Match(2), Ins(84), Match(3), Ins(36), Match(2), Ins(70), Match(4), Ins(14), Match(2), Ins(65), Match(6), Ins(33), Match(2), Ins(10), Match(5), Ins(50), Match(5), Ins(82), Match(5), Ins(16), Match(4), Ins(47), Match(5), Ins(69), Match(8), Ins(52), Match(5), Ins(14), Match(3), Ins(1), Match(4), Ins(12), Match(4), Ins(68), Match(5), Ins(20), Match(5), Ins(6), Match(1), Ins(42), Match(5), Ins(11), Match(2), Ins(13), Match(2), Ins(128), Match(6), Ins(87)] ref start 29 ref end 225 offset 0 bases AATCGGAAGCAGTGGGAGATTCTAAAGCAGAGAAGAAACGGTTTGTTTCAACCGTTGAAAATGCTATCAGGGGTGATACATATGCAAGTATTCTAAATTCTTTCTAAGAATAAAACCAAGCATACTATTTTTAATTACGTACGACTAAAAAATATCGGACGATTATTTTTGCTCGTTTTTTATTAGCTTAAATTTTTTGGTTTGTTTAGCTTTATTTTGTTCCTCTCTTAGAATGCGATAAGCGAAAATATAATTTTCAGTCCTTACTTGTATTGAATTTTATCAAGCAGTCAAATAATCATCAAACAATCCACTGTCTATCGTTGTCACTTTCCAATTTCCCGCATTGGTTTTAGGGGCTTTGTAAAAAATTTAGTAAGGTGGATATAGTTGAATGTTCGAATTCCTACAAAAAACGAACTATATTAGAGAAAGACCATTGTCTTTTTTTAAAGCTTTTATCACAACCATTAAAAAATTAGCTATCAACACACTACAAATAAGGATTTTTATCAAGGTTTTCATTGTTCAAAAAAGATACTTAGTAGAAAATTTTAATTAACCTGTTTAAACAATAACTCTCTGCATCTGATTTCCATACAGCGCAGTTGTTCGGCTCTAAATTCAAATAAATTACAATAAACTCAAAAAGTTCTTTTTTTTTTATGCTCTCCGTCATAGTATTTTTATTTTTCTTAGTTTCAATTTATTTTCACACTTCCTTTGAATTTCAATAATTGCACTGATAAAACACCGCCGTGAATGTTTTCTGAATCAGACTATTTTGGGGTACTTCATAATCCTGCATTCTCTTTATTCGAGTTATAAACCTGTCTTGCTCTGTAAATGCCTGACTTGTAATCATTGTATCCTTTGTCAAAGCAGAAACGGTATGTTCATCACATTTTAGTTTTGTAAAATTACGATCGTTTGTTCTACCGGACTAAACCAAACTAAACTTGGAAACTTTTTTCATCCGGCATTTATAGTATGCGATTTTGACCTCCTCTTTTACTTCTTCTGTTAATAGCCTTTCTCTCAAGCAACACATTTTAGTATGTCTTTAAACAGGCTAGTAGTGCTGTCGCCAATTTTAACATTCTTCCCCAGCATCTTAGATCGACTGTCCGATGAAACATAACTATACTTTGAGGTAACTTATTATCAATTTCTTCTGAAAACTCTACTTTCTACACTTATTTGCGTCCGACAATCTGCTCTTTCCAGGAAGATGATTGATCAGTTTCCCTGACGTGAAATCAACGAAACTGATTTACTAAAAGTAGTACTCTCCCGCAAAAAGACTTTTAAAATCAACGCTTACCGTGACGTCAATCAAATCTTCGAGAAGTTGTGTAATGCCTATCCTACGGCGTTCGTTTCGGCAGTTTATCTTCCGCATTTAAATCAAATTTGGTTGGGAGCTTCTCCCGAAACCCTTGTTTCTCAAGATG', src/reads/alignment_utils.rs:825:13

This error disappears when I use my basic smith-waterman implementation. The annoying thing is that this is not picked up in the test cases, everything is green and good apparently. But yeah, seems like gkl-rs is adding either additional Matches or Deletions since the reference ends up needing to be longer for the alignment to make sense.

Cheers, Rhys

philipc commented 2 years ago

Which OverhangStrategy is this using? I know that OverhangStrategy::Ignore can return a longer alignment (I think this is the relevant code). Also, related to that, the return offset in Align should be isize, not usize, but I think you're already handling that.

If you can get a test case I'm happy to look into it.

rhysnewell commented 2 years ago

Okay, I don't think this is a problem with the actual algorithm implementation at all. I think this might be a C memory issue. The problem is only reproducible when you are performing multiple alignments in parallel. See https://github.com/rhysnewell/Lorikeet/blob/master/tests/smith_waterman_aligner_unit_tests.rs#L406

This test incorporates several test alignments using the basic smith waterman aligner and the gkl implementation across a number of alignment parameters and overhang strategies. I noticed that when running these alignments without using rayon, the tests never failed. The results would always be the same between the two alignment methods. As soon as I wrapped the whole thing in a rayon parallel iterator the started failing very consistently, with the gkl aligner giving completely different alignments than the standard implementation:

cargo test --release --test smith_waterman_aligner_unit_tests

test test_avx_mode ... FAILED

failures:

---- test_avx_mode stdout ----
thread '<unnamed>' panicked at 'assertion failed: `(left == right)`
  left: `Some([Match(18), Ins(879), Match(398)])`,
 right: `Some([Del(6), Ins(427), Match(9), Ins(89), Match(4), Ins(64), Match(3), Ins(15), Match(5), Ins(283), Match(396)])`: Alignments are not equal:
 Some([Match(18), Ins(879), Match(398)])
 Some([Del(6), Ins(427), Match(9), Ins(89), Match(4), Ins(64), Match(3), Ins(15), Match(5), Ins(283), Match(396)])', tests/smith_waterman_aligner_unit_tests.rs:446:13
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

failures:
    test_avx_mode

Additionally, sometimes the test would just hang indefinitely suggesting that there is a sporadic deadlock somewhere too.

I had no idea that using rayon would cause an issue like this, but I suppose it makes sense? I'm not sure how you would hunt down a fix for this. Hopefully the stable rust implementation will fix this bug. Pretty interesting interaction though, maybe play around with the test and see if you see the same thing. Definitely fails very consistently for me when in parallel, but safe single-threaded

philipc commented 2 years ago

That might be the fault of this global variable.

philipc commented 2 years ago

13 should fix it. I can do a patch release of that if you want.

rhysnewell commented 2 years ago

That would be great, thank you!

philipc commented 2 years ago

Published 0.1.1