amplab / snap

Scalable Nucleotide Alignment Program -- a fast and accurate read aligner for high-throughput sequencing data
https://www.microsoft.com/en-us/research/project/snap/
Apache License 2.0
287 stars 66 forks source link

Inconsistent BAM output from identical inputs #72

Open blahah opened 8 years ago

blahah commented 8 years ago

I noticed that on some of the test data we use for transrate, I occasionally see a run where the number of alignments in the SNAP output BAM file is much lower than what it should be.

The input dataset can be found here, and the two sets of results are linked below:

The version of SNAP used was v1.0dev.96

commands used

the commands were identical between runs

./snap-aligner index example_data/one/assembly.fa assembly -s 23 -t8 -bSpace -locationSize 4
./snap-aligner paired assembly example_data/left.fq example_data/right.fq \
  -o left.fq.right.fq.assembly.bam \
  -s 0 1000 \
  -H 300000 -h 2000 \
  -d 30 -t 8 -b -M -D 5 \
  -om 5 -omax 10

It seems as though the alignment process completed very similarly in both runs, because the logged output was identical except for the speed:

run 1 (logs)

Loading index from directory... 0s.  36562 bases, seed size 23
Aligning.
Total Reads    Aligned, MAPQ >= 10    Aligned, MAPQ < 10     Unaligned              Too Short/Too Many Ns  Extra Alignments  %Pairs    Reads/s   Time in Aligner (s)
20,000         13,874 (69.37%)        6,126 (30.63%)         0 (0.00%)              0 (0.00%)              4,357             100.00%   173,913   0
Welcome to SNAP version 1.0dev.96.

run 2 (logs)

Loading index from directory... 0s.  36562 bases, seed size 23
Aligning.
Total Reads    Aligned, MAPQ >= 10    Aligned, MAPQ < 10     Unaligned              Too Short/Too Many Ns  Extra Alignments  %Pairs    Reads/s   Time in Aligner (s)
20,000         13,874 (69.37%)        6,126 (30.63%)         0 (0.00%)              0 (0.00%)              4,357             100.00%   160,000   0
Welcome to SNAP version 1.0dev.96.

However, comparing the BAMfiles written out, we can see that run1 failed to report a large proportion of the alignments:

run1 (bamtools stats)

Total reads:       11800
Mapped reads:      11800    (100%)
Forward strand:    5900 (50%)
Reverse strand:    5900 (50%)
Failed QC:         0    (0%)
Duplicates:        0    (0%)
Paired-end reads:  11800    (100%)
'Proper-pairs':    11800    (100%)
Both pairs mapped: 11800    (100%)
Read 1:            5900
Read 2:            5900
Singletons:        0    (0%)

run2 (bamtools stats)

Total reads:       28714
Mapped reads:      28714    (100%)
Forward strand:    14357    (50%)
Reverse strand:    14357    (50%)
Failed QC:         0    (0%)
Duplicates:        0    (0%)
Paired-end reads:  28714    (100%)
'Proper-pairs':    28714    (100%)
Both pairs mapped: 28714    (100%)
Read 1:            14357
Read 2:            14357
Singletons:        0    (0%)

Both commands had an exit status of 0, so SNAP didn't throw any error.

I would be very grateful for your help in figuring out what's happening - at the moment this bug means that our software that depends on SNAP produces results that are not reproducible, so it's a priority for me.

macmanes commented 8 years ago

I have observed a similar issue as well. Any advice greatly appreciated.

bolosky commented 8 years ago

Sorry not to reply earlier, my spam filter decided the snap mailing list was something I shouldn't see.

This seems very bad, and not something I have seen before. I wonder if it is a Linux only problem, nearly all the dev work is done on widows. It's not even writing as many reads as are in the input file, so maybe somehow the process is exciting before all of the buffers are flushed; it is a very small input file.

Anyway, I'm on vacation now, so I probably can't get to it until next week. Maybe Ravi has some time earlier.

Sent from my Windows Phone


From: Richard Smith-Unnamailto:notifications@github.com Sent: ‎4/‎4/‎2016 10:12 AM To: amplab/snapmailto:snap@noreply.github.com Subject: [amplab/snap] Inconsistent BAM output from identical inputs (#72)

I noticed that on some of the test data we use for transrate, I occasionally see a run where the number of alignments in the SNAP output BAM file is much lower than what it should be.

The input dataset can be found herehttps://na01.safelinks.protection.outlook.com/?url=https%3a%2f%2fbintray.com%2fartifact%2fdownload%2fblahah%2fgeneric%2fexample_data.tar.gz&data=01%7c01%7cbolosky%40microsoft.com%7c79bae9ad407449c3ce8308d35cc560e6%7c72f988bf86f141af91ab2d7cd011db47%7c1&sdata=xGjmpZVyyYgmibjhyYGB5ZDBB5WAbv3H67p5TjtOHfg%3d, and the two sets of results are linked below:

The version of SNAP used was v1.0dev.96

commands used

the commands were identical between runs

./snap-aligner index example_data/one/assembly.fa assembly -s 23 -t8 -bSpace -locationSize 4 ./snap-aligner paired assembly example_data/left.fq example_data/right.fq \ -o left.fq.right.fq.assembly.bam \ -s 0 1000 \ -H 300000 -h 2000 \ -d 30 -t 8 -b -M -D 5 \ -om 5 -omax 10

It seems as though the alignment process completed very similarly in both runs, because the logged output was identical except for the speed:

run 1 (logs)

Loading index from directory... 0s. 36562 bases, seed size 23 Aligning. Total Reads Aligned, MAPQ >= 10 Aligned, MAPQ < 10 Unaligned Too Short/Too Many Ns Extra Alignments %Pairs Reads/s Time in Aligner (s) 20,000 13,874 (69.37%) 6,126 (30.63%) 0 (0.00%) 0 (0.00%) 4,357 100.00% 173,913 0 Welcome to SNAP version 1.0dev.96.

run 2 (logs)

Loading index from directory... 0s. 36562 bases, seed size 23 Aligning. Total Reads Aligned, MAPQ >= 10 Aligned, MAPQ < 10 Unaligned Too Short/Too Many Ns Extra Alignments %Pairs Reads/s Time in Aligner (s) 20,000 13,874 (69.37%) 6,126 (30.63%) 0 (0.00%) 0 (0.00%) 4,357 100.00% 160,000 0 Welcome to SNAP version 1.0dev.96.

However, comparing the BAMfiles written out, we can see that run1 failed to report a large proportion of the alignments:

run1 (bamtools stats)

Total reads: 11800 Mapped reads: 11800 (100%) Forward strand: 5900 (50%) Reverse strand: 5900 (50%) Failed QC: 0 (0%) Duplicates: 0 (0%) Paired-end reads: 11800 (100%) 'Proper-pairs': 11800 (100%) Both pairs mapped: 11800 (100%) Read 1: 5900 Read 2: 5900 Singletons: 0 (0%)

run2 (bamtools stats)

Total reads: 28714 Mapped reads: 28714 (100%) Forward strand: 14357 (50%) Reverse strand: 14357 (50%) Failed QC: 0 (0%) Duplicates: 0 (0%) Paired-end reads: 28714 (100%) 'Proper-pairs': 28714 (100%) Both pairs mapped: 28714 (100%) Read 1: 14357 Read 2: 14357 Singletons: 0 (0%)

Both commands had an exit status of 0, so SNAP didn't throw any error.

I would be very grateful for your help in figuring out what's happening - at the moment this bug means that our software that depends on SNAP produces results that are not reproducible, so it's a priority for me.

— You are receiving this because you are subscribed to this thread. Reply to this email directly or view it on GitHubhttps://github.com/amplab/snap/issues/72

bolosky commented 8 years ago

Even though I’m on vacation, I started looking at this, and the dataset you provide doesn’t have the same fasta as the example you give. It has transcripts.fa rather than one/assembly.fa.

My guess is that this doesn’t matter in the least, since it’s probably unrelated to the actual alignment, but in case you think it might be, please let me know.

--Bill

From: Richard Smith-Unna [mailto:notifications@github.com] Sent: Monday, April 4, 2016 10:12 AM To: amplab/snap snap@noreply.github.com Subject: [amplab/snap] Inconsistent BAM output from identical inputs (#72)

I noticed that on some of the test data we use for transrate, I occasionally see a run where the number of alignments in the SNAP output BAM file is much lower than what it should be.

The input dataset can be found herehttps://na01.safelinks.protection.outlook.com/?url=https%3a%2f%2fbintray.com%2fartifact%2fdownload%2fblahah%2fgeneric%2fexample_data.tar.gz&data=01%7c01%7cbolosky%40microsoft.com%7c79bae9ad407449c3ce8308d35cc560e6%7c72f988bf86f141af91ab2d7cd011db47%7c1&sdata=xGjmpZVyyYgmibjhyYGB5ZDBB5WAbv3H67p5TjtOHfg%3d, and the two sets of results are linked below:

The version of SNAP used was v1.0dev.96

commands used

the commands were identical between runs

./snap-aligner index example_data/one/assembly.fa assembly -s 23 -t8 -bSpace -locationSize 4

./snap-aligner paired assembly example_data/left.fq example_data/right.fq \

-o left.fq.right.fq.assembly.bam \

-s 0 1000 \

-H 300000 -h 2000 \

-d 30 -t 8 -b -M -D 5 \

-om 5 -omax 10

It seems as though the alignment process completed very similarly in both runs, because the logged output was identical except for the speed:

run 1 (logs)

Loading index from directory... 0s. 36562 bases, seed size 23

Aligning.

Total Reads Aligned, MAPQ >= 10 Aligned, MAPQ < 10 Unaligned Too Short/Too Many Ns Extra Alignments %Pairs Reads/s Time in Aligner (s)

20,000 13,874 (69.37%) 6,126 (30.63%) 0 (0.00%) 0 (0.00%) 4,357 100.00% 173,913 0

Welcome to SNAP version 1.0dev.96.

run 2 (logs)

Loading index from directory... 0s. 36562 bases, seed size 23

Aligning.

Total Reads Aligned, MAPQ >= 10 Aligned, MAPQ < 10 Unaligned Too Short/Too Many Ns Extra Alignments %Pairs Reads/s Time in Aligner (s)

20,000 13,874 (69.37%) 6,126 (30.63%) 0 (0.00%) 0 (0.00%) 4,357 100.00% 160,000 0

Welcome to SNAP version 1.0dev.96.

However, comparing the BAMfiles written out, we can see that run1 failed to report a large proportion of the alignments:

run1 (bamtools stats)

Total reads: 11800

Mapped reads: 11800 (100%)

Forward strand: 5900 (50%)

Reverse strand: 5900 (50%)

Failed QC: 0 (0%)

Duplicates: 0 (0%)

Paired-end reads: 11800 (100%)

'Proper-pairs': 11800 (100%)

Both pairs mapped: 11800 (100%)

Read 1: 5900

Read 2: 5900

Singletons: 0 (0%)

run2 (bamtools stats)

Total reads: 28714

Mapped reads: 28714 (100%)

Forward strand: 14357 (50%)

Reverse strand: 14357 (50%)

Failed QC: 0 (0%)

Duplicates: 0 (0%)

Paired-end reads: 28714 (100%)

'Proper-pairs': 28714 (100%)

Both pairs mapped: 28714 (100%)

Read 1: 14357

Read 2: 14357

Singletons: 0 (0%)

Both commands had an exit status of 0, so SNAP didn't throw any error.

I would be very grateful for your help in figuring out what's happening - at the moment this bug means that our software that depends on SNAP produces results that are not reproducible, so it's a priority for me.

— You are receiving this because you are subscribed to this thread. Reply to this email directly or view it on GitHubhttps://github.com/amplab/snap/issues/72

bolosky commented 8 years ago

I couldn’t get this to repro either in Windows or Linux.

What OS version are you using, and how often do you see the truncated output? Maybe I just haven’t tried it enough times.

--Bill

From: Richard Smith-Unna [mailto:notifications@github.com] Sent: Monday, April 4, 2016 10:12 AM To: amplab/snap snap@noreply.github.com Subject: [amplab/snap] Inconsistent BAM output from identical inputs (#72)

I noticed that on some of the test data we use for transrate, I occasionally see a run where the number of alignments in the SNAP output BAM file is much lower than what it should be.

The input dataset can be found herehttps://na01.safelinks.protection.outlook.com/?url=https%3a%2f%2fbintray.com%2fartifact%2fdownload%2fblahah%2fgeneric%2fexample_data.tar.gz&data=01%7c01%7cbolosky%40microsoft.com%7c79bae9ad407449c3ce8308d35cc560e6%7c72f988bf86f141af91ab2d7cd011db47%7c1&sdata=xGjmpZVyyYgmibjhyYGB5ZDBB5WAbv3H67p5TjtOHfg%3d, and the two sets of results are linked below:

The version of SNAP used was v1.0dev.96

commands used

the commands were identical between runs

./snap-aligner index example_data/one/assembly.fa assembly -s 23 -t8 -bSpace -locationSize 4

./snap-aligner paired assembly example_data/left.fq example_data/right.fq \

-o left.fq.right.fq.assembly.bam \

-s 0 1000 \

-H 300000 -h 2000 \

-d 30 -t 8 -b -M -D 5 \

-om 5 -omax 10

It seems as though the alignment process completed very similarly in both runs, because the logged output was identical except for the speed:

run 1 (logs)

Loading index from directory... 0s. 36562 bases, seed size 23

Aligning.

Total Reads Aligned, MAPQ >= 10 Aligned, MAPQ < 10 Unaligned Too Short/Too Many Ns Extra Alignments %Pairs Reads/s Time in Aligner (s)

20,000 13,874 (69.37%) 6,126 (30.63%) 0 (0.00%) 0 (0.00%) 4,357 100.00% 173,913 0

Welcome to SNAP version 1.0dev.96.

run 2 (logs)

Loading index from directory... 0s. 36562 bases, seed size 23

Aligning.

Total Reads Aligned, MAPQ >= 10 Aligned, MAPQ < 10 Unaligned Too Short/Too Many Ns Extra Alignments %Pairs Reads/s Time in Aligner (s)

20,000 13,874 (69.37%) 6,126 (30.63%) 0 (0.00%) 0 (0.00%) 4,357 100.00% 160,000 0

Welcome to SNAP version 1.0dev.96.

However, comparing the BAMfiles written out, we can see that run1 failed to report a large proportion of the alignments:

run1 (bamtools stats)

Total reads: 11800

Mapped reads: 11800 (100%)

Forward strand: 5900 (50%)

Reverse strand: 5900 (50%)

Failed QC: 0 (0%)

Duplicates: 0 (0%)

Paired-end reads: 11800 (100%)

'Proper-pairs': 11800 (100%)

Both pairs mapped: 11800 (100%)

Read 1: 5900

Read 2: 5900

Singletons: 0 (0%)

run2 (bamtools stats)

Total reads: 28714

Mapped reads: 28714 (100%)

Forward strand: 14357 (50%)

Reverse strand: 14357 (50%)

Failed QC: 0 (0%)

Duplicates: 0 (0%)

Paired-end reads: 28714 (100%)

'Proper-pairs': 28714 (100%)

Both pairs mapped: 28714 (100%)

Read 1: 14357

Read 2: 14357

Singletons: 0 (0%)

Both commands had an exit status of 0, so SNAP didn't throw any error.

I would be very grateful for your help in figuring out what's happening - at the moment this bug means that our software that depends on SNAP produces results that are not reproducible, so it's a priority for me.

— You are receiving this because you are subscribed to this thread. Reply to this email directly or view it on GitHubhttps://github.com/amplab/snap/issues/72

blahah commented 8 years ago

hi @bolosky sorry that I missed the notifications on this, and thanks very much for taking time out during your holiday to look at it!

I've run some tests and I think the problem is specific to OSX - I can't reproduce it on Linux.

The following script helps see the issue. It requires bamtools v1.0 or greater

#!/bin/bash
for i in `seq 1 20`;
do
  rm -rf assembly *bam
  snap-aligner index example_data/transcripts.fa assembly \
    -s 23 -t8 -bSpace -locationSize 4
  snap-aligner paired assembly example_data/left.fq example_data/right.fq \
    -o left.fq.right.fq.assembly.bam \
    -s 0 1000 -H 300000 -h 2000 -d 30 \
    -t 8 -b -M -D 5 -om 5 -omax 10
  bamtools stats -in left.fq.right.fq.assembly.bam
done

saving it as test_snap.sh and running it as follows in the parent directory of example_data extracted from the link above...

./test_snap.sh 2>/dev/null | tee test_snap.log | grep 'Total reads' | sort | uniq -c

on OSX I get:

   1 Total reads:       11788
  17 Total reads:       28714
   1 Total reads:       2954
   1 Total reads:       5840

whereas on Linux I get:

     20 Total reads:       28714

Because most of our users are on Linux this is not as urgent as previously thought, but it would be really nice to figure it out. I'm guessing the none of the SNAP team are on OSX as there haven't been OSX builds for the recent release. Any ideas how I can debug this?