Confirmed non-determinism in the DRAGMAP aligner output

droazen commented 2 years ago

The pipelines team here at the Broad Institute has confirmed non-determinism in the DRAGMAP output -- here is their report on the issue:

Description

When the same unmapped bam is run through the Dragmap aligner twice, the resulting aligned bams do not always match.

This was discovered when running the DRAGEN-GATK whole genome germline single sample pipeline. In order to confirm that the Dragmap aligner was producing different results, it was isolated run twice on the same input, then the outputs were compared. This was repeated 20 times. The single sample used (NA12878) has 24 unmapped bams which run through the Dragmap aligner individually. This was a total of 480 comparisons (24 unmapped bams aligned twice and outputs compared (later referred to as shards), all repeated 20 times (later referred to as runs)). Of the 480 comparisons, 47 resulted in differences. These differences were not consistent across runs; that is, for the 24 shards for a single run, sometimes 1 would fail, sometimes 3 would fail (or 2, or 4) and which specific shards failed also varied. What was consistent was that in every run at least 1 of the 24 shards failed.

Steps to reproduce

Run Dragmap aligner on an input bam twice and compare the outputs. This may need to be repeated several times since the differences are not consistent across runs (most of the time the alignment produces identical results, but sometimes it produces different results).

The WDL used for the experiment described above can be found here: https://github.com/broadinstitute/warp/blob/jw_test_Dragmap_alignment/scratch/DragmapAlign.wdl

And the actual Dragmap command line used is in this WDL:

https://github.com/broadinstitute/warp/blob/jw_test_Dragmap_alignment/tasks/broad/DragmapAlignment.wdl

Actual behavior

Approximately 10% of the time that Dragmap is run on the same input twice, the outputs are not identical.

Expected behavior

Each time you run the same input through Dragmap, it produces the same output.

Supporting files and details

Of the 47 failing comparisons mentioned above, we looked closer at one. When these output bams are compared using the Picard tool CompareSams the result is as follows:

We can arrange to send you the actual differing output bams from this test if it would be helpful in diagnosing the problem.

rpetrovski commented 2 years ago

I'm having trouble finding out what dragmap command line was used from the above. One of the likely ways to get non-deterministic results is to not specify --preserve-map-align-order true As the default is false, the blocks of aligned data will not be ordered in the original sequence resulting the trivial order variation as well as potential differences in the insert size statistics collection for paired data.

rizkg commented 2 years ago

@rpetrovski , from what I get, the picard tool compareSams does not care about output order and compares actual alignments. So problem may be deeper than that. Can --preserve-map-align-order true change something else than output order ? I am getting files and command lines to reproduce from separate thread.

droazen commented 2 years ago

@rpetrovski The Dragmap command is on line 64 of https://github.com/broadinstitute/warp/blob/jw_test_Dragmap_alignment/tasks/broad/DragmapAlignment.wdl:

dragen-os -b ~{input_bam} -r dragen_reference --interleaved=1 2> >(tee ~{output_bam_basename}.dragmap.stderr.log >&2) | samtools view -h -O BAM - > aligned.bam

The differences are more than just trivial differences in ordering -- we first noticed this because the downstream variant calls changed.

droazen commented 2 years ago

(I'll add that among other things, the total number of aligned reads was found to differ across runs)

droazen commented 2 years ago

Here's a selection from the alignment summary metrics of a particular failing run showing the kinds of differences we're seeing:

rizkg commented 2 years ago

Hi, Our first assumption is that when not using --preserve-map-align-order true , then the insert sizes computation that is done at the beginning on first batch of reads may vary by a very small margin. It may then impact mapping of a few reads down the line. If that hypothesis is true, then we can fix undeterministic behavior with either:

--preserve-map-align-order true but this will have a small impact on performance
Or by setting insert size at command line if you know it already. For example with --Aligner.pe-stat-mean-insert 337 --Aligner.pe-stat-mean-read-len 151 --Aligner.pe-stat-quartiles-insert 256 319 409 --Aligner.pe-stat-stddev-insert 114

rizkg commented 2 years ago

Hi, we have been able to reproduce the issue with your command line. The insert size statistics computed at the beginning does vary a little from one run to the other (e.g. from 337.38 to 337.41) We then tried with setting --preserve-map-align-order true and it fixed the issue, so this non determinism indeed seem to be caused by the varying insert size estimates. Can you try with --preserve-map-align-order true ?

droazen commented 2 years ago

@rizkg I uploaded a pair of example differing output bams today to the link you provided over email. Can you confirm that the differences present in those output bams could be caused by the insert size statistics? Thanks for your help!

rizkg commented 2 years ago

Yes thanks, I got the files. I will check if there is some other issue.

skchronicles commented 2 years ago

@rizkg @droazen I just saw this issue was still open, and I was curious if this problem was addressed in the May release (1.3.0) of DRAGMAP or if you are still determining the cause of the issue?

TonyKess commented 2 years ago

@rizkg @droazen I just saw this issue was still open, and I was curious if this problem was addressed in the May release (1.3.0) of DRAGMAP or if you are still determining the cause of the issue?

I am wondering the same thing. For building pipelines for non-model species, is it "safe" to use dragmap yet, or should we default to bwa-mem?

kbergin commented 2 years ago

Hi all, apologies for the delay on this, we've been short staffed. We have a PR with the suggested change from James and will prioritize testing it to confirm it fixes the issue as soon as we can.

Illumina / DRAGMAP

Confirmed non-determinism in the DRAGMAP aligner output #31