Closed ionox0 closed 6 years ago
We have not intentionally introduced any stochastic functionality to ABRA. There is some pseudo-random downsampling of reads in regions of high coverage; however, the same seed is always used when randomly selecting reads, so this should be repeatable.
That being said, we have not explicitly tested this and cannot at this time guarantee that all contigs and/or realignments are deterministic.
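To illustrate why a fixed seed makes downsampling repeatable, here is a minimal Python sketch. It is not ABRA's code: the `downsample` helper, the seed value, and the read names are all made up for illustration. It also shows the main caveat, which is that the selection only repeats if the reads arrive in the same order.

```python
import random

def downsample(reads, max_reads, seed=1):
    """Pick at most max_reads reads with a fixed seed (hypothetical helper, not ABRA's)."""
    if len(reads) <= max_reads:
        return list(reads)
    # A private generator, so other code touching the global RNG cannot interfere.
    rng = random.Random(seed)
    return rng.sample(reads, max_reads)

# Made-up read names standing in for a high-coverage region.
reads = [f"read{i}" for i in range(1000)]

first = downsample(reads, 100)
second = downsample(reads, 100)
print(first == second)  # True: same seed and same input order give the same selection

# If the input order differs between runs, the same seed selects different reads.
reordered = downsample(list(reversed(reads)), 100)
print(sorted(first) == sorted(reordered))
```

So a constant seed is only sufficient for repeatability when everything upstream of the selection (read order, region boundaries, thread scheduling) is also stable.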
Thank you for the information 👍
I looked into this, and it seems that there is a possibility of non-deterministic results in repetitive regions at the ends of reads. However, out of 4000 realigned reads, this inconsistency was only present in 50 of them.
Two specific examples I found were a difference of 2 bp in an insertion:
Or an insertion being repositioned to the other side of a repetitive region:
Sorry, this may actually be due to another issue. I'm rerunning this comparison.
Thank you for your work developing this tool.

I'm trying to validate two pipelines that are being used at our institution. I was wondering whether the `YA` tag should always result in the same value between runs on the same input BAM, or whether the assembly process could potentially result in different contigs between pipeline runs. Any information would be appreciated.

Thank you, Ian
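For anyone wanting to run the same check, a comparison along these lines could be sketched as follows. This is only a rough sketch using the standard library: it works on SAM text lines (for example, the output of `samtools view` on each run), and it assumes the `YA` tag is stored as a string (`YA:Z:`), which I have not confirmed. The helper names and the example records are made up.

```python
def ya_tags(sam_lines):
    """Map each alignment record to its YA tag value (None if the tag is absent)."""
    tags = {}
    for line in sam_lines:
        if line.startswith("@"):  # skip SAM header lines
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:  # not a complete alignment record
            continue
        # Key on read name plus the first/second-in-pair, secondary and
        # supplementary flag bits, so mates and extra alignments stay distinct.
        key = (fields[0], int(fields[1]) & 0x9C0)
        # Assumption: YA is a string-typed optional field.
        ya = next((f[5:] for f in fields[11:] if f.startswith("YA:Z:")), None)
        tags[key] = ya
    return tags

def diff_ya(run_a, run_b):
    """Return records whose YA value differs between the two runs."""
    a, b = ya_tags(run_a), ya_tags(run_b)
    return {k: (a.get(k), b.get(k))
            for k in sorted(a.keys() | b.keys())
            if a.get(k) != b.get(k)}

# Made-up records for illustration only; real input would come from the two pipeline runs.
run_a = [
    "@HD\tVN:1.6",
    "r1\t99\tchr1\t100\t60\t10M\t=\t200\t110\tACGTACGTAC\tFFFFFFFFFF\tYA:Z:contigA",
    "r2\t0\tchr1\t300\t60\t10M\t*\t0\t0\tACGTACGTAC\tFFFFFFFFFF\tYA:Z:contigB",
]
run_b = [
    "@HD\tVN:1.6",
    "r1\t99\tchr1\t100\t60\t10M\t=\t200\t110\tACGTACGTAC\tFFFFFFFFFF\tYA:Z:contigA",
    "r2\t0\tchr1\t300\t60\t10M\t*\t0\t0\tACGTACGTAC\tFFFFFFFFFF\tYA:Z:contigC",
]

for key, (old, new) in diff_ya(run_a, run_b).items():
    print(key, old, new)
```

An empty result from `diff_ya` would mean every record carried the same `YA` value in both runs.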