cbg-ethz / shorah

Repo for the software suite ShoRAH (Short Reads Assembly into Haplotypes)
GNU General Public License v3.0

Results not reproducible even when random seed '-S' is set #59

Open jeffchen2000 opened 5 years ago

jeffchen2000 commented 5 years ago

When I run shorah.py with '-S 100', the results are not reproducible. (I downloaded and installed ShoRAH in Nov 2018.)

DrYak commented 5 years ago

In older versions, there was a bug causing memory corruption. This is fixed in version v1.1.3.

The computation of results depends on the math library and is thus implementation-dependent. Running it on a Mac laptop and on a Linux HPC cluster can yield slightly different results (the diri_sampler step is computed in log-space, so it's typically affected by this kind of difference in rounding between implementations).
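To illustrate the rounding effect described above, here is a minimal Python sketch. It is not ShoRAH code: the values and the `threshold` comparison are made up for illustration. It shows that floating-point addition is not associative, so two implementations that evaluate the same formula in a different order can disagree in the last digit, and that such a tiny disagreement can flip a yes/no decision in a sampler.

```python
import math

# The same three numbers, summed in two different orders.
# A different math library or compiler may pick either order.
a = (0.1 + 0.2) + 0.3
b = 0.1 + (0.2 + 0.3)

# The two results are extremely close, but not bit-identical.
nearly_equal = math.isclose(a, b)
bit_identical = (a == b)

# Hypothetical decision step: a sampler comparing a computed value
# against a threshold (the threshold here is an assumption).
threshold = 0.6
choice_a = "accept" if a > threshold else "reject"
choice_b = "accept" if b > threshold else "reject"

print(nearly_equal, bit_identical, choice_a, choice_b)
```

Once a single decision differs, every later step of a stochastic sampler can diverge, even though the seed is identical.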

We're seeing the same kind of difference after swapping the math engines between ShoRAH 1.x and the upcoming ShoRAH 2.0.

DrYak commented 4 years ago

We will release previews of ShoRAH 2.x shortly. Please note that in the newer version, -R is now used for the seed and -S is for the sigma parameter when analysing strand bias.

It has a different random engine than ShoRAH 1.x, so even when using the same seed parameter, the results will differ.
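As a rough illustration of why the same seed does not carry over between random engines, here is a small Python sketch. It does not use ShoRAH's actual engines: `TinyLCG` is a hypothetical toy generator written only for this example, compared against Python's built-in Mersenne Twister (`random.Random`).

```python
import random


class TinyLCG:
    """Hypothetical toy linear congruential generator (illustration only)."""

    def __init__(self, seed):
        self.state = seed % 2**32

    def random(self):
        # Classic LCG update step; returns a float in [0, 1).
        self.state = (1664525 * self.state + 1013904223) % 2**32
        return self.state / 2**32


def draws(engine, n=5):
    """Take the first n draws from any engine exposing .random()."""
    return [engine.random() for _ in range(n)]


# Same engine, same seed: the stream is reproducible.
same_engine_1 = draws(random.Random(100))
same_engine_2 = draws(random.Random(100))

# Different engine, same seed: an unrelated stream.
other_engine = draws(TinyLCG(100))

print(same_engine_1 == same_engine_2)
print(same_engine_1 == other_engine)
```

A seed only pins down the stream for one specific engine, so results are comparable only between runs of the same version.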