Results not reproducible even when random seed '-S' is set

jeffchen2000 commented 5 years ago

When I run shorah.py with '-S 100', the results are not reproducible. (I downloaded and installed ShoRAH in November 2018.)

DrYak commented 5 years ago

How much variation are you seeing?
Which version are you using?

In older versions, there was a bug causing memory corruption. This is fixed in version v1.1.3.

This would have caused completely random-looking results in a few corner cases.

The computation of results depends on the math library and is thus implementation-dependent. Running it on a Mac laptop and on a Linux HPC cluster can yield slightly different results (the diri_sampler step is computed in log-space, so it's typically affected by this kind of rounding difference between implementations).

This would cause results that are roughly the same, save for fraction-of-a-percent differences in the scores (and probably a different ordering in the CSV file).
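To make the point above concrete, here is a minimal Python sketch (not ShoRAH code). The first part shows that floating-point addition is order-sensitive, which is the kind of effect that differs between math libraries. The second part shows one way to compare two runs while ignoring row order and tiny score differences. The function name `same_within_tolerance`, the haplotype names, the scores and the 1% tolerance are all made up for illustration.

```python
import math

# Floating-point addition is not associative: the grouping of operations
# (which can differ between math libraries and compilers) changes the last bits.
left = (0.1 + 0.2) + 0.3
right = 0.1 + (0.2 + 0.3)
print(left == right)              # False
print(math.isclose(left, right))  # True


def same_within_tolerance(run_a, run_b, rel_tol=1e-2):
    """Compare two {name: score} dicts, ignoring order and small score differences.

    Hypothetical helper; rel_tol=1e-2 (1%) is an arbitrary illustrative choice.
    """
    if run_a.keys() != run_b.keys():
        return False
    return all(math.isclose(run_a[k], run_b[k], rel_tol=rel_tol) for k in run_a)


# Made-up scores standing in for the output of two platforms.
mac = {"hap_1": 0.6123, "hap_2": 0.3877}
linux = {"hap_2": 0.3879, "hap_1": 0.6121}
print(same_within_tolerance(mac, linux))  # True
```

Comparing runs this way, rather than with a byte-for-byte diff of the CSV files, avoids flagging differences that come only from rounding or row ordering.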

We're seeing the same kind of difference after having swapped the math engines between ShoRAH 1.x and the upcoming ShoRAH 2.0.

DrYak commented 4 years ago

We will release previews of ShoRAH 2.x shortly. Please note that in the newer version, -R is now used for the seed and -S is used for the sigma parameter when analysing strand bias.

It has a different random engine than ShoRAH 1.x, so even when using the same seed parameter, the results will differ.
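As a generic illustration of why a seed is only meaningful for a given engine, here is a small Python sketch. It does not use either of ShoRAH's actual random engines: `random.Random` is Python's Mersenne Twister, and `TinyLCG` is a hypothetical toy generator written just for this example.

```python
import random


class TinyLCG:
    """Hypothetical second engine: a minimal linear congruential generator."""

    def __init__(self, seed):
        self.state = seed

    def random(self):
        # Classic LCG update; the constants are for illustration only.
        self.state = (1103515245 * self.state + 12345) % 2**31
        return self.state / 2**31


def draws(engine, n=5):
    """Return the first n values produced by an engine."""
    return [engine.random() for _ in range(n)]


seed = 100
mt_first = draws(random.Random(seed))
mt_second = draws(random.Random(seed))
lcg = draws(TinyLCG(seed))

print(mt_first == mt_second)  # True: same engine, same seed, same stream
print(mt_first == lcg)        # False: same seed, different engine
```

The same seed reproduces a run only when the engine stays the same, so results from 1.x and 2.x should not be expected to match even with identical seed values.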

cbg-ethz / shorah, issue #59