macs3-project / MACS

MACS -- Model-based Analysis of ChIP-Seq
https://macs3-project.github.io/MACS/
BSD 3-Clause "New" or "Revised" License

Q: HMMRATAC reproducibility #638

Open igordot opened 3 months ago

igordot commented 3 months ago

I noticed that HMMRATAC does not seem to be deterministic. If I run it multiple times, sometimes I get a different number of peaks. I see that there is a --randomSeed parameter that is set by default. Is that not controlling for everything?

taoliu commented 2 months ago

It is possible that certain functions we use from NumPy/SciPy/scikit-learn/hmmlearn escaped the randomSeed setting. But we do have GitHub Actions testing for the macs3 code to make sure that every time we update the code, the results are at least 'similar'. Do you know how big the difference is?
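To illustrate how a random draw can "escape" a seed setting, here is a minimal sketch using only Python's standard `random` module. It is not MACS3 code: the function names and the seed value 42 are hypothetical, and the actual cause in HMMRATAC has not been identified in this thread. The point is only that seeding one generator has no effect on a second generator that is created without being handed the seed.

```python
import random


def draw_with_global_seed(seed):
    # Seed the module-level generator and draw from it: reproducible.
    random.seed(seed)
    return [random.random() for _ in range(3)]


def draw_with_private_rng(seed):
    # Seed the module-level generator, but draw from a separately
    # constructed generator that never sees the seed (hypothetical
    # "escaped" code path): not reproducible.
    random.seed(seed)
    private = random.Random()  # seeded from OS entropy or time, not from `seed`
    return [private.random() for _ in range(3)]


def draw_with_passed_seed(seed):
    # Pass the seed explicitly to the private generator: reproducible again.
    private = random.Random(seed)
    return [private.random() for _ in range(3)]


if __name__ == "__main__":
    print(draw_with_global_seed(42) == draw_with_global_seed(42))
    print(draw_with_private_rng(42) == draw_with_private_rng(42))
    print(draw_with_passed_seed(42) == draw_with_passed_seed(42))
```

If something similar happens inside a dependency, the usual fix is to pass the seed explicitly to every component that accepts one rather than relying on a single global setting.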

igordot commented 2 months ago

The differences vary. For example, the peak count changed from 4,000 to 5,400, from 2,600 to 2,100, and from 3,100 to 8,100 between runs. Obviously, these are not very good samples, so that may have something to do with it.
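Peak counts alone do not show whether the runs agree on where the peaks are. As a rough sketch (not part of MACS3), the overlap between two runs could be measured as a base-pair Jaccard index. The code assumes each run's peaks have already been loaded as `(chrom, start, end)` tuples with half-open coordinates; the function names and the example intervals are made up for illustration.

```python
from collections import defaultdict


def merge(intervals):
    # Merge overlapping or touching (start, end) intervals.
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return merged


def by_chrom(peaks):
    # Group peaks by chromosome and merge within each chromosome.
    grouped = defaultdict(list)
    for chrom, start, end in peaks:
        grouped[chrom].append((start, end))
    return {chrom: merge(ivs) for chrom, ivs in grouped.items()}


def covered_bp(grouped):
    # Total number of bases covered by the merged intervals.
    return sum(end - start for ivs in grouped.values() for start, end in ivs)


def shared_bp(a, b):
    # Bases covered in both runs, via a two-pointer sweep per chromosome.
    total = 0
    for chrom in a.keys() & b.keys():
        xs, ys = a[chrom], b[chrom]
        i = j = 0
        while i < len(xs) and j < len(ys):
            lo = max(xs[i][0], ys[j][0])
            hi = min(xs[i][1], ys[j][1])
            if lo < hi:
                total += hi - lo
            # Advance whichever interval ends first.
            if xs[i][1] < ys[j][1]:
                i += 1
            else:
                j += 1
    return total


def jaccard(peaks_a, peaks_b):
    # Shared bases divided by bases covered in either run.
    a, b = by_chrom(peaks_a), by_chrom(peaks_b)
    inter = shared_bp(a, b)
    union = covered_bp(a) + covered_bp(b) - inter
    return inter / union if union else 1.0


if __name__ == "__main__":
    # Made-up intervals, only to show the call.
    run1 = [("chr1", 100, 200), ("chr1", 300, 400)]
    run2 = [("chr1", 150, 250), ("chr2", 0, 100)]
    print(jaccard(run1, run2))
```

A value near 1 would mean the runs mostly call the same regions even if the counts differ (for example, when one run splits a region that another run reports as a single peak). A value near 0 would mean the runs genuinely disagree.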