zeeev / wham

Structural variant detection and association testing
Other
100 stars 26 forks source link

Stochastic output #51

Open mwalker174 opened 4 years ago

mwalker174 commented 4 years ago

We are observing non-deterministic behavior of whamg when run multiple times across cloud VMs on the same BAM file, where about 5-10% of calls are different. I see from a quick search of the code base that random sampling is used to estimate metrics (this seems to be seeded to the system clock with srand in wham and not in whamg), but from the logs it looks like these estimates converge consistently.

Can anyone reproduce this behavior?

Either way, it would also be very useful to set the random seed manually from the command line.

zeeev commented 4 years ago

Hi @mwalker174,

I've marked this as a question, but it's possible it is a bug. Let me think about this for a bit.