fmicompbio / monaLisa

binned motif enrichment analysis and visualisation
https://fmicompbio.github.io/monaLisa/
GNU General Public License v3.0

not reproducible results - set.seed is not considered by calcBinnedMotifEnrR #77

Closed NathanHarmston closed 3 months ago

NathanHarmston commented 3 months ago

Hi,

So I am using calcBinnedMotifEnrR to calculate enrichment for a set of regions compared to the genome. However, the results are not reproducible between different runs of the same piece of code.

set.seed(42)
se.dc1 <- calcBinnedMotifEnrR(seqs = prom.seqs.dc1,
                              background = "genome",
                              genome.regions = bg,
                              pwmL = pwms,
                              genome = BSgenome.Hsapiens.UCSC.hg38,
                              genome.oversample = 20,
                              min.score = "80%")

As an example: in one run I get argfx, lin54 and zic2 as significant; in the next run only argfx appears as significant.

It seems that the sampling from genome.regions does not respect the call to set.seed. Is there any way to fix this?

csoneson commented 3 months ago

Hi @NathanHarmston - did you try to set the RNGseed parameter of the SerialParam/MulticoreParam instance set for the BPPARAM argument? See the Details section of ?calcBinnedMotifEnrR:

genome : sequences randomly sampled from the genome (or the intervals defined in genome.regions if given) [...] In order to make the sampling deterministic, a seed number needs to be provided to the RNGseed parameter in SerialParam or MulticoreParam when creating the BiocParallelParam instance in BPPARAM.
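A minimal sketch of what that could look like, assuming BiocParallel is installed. The first part only illustrates that two param objects created with the same RNGseed yield the same random draws. The wrapper `run_enrichment` and its argument names are hypothetical and just repackage the call from the original post with a seeded BPPARAM; the seed value 42 is arbitrary.

```r
library(BiocParallel)

# Two independent param objects created with the same RNGseed.
# Each bplapply() call draws its random numbers from the seeded stream.
run1 <- bplapply(1:3, function(i) runif(2), BPPARAM = SerialParam(RNGseed = 42L))
run2 <- bplapply(1:3, function(i) runif(2), BPPARAM = SerialParam(RNGseed = 42L))
identical(run1, run2)

# Hypothetical wrapper (not part of monaLisa): the call from the post above,
# with the seed passed via BPPARAM instead of set.seed().
run_enrichment <- function(seqs, regions, pwms, genome, seed = 42L) {
  monaLisa::calcBinnedMotifEnrR(
    seqs = seqs,
    background = "genome",
    genome.regions = regions,
    pwmL = pwms,
    genome = genome,
    genome.oversample = 20,
    min.score = "80%",
    BPPARAM = BiocParallel::SerialParam(RNGseed = seed)
  )
}
```

With the objects from the post, this would be called as `run_enrichment(prom.seqs.dc1, bg, pwms, BSgenome.Hsapiens.UCSC.hg38)`. For parallel runs, `MulticoreParam(workers = ..., RNGseed = seed)` could be swapped in for `SerialParam`.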

NathanHarmston commented 3 months ago

figured I was being a muppet! Thanks