QuantGen / BEDMatrix

A matrix-like wrapper around PLINK .bed files

BEDMatrix instance has been unmapped #23

Closed by TabeaSchoeler 1 year ago

TabeaSchoeler commented 2 years ago

Hi all, I have been trying to run a simple chunkedApply() call using parallel processing in R, with either slurm_apply or parLapply. In both cases, I get the error "BEDMatrix instance has been unmapped". The call below runs without problems when I use serial processing.

Any idea what's causing problems here?

Many thanks for your help, Tabea

chunkedApply(X = geno(bg), MARGIN = 2, FUN = sd, j="rs11665242_G")

agrueneberg commented 2 years ago

Hi Tabea,

Parallel computing in R is hard. In hopefully simple terms, slurm_apply and parLapply each create a new execution environment in which to run the given function, and a BEDMatrix object cannot travel into this environment, hence the error you get when trying to access the data. It should work if you create the BEDMatrix object within the function passed to slurm_apply or parLapply.
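
A minimal sketch of that suggestion with parLapply. It assumes the example .bed file bundled with the BEDMatrix package, a local two-worker cluster, and an arbitrary split of the first 10 columns; none of these come from the original post, so substitute your own path and column chunks.

```r
library(parallel)

# Assumption: the example .bed file shipped with BEDMatrix; replace with your own path.
bed_path <- system.file("extdata", "example.bed", package = "BEDMatrix")

# Hypothetical work split: the first 10 columns in two chunks, one per worker.
col_chunks <- split(1:10, rep(1:2, each = 5))

cl <- makeCluster(2)
sds <- parLapply(cl, col_chunks, function(cols, path) {
  # Open the file inside the worker, so the object is created in that process
  # instead of being shipped over from the parent session.
  geno <- BEDMatrix::BEDMatrix(path)
  apply(geno[, cols, drop = FALSE], 2, sd, na.rm = TRUE)
}, path = bed_path)
stopCluster(cl)

# One standard deviation per requested column.
sds <- unlist(sds, use.names = FALSE)
```

Only the file path (a plain character string) is sent to the workers; each worker opens its own BEDMatrix object from it.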

That said, chunkedApply is already parallelized (using mclapply, which clones the current execution environment). The number of cores can be controlled with the nCores parameter (and should be set to 1 if nested within another parallel apply function). If you want to parallelize across nodes (which may make more sense on a compute cluster), please check the "Computations on subsets of a file-backed array" section in the paper.
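
As a rough illustration of the nCores parameter, the sketch below calls chunkedApply directly on a BEDMatrix object with forking turned off. It assumes the BGData package is installed, that chunkedApply accepts a BEDMatrix object as X, and that the example .bed file bundled with BEDMatrix is used in place of real data; treat these as assumptions to check against your installed versions.

```r
library(BEDMatrix)
library(BGData)

# Assumption: the example .bed file shipped with BEDMatrix; replace with your own path.
geno <- BEDMatrix(system.file("extdata", "example.bed", package = "BEDMatrix"))

# nCores = 1 keeps chunkedApply from forking, which is the setting to use
# when this call is itself nested inside another parallel apply function.
# na.rm = TRUE is forwarded to sd() so missing genotypes are ignored.
sds <- chunkedApply(X = geno, MARGIN = 2, FUN = sd, nCores = 1, na.rm = TRUE)
```

When chunkedApply is the outermost call, raising nCores lets it do the parallelization itself.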

Hope that helps, Alex