gdrplatform / gDRcore

R package to process dose-response curve data with the GR methods
https://gdrplatform.github.io/gDRcore
No parallelization control in `create_SE` #13

Closed ChristopherEeles closed 1 year ago

ChristopherEeles commented 1 year ago

Hi gDR team,

Users currently have no way to control parallelization inside `create_SE`, and this is very nearly crashing my computer: I run out of RAM and end up swapping.

This is happening with a relatively small dataset (6x6 drug combo, ~500 experiments), so I imagine it will only get worse as I try to incorporate larger ones.

Have you considered using BiocParallel so that users can configure the parallelization back-end themselves? It has the added benefit that you can pass in a `SerialParam` object for debugging, which enables stepping through execution.

At minimum you need a way to cap the number of threads used.
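
To illustrate the two requests above, here is a minimal sketch of the usual BiocParallel pattern. It is not gDRcore code: `fit_one` and `process_all` are hypothetical stand-ins, and whether `create_SE` would expose a `BPPARAM` argument in this way is an assumption.

```r
library(BiocParallel)

# Hypothetical stand-in for a per-experiment computation (not a gDRcore function).
fit_one <- function(x) x^2

# Hypothetical wrapper: the caller chooses the back-end through BPPARAM.
# bpparam() returns the currently registered default back-end.
process_all <- function(xs, BPPARAM = bpparam()) {
  bplapply(xs, fit_one, BPPARAM = BPPARAM)
}

# Serial back-end: everything runs in the current R process,
# so browser() and debug() can step through fit_one.
serial_result <- process_all(1:6, BPPARAM = SerialParam())

# Forked back-end capped at two workers, which bounds CPU and memory use.
# MulticoreParam relies on forking, which is not available on Windows.
capped_result <- process_all(1:6, BPPARAM = MulticoreParam(workers = 2))
```

A user could also call `register(MulticoreParam(workers = 2))` once per session so that every function defaulting to `bpparam()` picks up the same cap.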

I have worked around this for now, but it will likely cause trouble for other users.

Best, Chris

ChristopherEeles commented 1 year ago

The same applies to `map_df`. There are probably more places that I haven't encountered yet.

ChristopherEeles commented 1 year ago

I see the environment variable inside of `detect_cores` now and will use that to configure my parallelization.

gladkia commented 1 year ago

Thanks, we will start working on this enhancement during the current sprint.

bczech commented 1 year ago

@ChristopherEeles the BiocParallel-based solution has already been implemented and merged into master. We'd be happy to hear your feedback.