2i2c-org / docs

Documentation for 2i2c community JupyterHubs.
https://docs.2i2c.org

Increasing cores and killing processes #69

Closed · JILPulvino closed this 3 years ago

JILPulvino commented 3 years ago

Within an RStudio session I tried to add a number of cores and create a cluster for a calculation using the parallel package. Using that package, I called makeCluster() to add a number of cores to a cluster, at which point the process just hung. I killed the process and started a new RStudio session. This time, using benchmarkme, I attempted to time a matrix calculation on four cores, at which point the process hung again. Is this behavior expected?

Within the terminal, I checked the running processes with ps aux, and the multiple R processes I initialized are listed, but they are zombies. I attempted to kill them but was unable to remove them. How might I remove these zombie processes? I didn't want to kill the RStudio parent process, as I was unsure whether that would affect other things.
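For anyone hitting the same thing, here is a rough sketch of how stray R worker processes can be inspected and signalled from the R console; the PID below is a placeholder, and the signal choice is an assumption rather than something verified on this hub:

```r
# List this user's processes that look like R workers.
# Zombie ("Z" state) processes cannot be killed directly; they only go away
# once their parent process (here, the rsession) reaps them or is restarted.
system("ps -u $(whoami) -o pid,stat,command | grep '[R]'")

# A stuck (but not zombie) worker can be signalled from R; 12345 is a
# placeholder PID taken from the ps output above.
# tools::pskill(12345, signal = tools::SIGTERM)
```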

yuvipanda commented 3 years ago

Sorry to hear you're having troubles.

Can you provide some example code that exhibits this hanging behavior? That will help me debug.

You can always start and stop the server (from https:///hub/home) to get a clean slate.

JILPulvino commented 3 years ago

library(parallel)
cl <- makeCluster(8)

or

install.packages('benchmarkme')
library(benchmarkme)
benchmark_matrix_cal(cores=8)

In briefly testing this: going to 4 cores caused no problems, but going to 8 started the hang.
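As a possible variation of the snippets above (a sketch, not something tested on this hub), one can cap the worker count at what detectCores() reports and always release the workers afterwards, so no leftover R processes accumulate:

```r
library(parallel)

# Cap the worker count at what R can actually see, rather than hard-coding 8.
n_workers <- max(1, detectCores() - 1)
cl <- makeCluster(n_workers)

# Stand-in workload for the real matrix calculation.
res <- parLapply(cl, 1:8, function(i) sum(rnorm(1e6)))

stopCluster(cl)  # always release the workers when done
```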

damianavila commented 3 years ago

@JILPulvino what is the output of detectCores() using the parallel package?

Given this line...

> going to 4 cores caused no problems, but going to 8 started the hang

I would bet you have 4 available cores there, and trying to use more than that is somehow, maybe, causing the issue...

JILPulvino commented 3 years ago

I also thought this might be the issue, but detectCores() always returns 2 when I start up an RStudio session.

I've also just checked: when I start up 4 cores using makeCluster(), the cluster initially starts, but if I wait a few moments and then check ps aux in the terminal, those additional processes are being killed off fairly quickly. I'm not sure if this is expected behavior?
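A rough way to confirm whether the workers really die after a short while (assuming the default PSOCK cluster from makeCluster(); the 30-second pause is arbitrary):

```r
library(parallel)

cl <- makeCluster(4)
clusterCall(cl, Sys.getpid)  # should return 4 worker PIDs right away

Sys.sleep(30)  # wait, then probe the same workers again

# If the workers were killed in the meantime, this call fails with a
# connection-style error instead of returning PIDs.
tryCatch(clusterCall(cl, Sys.getpid),
         error = function(e) message("workers gone: ", conditionMessage(e)))

try(stopCluster(cl), silent = TRUE)
```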

damianavila commented 3 years ago

If you explicitly use 2 cores when you call makeCluster(), are those killed as well? Just curious...

I am not actually sure about the expected behavior for R parallelization on Kubernetes-based deployments, but I presume there might be issues if the parallel (or any other) package detects the underlying node architecture and tries to do stuff based on that assumption, when it actually has access to only a part of it (through the pod). Btw, this is just speculation at the moment, so take my words inside that context :wink:
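If that speculation is right, the pod's CPU limit should be visible from inside the container. A sketch for comparing detectCores() with the cgroup quota; the cgroup v1 paths below are an assumption and may not match this image (cgroup v2 uses a different layout):

```r
library(parallel)

detectCores()  # what R reports (often the underlying node, not the pod limit)

# cgroup v1 layout (an assumption about this image); quota / period = CPUs
quota_file  <- "/sys/fs/cgroup/cpu/cpu.cfs_quota_us"
period_file <- "/sys/fs/cgroup/cpu/cpu.cfs_period_us"

if (file.exists(quota_file)) {
  quota  <- as.numeric(readLines(quota_file))
  period <- as.numeric(readLines(period_file))
  if (quota > 0) message("CPU limit from cgroup: ", quota / period, " cores")
} else {
  message("cgroup v1 CPU files not found; this may be a cgroup v2 node")
}
```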

choldgraf commented 3 years ago

Hey all - what's the status on this one? Is there anything to be done?

yuvipanda commented 3 years ago

Processes being killed is almost always a function of using too much memory rather than anything wrong with the code. Unfortunately, I don't know of a good way to check memory usage in RStudio, but if you have a notebook open at the same time, it'll show you that at the top right. We can bump up your memory if that's the case, and that should help a bit.
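For what it's worth, here is a rough sketch of checking memory from the R console itself via the cgroup accounting files; the cgroup v1 paths are an assumption and may differ on this cluster:

```r
usage_file <- "/sys/fs/cgroup/memory/memory.usage_in_bytes"  # cgroup v1 (assumption)
limit_file <- "/sys/fs/cgroup/memory/memory.limit_in_bytes"

if (file.exists(usage_file) && file.exists(limit_file)) {
  usage_gib <- as.numeric(readLines(usage_file)) / 1024^3
  limit_gib <- as.numeric(readLines(limit_file)) / 1024^3
  message(sprintf("memory: %.2f GiB used of %.2f GiB limit", usage_gib, limit_gib))
} else {
  message("cgroup v1 memory files not found on this node")
}
```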

JILPulvino commented 3 years ago

Sorry, I fell off on checking this. We found an alternative method that doesn't require us to use as much memory (we think), which I still need to experiment with. We also just need to do some experimentation on our end with the proper way to scale memory in both R and Python in Kubernetes.

choldgraf commented 3 years ago

Thanks for the update @JILPulvino - I'm going to close this one, though we can reopen it if need be, or you can open a new one as you explore more.