Open cchwala opened 2 years ago
For the record: If I recall correctly (edit: I did, and updated the link to the relevant comment https://github.com/OpenSenseAction/OPENSENSE_sandbox/pull/37#issuecomment-1290724197 in the intro text of this PR) the OOM kills of the kernel also happened when plotting several time series with matplotlib. Hence, a solution that only limits memory usage of dask is not sufficient.
Maybe a variation of this solution to limit RAM usage for a python program could help.
```python
import resource

def set_memory_limit(max_mem_GB=2, soft_hard_fraction=0.9):
    hard_limit = max_mem_GB * 1e9  # not sure what the correct units are here...
    soft_limit = hard_limit * soft_hard_fraction
    resource.setrlimit(resource.RLIMIT_AS, (soft_limit, hard_limit))
```
(not tested locally, just typed it here, hence, there might be typos)
The code from above required casting `soft_limit` and `hard_limit` to `int`. But, anyway, it did not work as expected.
This is the variation of the above code that I used:

```python
import resource

def set_memory_limit(max_mem_GB=2):  # , soft_hard_fraction=0.9):
    soft, hard = resource.getrlimit(resource.RLIMIT_AS)
    soft = int(max_mem_GB * 1e9)
    # resource.setrlimit(resource.RLIMIT_AS, (int(soft_limit), int(hard_limit)))
    resource.setrlimit(resource.RLIMIT_AS, (soft, hard))
```
After calling `set_memory_limit()`, `resource.getrlimit(resource.RLIMIT_AS)` would give me `(2000000000, -1)`.
But the only result is that when I run, e.g., `ds = oddt.tranform_fencl_2021_Eband_data('data/fencl_2021_Eband_data/Dataset_1.0.0.zip')` in the data example notebook, I get a long error message which contains, e.g., `MemoryError: Unable to allocate output buffer.`
Full trace is here:
See this related response on stackoverflow: What happens when python reaches it's soft memory limit?
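To illustrate the linked answer: once the soft `RLIMIT_AS` limit is exceeded, allocations fail with a catchable `MemoryError` inside the process, rather than the pod's OOM killer terminating the kernel. A minimal sketch (assuming a Linux host where `RLIMIT_AS` is enforced; the 1 GB cap is an arbitrary illustration value):

```python
import resource

soft, hard = resource.getrlimit(resource.RLIMIT_AS)
# cap the address space at 1 GB (soft limit only, hard limit unchanged)
resource.setrlimit(resource.RLIMIT_AS, (1_000_000_000, hard))

caught = False
try:
    buf = bytearray(2_000_000_000)  # deliberately exceeds the soft limit
except MemoryError:
    caught = True  # allocation is refused, but the process survives

resource.setrlimit(resource.RLIMIT_AS, (soft, hard))  # restore old limits
print("MemoryError raised:", caught)
```

So with a correctly set soft limit we would at least get a traceback instead of a silently killed kernel.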
I have been dealing with killed kernel due to memory consumption in #44 and others have also experienced the same thing (see https://github.com/OpenSenseAction/OPENSENSE_sandbox/pull/37#issuecomment-1290724197).
Here is a link to an article that explains why this happens on binder:
Detecting CPU and RAM limits on mybinder.org
The problem is that one is restricted to 1 CPU and limited RAM in a binder pod, but when requesting info on the available resources, one gets what is physically available on the machine that hosts the pod. Hence, calculations or plotting will cause OOM kills (OOM = out of memory) because Python (or the packages used) thinks it can comfortably increase memory usage further, but then the kernel dies, e.g. at 2 GB RAM usage.
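The linked article works around this by reading the container's cgroup limit directly instead of asking for physical RAM. The following is a minimal sketch of that idea, assuming a Linux host with cgroup v1 or v2 (the helper name `detect_ram_limit_bytes` is mine, not from the article):

```python
import os

def detect_ram_limit_bytes():
    """Return the cgroup memory limit if one is set, else total physical RAM.

    In a binder pod the cgroup file reports the pod's real allocation,
    while os.sysconf reports the host machine's RAM.
    """
    physical = os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")
    for path in (
        "/sys/fs/cgroup/memory/memory.limit_in_bytes",  # cgroup v1
        "/sys/fs/cgroup/memory.max",                    # cgroup v2
    ):
        try:
            with open(path) as f:
                raw = f.read().strip()
            if raw != "max":  # cgroup v2 reports "max" when unlimited
                return min(int(raw), physical)
        except (FileNotFoundError, PermissionError, ValueError):
            continue
    return physical  # no cgroup limit found, fall back to host RAM
```

A value from this function could then be fed into `set_memory_limit()` above instead of a hard-coded 2 GB.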
One solution, which I found in a dask example notebook, could be to explicitly set the `memory_limit` for dask. It also makes sense to set `n_workers=1` because we only have 1 CPU available in the binder pod, but dask might want to use all CPUs that are visible from the server on which the pod is running. Using more workers than CPUs could slow down calculations.
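The snippet from that notebook is not quoted here, but with the standard `dask.distributed.Client` API an explicit cap would look roughly like this (the 2GB value mirrors the pod limit discussed above and is an assumption on my part):

```python
from dask.distributed import Client

# Cap dask explicitly so it matches the pod's real allocation
# instead of what the host machine reports.
client = Client(
    processes=False,       # keep workers in-process for this small setup
    n_workers=1,           # only 1 CPU is guaranteed in a binder pod
    threads_per_worker=1,
    memory_limit="2GB",    # pause/restart workers before the pod's OOM killer fires
)
```

This only constrains dask itself, though, so per the first comment above it would not prevent OOM kills caused by matplotlib.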