OpenSenseAction / OPENSENSE_sandbox

Collection of runnable examples with software packages for processing opportunistic rainfall sensors
BSD 3-Clause "New" or "Revised" License

Limit memory usage for examples to not crash kernels when using binder #46

Open cchwala opened 2 years ago

cchwala commented 2 years ago

I have been dealing with a killed kernel due to memory consumption in #44, and others have also experienced the same thing (see https://github.com/OpenSenseAction/OPENSENSE_sandbox/pull/37#issuecomment-1290724197).

Here is a link to an article that explains why this happens on binder:

Detecting CPU and RAM limits on mybinder.org

The problem is that a binder pod is restricted to 1 CPU and limited RAM, but when requesting info on the available resources, one gets what is physically available on the machine that hosts the pod. Hence, calculations or plotting will cause OOM (out of memory) kills, because Python (or the packages used) thinks it can comfortably increase memory usage further, but then the kernel dies, e.g. at 2 GB RAM usage.
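For illustration, here is a minimal sketch (my own, not taken from the article; the cgroup file paths are an assumption and differ between cgroup v1 and v2) of how the "host view" of RAM differs from the actual per-pod limit:

```python
import os

# What Python sees via the OS: the physical RAM of the host machine, not the pod limit.
host_ram = os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")
print(f"RAM visible inside the pod: {host_ram / 1e9:.1f} GB")

# The real limit lives in the cgroup files (v1 and v2 paths differ; assumption, not tested on binder).
for path in ("/sys/fs/cgroup/memory/memory.limit_in_bytes", "/sys/fs/cgroup/memory.max"):
    try:
        with open(path) as f:
            print(f"cgroup limit from {path}: {f.read().strip()}")
    except FileNotFoundError:
        pass
```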

One solution, which I found in a dask example notebook, could be to explicitly set the memory_limit for dask like this

from dask.distributed import Client
client = Client(n_workers=1, threads_per_worker=4, processes=True, memory_limit='2GB')

It also makes sense to set n_workers=1 because we only have 1 CPU available in the binder pod, but dask might try to use all CPUs that are visible on the server hosting the pod. Using more workers than available CPUs could slow down calculations.
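As a quick sanity check (a sketch of my own, not tested on binder), one could confirm that these limits actually reach the dask workers before running the heavy cells:

```python
from dask.distributed import Client

client = Client(n_workers=1, threads_per_worker=4, processes=True, memory_limit="2GB")

# Each worker entry should report the configured thread count and memory limit.
for addr, info in client.scheduler_info()["workers"].items():
    print(addr, "nthreads:", info["nthreads"], "memory_limit:", info["memory_limit"])

client.close()
```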

cchwala commented 1 year ago

For the record: If I recall correctly (edit: I did, and I updated the link to the relevant comment https://github.com/OpenSenseAction/OPENSENSE_sandbox/pull/37#issuecomment-1290724197 in the intro text of this issue), the OOM kills of the kernel also happened when plotting several time series with matplotlib. Hence, a solution that only limits memory usage of dask is not sufficient.

cchwala commented 1 year ago

Maybe a variation of this solution to limit RAM usage of a Python program could help.

import resource

def set_memory_limit(max_mem_GB=2, soft_hard_fraction=0.9):
    hard_limit = max_mem_GB * 1e9  # RLIMIT_AS is specified in bytes, hence GB * 1e9
    soft_limit = hard_limit * soft_hard_fraction
    resource.setrlimit(resource.RLIMIT_AS, (soft_limit, hard_limit))

(not tested locally, just typed here, so there might be typos)

cchwala commented 1 year ago

The code above required casting soft_limit and hard_limit to int. But even then, it did not work as expected.

This is the variation of the above code that I used

import resource

def set_memory_limit(max_mem_GB=2):#, soft_hard_fraction=0.9):
    soft, hard = resource.getrlimit(resource.RLIMIT_AS)
    soft = int(max_mem_GB * 1e9)
    #resource.setrlimit(resource.RLIMIT_AS, (int(soft_limit), int(hard_limit)))
    resource.setrlimit(resource.RLIMIT_AS, (soft, hard))

After calling

set_memory_limit()

checking the limit gives

resource.getrlimit(resource.RLIMIT_AS)
(2000000000, -1)

But the only result is that when I run, e.g., ds = oddt.tranform_fencl_2021_Eband_data('data/fencl_2021_Eband_data/Dataset_1.0.0.zip') in the data example notebook, I get a long error message which contains, e.g.,

MemoryError: Unable to allocate output buffer.

Full trace is here:

```python
Traceback (most recent call last):
  File "/srv/conda/envs/notebook/lib/python3.10/site-packages/IPython/core/interactiveshell.py", line 3508, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "/tmp/ipykernel_1785/1906702126.py", line 1, in <module>
  File "/home/jovyan/OPENSENSE_sandbox/notebooks/opensense_data_downloader_and_transformer.py", line 53, in tranform_fencl_2021_Eband_data
    df_data = pd.read_csv(
  File "/srv/conda/envs/notebook/lib/python3.10/site-packages/pandas/io/parsers/readers.py", line 912, in read_csv
  File "/srv/conda/envs/notebook/lib/python3.10/site-packages/pandas/io/parsers/readers.py", line 577, in _read
    parser = TextFileReader(filepath_or_buffer, **kwds)
  File "/srv/conda/envs/notebook/lib/python3.10/site-packages/pandas/io/parsers/readers.py", line 1407, in __init__
    self._engine = self._make_engine(f, self.engine)
  File "/srv/conda/envs/notebook/lib/python3.10/site-packages/pandas/io/parsers/readers.py", line 1679, in _make_engine
    return mapping[engine](f, **self.options)
  File "/srv/conda/envs/notebook/lib/python3.10/site-packages/pandas/io/parsers/c_parser_wrapper.py", line 93, in __init__
    self._reader = parsers.TextReader(src, **kwds)
  File "pandas/_libs/parsers.pyx", line 548, in pandas._libs.parsers.TextReader.__cinit__
    header, table_width, unnamed_cols = self._get_header(prelim_header)
  File "pandas/_libs/parsers.pyx", line 637, in pandas._libs.parsers.TextReader._get_header
    self._tokenize_rows(hr + 2)
  File "pandas/_libs/parsers.pyx", line 848, in pandas._libs.parsers.TextReader._tokenize_rows
    self._check_tokenize_status(status)
  File "pandas/_libs/parsers.pyx", line 859, in pandas._libs.parsers.TextReader._check_tokenize_status
    raise_parser_error("Error tokenizing data", self.parser)
  File "pandas/_libs/parsers.pyx", line 2014, in pandas._libs.parsers.raise_parser_error
    raise exc_type(old_exc)
MemoryError: Unable to allocate output buffer.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/srv/conda/envs/notebook/lib/python3.10/site-packages/IPython/core/interactiveshell.py", line 2105, in showtraceback
    stb = self.InteractiveTB.structured_traceback(
  File "/srv/conda/envs/notebook/lib/python3.10/site-packages/IPython/core/ultratb.py", line 1428, in structured_traceback
  File "/srv/conda/envs/notebook/lib/python3.10/site-packages/IPython/core/ultratb.py", line 1319, in structured_traceback
    return VerboseTB.structured_traceback(
  File "/srv/conda/envs/notebook/lib/python3.10/site-packages/IPython/core/ultratb.py", line 1172, in structured_traceback
    formatted_exception = self.format_exception_as_a_whole(etype, evalue, etb, number_of_lines_of_context,
  File "/srv/conda/envs/notebook/lib/python3.10/site-packages/IPython/core/ultratb.py", line 1062, in format_exception_as_a_whole
    self.get_records(etb, number_of_lines_of_context, tb_offset) if etb else []
  File "/srv/conda/envs/notebook/lib/python3.10/site-packages/IPython/core/ultratb.py", line 1130, in get_records
    mod = inspect.getmodule(cf.tb_frame)
  File "/srv/conda/envs/notebook/lib/python3.10/inspect.py", line 874, in getmodule
    _filesbymodname[modname] = f
MemoryError

Traceback (most recent call last):
  File "/srv/conda/envs/notebook/lib/python3.10/site-packages/IPython/core/events.py", line 93, in trigger
    func(*args, **kwargs)
  File "/srv/conda/envs/notebook/lib/python3.10/site-packages/IPython/extensions/autoreload.py", line 713, in post_execute_hook
    newly_loaded_modules = set(sys.modules) - self.loaded_modules
MemoryError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/srv/conda/envs/notebook/lib/python3.10/site-packages/IPython/core/interactiveshell.py", line 2105, in showtraceback
    stb = self.InteractiveTB.structured_traceback(
  File "/srv/conda/envs/notebook/lib/python3.10/site-packages/IPython/core/ultratb.py", line 1428, in structured_traceback
  File "/srv/conda/envs/notebook/lib/python3.10/site-packages/IPython/core/ultratb.py", line 1319, in structured_traceback
    return VerboseTB.structured_traceback(
  File "/srv/conda/envs/notebook/lib/python3.10/site-packages/IPython/core/ultratb.py", line 1172, in structured_traceback
    formatted_exception = self.format_exception_as_a_whole(etype, evalue, etb, number_of_lines_of_context,
  File "/srv/conda/envs/notebook/lib/python3.10/site-packages/IPython/core/ultratb.py", line 1062, in format_exception_as_a_whole
    self.get_records(etb, number_of_lines_of_context, tb_offset) if etb else []
  File "/srv/conda/envs/notebook/lib/python3.10/site-packages/IPython/core/ultratb.py", line 1159, in get_records
    res = list(stack_data.FrameInfo.stack_data(etb, options=options))[tb_offset:]
  File "/srv/conda/envs/notebook/lib/python3.10/site-packages/stack_data/core.py", line 597, in stack_data
    yield from collapse_repeated(
  File "/srv/conda/envs/notebook/lib/python3.10/site-packages/stack_data/utils.py", line 83, in collapse_repeated
    yield from map(mapper, original_group)
  File "/srv/conda/envs/notebook/lib/python3.10/site-packages/stack_data/core.py", line 587, in mapper
    return cls(f, options)
  File "/srv/conda/envs/notebook/lib/python3.10/site-packages/stack_data/core.py", line 551, in __init__
    self.executing = Source.executing(frame_or_tb)
  File "/srv/conda/envs/notebook/lib/python3.10/site-packages/executing/executing.py", line 359, in executing
    source = cls.for_frame(frame)
  File "/srv/conda/envs/notebook/lib/python3.10/site-packages/executing/executing.py", line 277, in for_frame
    return cls.for_filename(frame.f_code.co_filename, frame.f_globals or {}, use_cache)
  File "/srv/conda/envs/notebook/lib/python3.10/site-packages/executing/executing.py", line 306, in for_filename
    return cls._for_filename_and_lines(filename, tuple(lines))
  File "/srv/conda/envs/notebook/lib/python3.10/site-packages/executing/executing.py", line 317, in _for_filename_and_lines
    result = source_cache[(filename, lines)] = cls(filename, lines)
  File "/srv/conda/envs/notebook/lib/python3.10/site-packages/executing/executing.py", line 257, in __init__
    self.tree = ast.parse(ast_text, filename=filename)
  File "/srv/conda/envs/notebook/lib/python3.10/ast.py", line 50, in parse
    return compile(source, filename, mode, flags,
MemoryError
```
cchwala commented 1 year ago

See this related answer on Stack Overflow: "What happens when python reaches it's soft memory limit?"
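For reference, a minimal sketch of the behaviour described there (my own example, not from the linked answer; the 2 GB / 4 GB numbers are arbitrary): once the soft RLIMIT_AS limit is reached, allocations fail with a MemoryError inside Python instead of the pod being OOM-killed, which matches what I saw above.

```python
import resource

# Lower only the soft address-space limit to ~2 GB, keeping the current hard limit.
soft, hard = resource.getrlimit(resource.RLIMIT_AS)
resource.setrlimit(resource.RLIMIT_AS, (int(2e9), hard))

try:
    buf = bytearray(int(4e9))  # try to allocate ~4 GB, well above the soft limit
except MemoryError:
    print("allocation failed with MemoryError, the kernel stays alive")
```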