BlueBrain / libsonata

A python and C++ interface to the SONATA format
https://libsonata.readthedocs.io/en/stable/
GNU Lesser General Public License v3.0
CircuitConfig.from_file may fail with generic RuntimeError and cause OOM #260

Closed GianlucaFicarelli closed 1 year ago

GianlucaFicarelli commented 1 year ago

When a large (4.5 GB) file that is not a circuit config is passed to CircuitConfig.from_file by mistake, as in this example:

import libsonata

input_path = "/gpfs/bbp.cscs.ch/project/proj134/workflow-outputs/genrich-test/morphologyAssignmentConfig/root/nodes.h5"
cc = libsonata.CircuitConfig.from_file(input_path)

The result is:

Traceback (most recent call last):
  File "./tmp/try_libsonata.py", line 4, in <module>
    cc = libsonata.CircuitConfig.from_file(input_path)
RuntimeError
12.17user 5.02system 0:17.29elapsed 99%CPU (0avgtext+0avgdata 9285064maxresident)k
0inputs+0outputs (0major+18023minor)pagefaults 0swaps

In other cases of invalid JSON, the error is more descriptive, for example:

RuntimeError: [json.exception.parse_error.101] parse error at line 1, column 1: syntax error while parsing value - invalid literal; last read: 'm'

Possible issues:

1uc commented 1 year ago

Regarding "the file is fully read in memory and it may cause an OOM":

When I looked at it, this was most likely the cause. See https://github.com/BlueBrain/libsonata/blob/eb794f802e8f4dcfe8a36e56df828f2e08059533/src/config.cpp#L802

mgeplf commented 1 year ago

I have a fix in progress; I shouldn't be reading the full contents of the file :)

mgeplf commented 1 year ago

This is fixed by #261
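
Independently of the library-side fix, a caller can guard against passing the wrong file. The following is only a minimal sketch: `looks_like_json_config`, `MAX_CONFIG_BYTES`, and the 10 MiB and 64-byte limits are hypothetical names and values chosen for illustration, not part of libsonata. It checks the file size and peeks at the first few bytes instead of reading the whole file.

```python
import os

# Hypothetical helper, not part of libsonata: a cheap sanity check to run
# before handing a path to libsonata.CircuitConfig.from_file.
MAX_CONFIG_BYTES = 10 * 1024 * 1024  # assumed upper bound for a JSON config


def looks_like_json_config(path, max_bytes=MAX_CONFIG_BYTES, probe_bytes=64):
    """Return True if `path` is small enough and starts like a JSON object."""
    # Reject oversized files without opening them.
    if os.path.getsize(path) > max_bytes:
        return False
    # Read only a short prefix, never the full contents.
    with open(path, "rb") as f:
        head = f.read(probe_bytes)
    # Skip an optional UTF-8 BOM, then leading whitespace.
    if head.startswith(b"\xef\xbb\xbf"):
        head = head[3:]
    # A SONATA circuit config is a JSON object, so it should start with '{'.
    # An HDF5 file starts with the bytes b"\x89HDF", so it is rejected here.
    return head.lstrip().startswith(b"{")
```

A caller would check `looks_like_json_config(input_path)` first and call `libsonata.CircuitConfig.from_file(input_path)` only when it returns True. A config with more than `probe_bytes` of leading whitespace would be rejected by this sketch, which is an accepted limitation of peeking at a fixed prefix.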