fsspec / kerchunk

Cloud-friendly access to archival data
https://fsspec.github.io/kerchunk/
MIT License

Memory leak in `scan_grib`? #378

Closed TomAugspurger closed 12 months ago

TomAugspurger commented 12 months ago

I'm trying to track down what looks like a memory leak in a process using kerchunk. After about 26,000 calls to `scan_grib`, memory usage in the process had grown from ~150 MB to ~2800 MB.

I see that eccodes has a `codes_release` function. Does anyone more familiar with GRIB (@dcherian maybe?) know whether we should be calling that? My initial tests are just segfaulting :/

release
Fatal Python error: Segmentation fault

Current thread 0x00007fac44505740 (most recent call first):
  File "/home/taugspurger/mambaforge2/envs/ecmwf-forecast/lib/python3.11/site-packages/gribapi/gribapi.py", line 472 in grib_release
  File "/home/taugspurger/mambaforge2/envs/ecmwf-forecast/lib/python3.11/site-packages/cfgrib/messages.py", line 124 in __del__
  File "/home/taugspurger/src/stactools-packages/ecmwf-forecast/src/stactools/ecmwf_forecast/stac.py", line 382 in _create_item_from_parts
  File "/home/taugspurger/src/stactools-packages/ecmwf-forecast/src/stactools/ecmwf_forecast/stac.py", line 304 in create_item
  File "/home/taugspurger/src/stactools-packages/ecmwf-forecast/test.py", line 19 in <module>

In case it matters, I'm using this as an example, using this branch for stactools.ecmwf_forecast:

```python
import urllib.request
import pathlib
from stactools.ecmwf_forecast import stac
import psutil

proc = psutil.Process()
href = 'https://ai4edataeuwest.blob.core.windows.net/ecmwf/20231002/00z/0p4-beta/wave/20231002000000-0h-wave-fc.grib2'
index = 'https://ai4edataeuwest.blob.core.windows.net/ecmwf/20231002/00z/0p4-beta/wave/20231002000000-0h-wave-fc.grib2'
p = pathlib.Path("ecmwf/20231002/00z/0p4-beta/wave/20231002000000-0h-wave-fc.grib2")
p.parent.mkdir(exist_ok=True, parents=True)
filename, _ = urllib.request.urlretrieve(href, filename=p)
# idx_filename = urllib.request.urlretrieve(index)

for i in range(1000):
    _ = stac.create_item([href, index], split_by_step=True)
    if (i % 10) == 0:
        # Print the resident set size in MiB every 10 iterations
        print(i, proc.memory_info().rss / (1024 * 1024))
```
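The RSS numbers printed by the script above include native (C-level) allocations, so they can't tell a leak in Python objects apart from one inside eccodes. A rough way to narrow it down is to also track Python-level allocations with the standard library's `tracemalloc`. This is only a sketch: `python_heap_growth`, `leaky` and `clean` are hypothetical names made up for illustration, and the two stand-in functions take the place of a real call such as `scan_grib`.

```python
import gc
import tracemalloc


def python_heap_growth(func, n_calls=100):
    """Return the net bytes of Python-level allocations kept alive after n_calls calls."""
    gc.collect()
    tracemalloc.start()
    try:
        before, _ = tracemalloc.get_traced_memory()
        for _ in range(n_calls):
            func()
        gc.collect()
        after, _ = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return after - before


# Hypothetical stand-ins for the function under suspicion.
_cache = []


def leaky():
    # Keeps a reference to every buffer, so the memory is never freed.
    _cache.append(bytearray(10_000))


def clean():
    # The buffer is dropped as soon as the call returns.
    bytearray(10_000)


if __name__ == "__main__":
    print("leaky:", python_heap_growth(leaky, 50), "bytes retained")
    print("clean:", python_heap_growth(clean, 50), "bytes retained")
```

If RSS keeps climbing while this number stays flat, the growth is most likely in native memory that `tracemalloc` cannot see, which would point at the C library or its Python bindings rather than at kerchunk's own Python objects.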
TomAugspurger commented 12 months ago

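Hmm, it seems like `cfgrib.Message` might already free the memory when it goes out of scope, by calling `codes_release` (see https://github.com/ecmwf/eccodes-python/blob/88ce860383f60afeb34119a31886d5a2e684d767/eccodes/highlevel/message.py#L28). So we shouldn't need to call it ourselves, and we can't either: releasing the same handle a second time would presumably explain the segfault in `__del__` above.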
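To illustrate why an extra manual release is a problem when the wrapper already releases in `__del__`, here is a minimal pure-Python sketch of the ownership pattern. Nothing here is the eccodes or cfgrib API: `Handle`, `release` and `Message` are hypothetical stand-ins, and the stand-in `release` raises an exception where a native library might crash instead.

```python
class Handle:
    """Hypothetical stand-in for a native message handle."""

    def __init__(self):
        self.released = False


def release(handle):
    """Stand-in for a native release call; a real one may segfault rather than raise."""
    if handle.released:
        raise RuntimeError("handle released twice")
    handle.released = True


class Message:
    """Owns one handle and guarantees it is released exactly once."""

    def __init__(self):
        self._handle = Handle()

    def close(self):
        # Detach the handle first so a second close() (or __del__) is a no-op.
        handle = getattr(self, "_handle", None)
        self._handle = None
        if handle is not None:
            release(handle)

    def __del__(self):
        self.close()


if __name__ == "__main__":
    msg = Message()
    raw = msg._handle
    del msg  # the owner releases the handle when it is destroyed
    try:
        release(raw)  # a manual release on top of that is a double release
    except RuntimeError as exc:
        print("error:", exc)
```

The point of the sketch is that only the owning object should release the handle; code that holds the wrapper just lets it go out of scope.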

TomAugspurger commented 12 months ago

Closing this until I have a better idea what's going on :/