cogeotiff / rio-tiler

User friendly Rasterio plugin to read raster datasets.
https://cogeotiff.github.io/rio-tiler/
BSD 3-Clause "New" or "Revised" License

is there a memory leak? #713

Open vincentsarago opened 4 months ago

vincentsarago commented 4 months ago

Dear users, I'm trying to debug a possible memory leak in rio-tiler. This subject has been brought up many times in the rasterio repo, and it seems that if there is a leak, it might be at the user-application level (e.g. rio-tiler). Before raising a new issue in rasterio, I would like to be sure that it's not in rio-tiler 😅

Where?

I'm out of ideas and clues about what is going on, so if someone wants to help, it would be really appreciated 🙏

Python script: memory_mosaic.py

```python
from rio_tiler.io import Reader
from rio_tiler.mosaic import mosaic_reader
from random import sample
import morecantile
import rasterio
from rio_tiler.models import ImageData
from memory_profiler import profile
from rio_tiler.mosaic.methods import LowestMethod

assets = [
    "tests/fixtures/mosaic_value_1.tif",
    "tests/fixtures/mosaic_value_2.tif",
    "tests/fixtures/mosaic_value_1.tif",
    "tests/fixtures/mosaic_value_2.tif",
]

tilematrixset = morecantile.tms.get("WebMercatorQuad")
w, s, e, n = [-76, 44.93, -71.33, 47.1]
minzoom = 7
maxzoom = 9

extrema = {}
for zoom in range(minzoom, maxzoom + 1):
    ul_tile = tilematrixset.tile(w, n, zoom)
    lr_tile = tilematrixset.tile(e, s, zoom)
    extrema[zoom] = {
        "x": {"min": ul_tile.x, "max": lr_tile.x + 1},
        "y": {"min": ul_tile.y, "max": lr_tile.y + 1},
    }


def _reader(path: str, x: int, y: int, z: int, **kwargs):
    with Reader(path) as src:
        return src.tile(x, y, z, **kwargs)


@profile()
def mosaic(x, y, z):
    # while True:
    im, _ = mosaic_reader(
        assets,
        _reader,
        x,
        y,
        z,
        threads=1,
        tilesize=4000,
        pixel_selection=LowestMethod,
    )
    im = None

    return True


@profile()
def image_from_list(x, y, z):
    imgs = []
    for asset in assets:
        try:
            im = _reader(asset, x, y, z, tilesize=4000)
            imgs.append(im)
        except:
            pass

    img = ImageData.create_from_list(imgs)
    imgs = None
    img = None

    return True


@profile()
def images(x, y, z):
    for asset in assets:
        try:
            im = _reader(asset, x, y, z, tilesize=4000)
            im = None
        except:
            pass

    return True

@profile()
def image(x, y, z):
    with Reader(assets[0]) as src:
        im = src.tile(x, y, z, tilesize=4000)
        im = None

    return True


if __name__ == '__main__':
    z = sample(range(minzoom, maxzoom + 1), 1)[0]
    x = sample(range(extrema[z]["x"]["min"], extrema[z]["x"]["max"]), 1)[0]
    y = sample(range(extrema[z]["y"]["min"], extrema[z]["y"]["max"]), 1)[0]

    _ = image(x, y, z)
    _ = images(x, y, z)
    _ = mosaic(x, y, z)
    _ = image_from_list(x, y, z)
```

results

python -m memory_profiler memory_mosaic.py
Filename: memory_mosaic.py

Line #    Mem usage    Increment  Occurrences   Line Contents
=============================================================
    83    191.2 MiB    191.2 MiB           1   @profile()
    84                                         def image(x, y, z):
    85    198.2 MiB      7.0 MiB           1       with Reader(assets[0]) as src:
    86    533.8 MiB    335.6 MiB           1           im = src.tile(x, y, z, tilesize=4000)
    87    533.8 MiB      0.0 MiB           1           im = None
    88                                         
    89    533.8 MiB      0.0 MiB           1       return True

Filename: memory_mosaic.py

Line #    Mem usage    Increment  Occurrences   Line Contents
=============================================================
    72    533.8 MiB    533.8 MiB           1   @profile()
    73                                         def images(x, y, z):
    74    581.2 MiB     -2.1 MiB           5       for asset in assets:
    75    581.2 MiB      0.0 MiB           4           try:
    76    581.2 MiB     45.2 MiB           4               im = _reader(asset, x, y, z, tilesize=4000)
    77    581.2 MiB     -2.1 MiB           4               im = None
    78                                                 except:
    79                                                     pass
    80                                         
    81    579.1 MiB     -2.1 MiB           1       return True

Filename: memory_mosaic.py

Line #    Mem usage    Increment  Occurrences   Line Contents
=============================================================
    37    579.1 MiB    579.1 MiB           1   @profile()
    38                                         def mosaic(x, y, z):
    39                                             # while True:
    40    856.8 MiB    277.8 MiB           2       im, _ = mosaic_reader(
    41    579.1 MiB      0.0 MiB           1           assets,
    42    579.1 MiB      0.0 MiB           1           _reader,
    43    579.1 MiB      0.0 MiB           1           x,
    44    579.1 MiB      0.0 MiB           1           y,
    45    579.1 MiB      0.0 MiB           1           z,
    46    579.1 MiB      0.0 MiB           1           threads=1,
    47    579.1 MiB      0.0 MiB           1           tilesize=4000,
    48    579.1 MiB      0.0 MiB           1           pixel_selection=LowestMethod,
    49                                             )
    50    856.8 MiB      0.0 MiB           1       im = None
    51                                         
    52    856.8 MiB      0.0 MiB           1       return True

Filename: memory_mosaic.py

Line #    Mem usage    Increment  Occurrences   Line Contents
=============================================================
    55    857.0 MiB    857.0 MiB           1   @profile()
    56                                         def image_from_list(x, y, z):
    57    857.0 MiB      0.0 MiB           1       imgs = []
    58   1042.9 MiB      0.0 MiB           5       for asset in assets:
    59    912.4 MiB      0.0 MiB           4           try:
    60   1042.9 MiB    185.9 MiB           4               im = _reader(asset, x, y, z, tilesize=4000)
    61   1042.9 MiB      0.0 MiB           4               imgs.append(im)
    62                                                 except:
    63                                                     pass
    64                                         
    65   1498.0 MiB    455.2 MiB           1       img = ImageData.create_from_list(imgs)
    66   1498.0 MiB      0.0 MiB           1       imgs = None
    67    948.7 MiB   -549.3 MiB           1       img = None
    68                                         
    69    948.7 MiB      0.0 MiB           1       return True
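
In the output above, memory stays at 533.8 MiB after `im = None` in `image()`, but a single profiled call cannot tell a real leak apart from memory that an allocator or a cache (for example GDAL's block cache) keeps around for reuse. One possible way to separate the two is to call the same function many times and check whether the process's peak memory keeps climbing or levels off. The sketch below is only an illustration under stated assumptions: `track_peak_rss` and `peak_rss_mib` are hypothetical helper names (not part of rio-tiler or memory_profiler), and it relies on the standard-library `resource` module, which is Unix-only.

```python
import gc
import resource
import sys


def peak_rss_mib():
    """Return the peak resident set size of this process in MiB (Unix only)."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in bytes on macOS and in KiB on Linux.
    divisor = 1024 * 1024 if sys.platform == "darwin" else 1024
    return peak / divisor


def track_peak_rss(func, iterations=10):
    """Call ``func`` repeatedly and record the peak RSS after each call.

    Hypothetical helper: a peak that keeps growing across identical calls
    hints at a leak, while a peak that flattens after the first few calls
    hints at retained-but-reused memory.
    """
    samples = []
    for _ in range(iterations):
        func()
        gc.collect()  # rule out uncollected Python reference cycles
        samples.append(peak_rss_mib())
    return samples
```

With the script above, this could be used as `track_peak_rss(lambda: image(x, y, z))`. Because the value is a peak, it never decreases, so only the shape of the curve (still growing versus flat) is informative.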

The issue was reported by a titiler-pgstac application user, who saw memory usage increase like this 👇

[image]

wildintellect commented 4 months ago

We decided to see if Valgrind could provide something more useful. @chuckwondo ran Scalene, and it showed that the memory usage is almost entirely outside Python itself.

valgrind --tool=memcheck --leak-check=full --suppressions=valgrind-python.supp --log-file=minimal.valgrind.log python memory_mosaic.py

minimal.valgrind.log
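
As a lighter-weight cross-check of the Scalene observation, the standard-library `tracemalloc` module reports only the memory requested through Python's own allocators. The sketch below is an assumption-laden illustration, not something that was run in this thread: `python_traced_peak_mib` is a hypothetical helper name, and memory that a native library obtains through its own allocator will not show up in its result.

```python
import tracemalloc


def python_traced_peak_mib(func, *args, **kwargs):
    """Run ``func`` once and return the peak memory, in MiB, traced by tracemalloc.

    Hypothetical helper: only allocations made through Python's memory
    allocators are counted, so native allocations are invisible here.
    """
    tracemalloc.start()
    try:
        func(*args, **kwargs)
        _current, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return peak / (1024 * 1024)
```

For example, `python_traced_peak_mib(image, x, y, z)` could be compared with the 335.6 MiB increment that memory_profiler reports for `src.tile(...)`. A traced peak far below that figure would be consistent with most of the memory being allocated outside Python.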

vincentsarago commented 1 month ago

👀 https://github.com/rasterio/rasterio/issues/2932