pyproj4 / pyproj

Python interface to PROJ (cartographic projections and coordinate transformations library)
https://pyproj4.github.io/pyproj
MIT License

Small memory leak #1418

Closed: skinkie closed this issue 4 months ago

skinkie commented 4 months ago

Code Sample, a copy-pastable example if possible

import pyproj  # yes, it is this simple

Run this code under heaptrack and notice that the memory allocated by osgeo::proj::io::DatabaseContext::create is reported as leaked; it likely should be freed.

heaptrack python -c "import pyproj"
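As a coarse cross-check that does not require heaptrack, the growth in peak resident memory during an import can be measured from inside Python. This is only a sketch: the helper name `peak_rss_growth_kib` is made up for illustration, it relies on the Unix-only `resource` module, and it reports the import footprint rather than proving that anything is leaked.

```python
import importlib
import resource


def peak_rss_growth_kib(module_name):
    """Return how much the process's peak RSS grew while importing a module.

    On Linux, ru_maxrss is reported in KiB. The value is a high-water mark,
    so the result is never negative and is 0 if the module was already loaded
    or the import fit within memory the process had already used.
    """
    before = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    importlib.import_module(module_name)  # triggers module-level initialization
    after = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    return after - before
```

Calling `peak_rss_growth_kib("pyproj")` in a fresh interpreter gives a rough figure to compare against the heaptrack report.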

Problem description

I am currently evaluating the performance of a Python program with respect to memory leaks. I noticed that pyproj contributes as well, although the amount is small: about 2.6 MB.

Expected Output

No memory should remain allocated at the end of the program. One may also wonder why the DatabaseContext is created at import time, before any call is made. I think it would be more elegant to defer creation until the first call that requires it.
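The deferred creation suggested above could look roughly like the following generic pattern. This is not pyproj's actual code: `ExpensiveContext`, `get_context`, and `transform` are hypothetical names standing in for a costly resource and an API call that needs it.

```python
import functools


class ExpensiveContext:
    """Hypothetical stand-in for a costly resource such as a database context."""

    created = 0  # number of instances built so far

    def __init__(self):
        ExpensiveContext.created += 1


@functools.lru_cache(maxsize=None)
def get_context():
    """Build the context on the first call and return the cached one afterwards."""
    return ExpensiveContext()


def transform(x):
    """Hypothetical API call that needs the context."""
    get_context()  # the first caller pays the setup cost
    return x
```

With this layout, importing the module allocates nothing for the context; the cost moves to the first call of `transform`.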

Environment Information

pyproj 3.6.1 / PROJ 9.4.0 / Python 3.12.4 / Linux-6.9.9-arch1-1-x86_64-with-glibc2.39

Installation method

pacman

djhoese commented 4 months ago

I'm not sure how I feel about calling this a "leak". If I understand it correctly, this is pyproj initializing some state (yes, a database connection, but still). I'm surprised by the size, but who knows what SQLite is doing to pre-load the PROJ database. Creating the database connection/context on first use might improve import time, but it would make execution time inconsistent. That makes performance checks and benchmarking more difficult, because you'd have to "warm up" pyproj/PROJ before running anything.

That said, I don't have a good understanding of low-level pyproj; this is just my initial feeling about this request.

snowman2 commented 4 months ago

#1419 addresses this.

skinkie commented 4 months ago

@snowman2 thanks for the effort.