pavlis opened this issue 4 years ago (status: Open)
If you are able to reclaim the memory with the `del` call, there shouldn't be a memory leak. You might just be seeing the behavior of Python's garbage collection. So, I am not sure it is necessary to actively call `del`, as that call is rarely seen in most Python code. Also, if there were a real memory leak, calling `del` would not fix it anyway.
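Just to illustrate what I mean about the garbage collector: CPython frees a big object through reference counting as soon as the last name bound to it goes away, and `gc.collect()` only matters when reference cycles are involved. A quick demonstration (the buffer size here is arbitrary):

```python
import gc
import sys

big = bytearray(200_000_000)   # ~200 MB buffer
print(sys.getrefcount(big))    # 2: the name `big` plus the temporary argument to getrefcount

del big                        # drops the last reference; the memory is released right here

gc.collect()                   # only needed to break reference cycles; a no-op for plain data like this
```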
I agree that we should be more careful about potential memory leaks. The first thing to make sure of is that there are no memory leaks on the C++ side. That should be easier to address as long as we have a suite of test cases. Detecting memory leaks from the Python side will be trickier. I don't know of any tools specialized for that purpose, but any standard memory debugging tool should work: there is not much difference between attaching a debugger to a pure C++ program and attaching it to a Python program that calls a C++ library under the hood. I just did a test with valgrind, and it behaves normally with our test_ccore code, so I assume we can use it to detect memory leaks. On the run I did, valgrind reported nothing in its "definitely lost" or "indirectly lost" categories, which is a good sign.
There are plenty of ways to test for memory leaks on the pure C++ side, from valgrind to more sophisticated debugging tools. I actually don't suspect that is a large issue, but there could be some small leaks. Having written most of the C++ code to this point, I will say that I learned some basic habits that make memory leaks rare to nonexistent: avoid raw pointers whenever possible, and avoid error paths that leave files or other resources unclosed.
What I'm more concerned about here is the impedance mismatch between memory managed by C++ and memory managed by Python, and whether our pybind11 bindings handle that correctly. As I am sure you know, pybind11 has features for handling this issue, and the documentation discusses it at length. The safest thing is to always make a deep copy going to and from the Python side, but I don't think that always happens, and sometimes you really don't want it to happen for efficiency. The point is that there are likely many subtle ways to create memory leaks or mysterious behavior in ccore functions because of the memory-management mismatch pybind11 has to mediate. We have to keep this in mind as we develop a comprehensive test suite.
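One crude way to catch that kind of problem in a test suite, no matter which side of the binding is leaking, is to call a wrapped function many times and watch the process footprint. This is only a sketch - the call count, the tolerance, and which ccore function to exercise are all left open:

```python
import resource

def peak_rss_mb():
    # ru_maxrss is reported in kilobytes on Linux (bytes on macOS)
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss / 1024.0

def assert_no_memory_growth(func, ncalls=20, tolerance_mb=50.0):
    """Call func repeatedly and fail if the peak footprint keeps climbing."""
    func()                      # warm-up call so one-time allocations are not counted
    before = peak_rss_mb()
    for _ in range(ncalls):
        func()
    growth = peak_rss_mb() - before
    assert growth < tolerance_mb, "footprint grew by %.1f MB over %d calls" % (growth, ncalls)
```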
Yep, I share the same concern, especially for the pybind11 code I wrote that interacts with the raw C Python interface - it is pretty easy to screw up there. I believe those parts were implemented correctly, as I carefully tested the reference counters to make sure everything is managed properly by the garbage collector. The pure pybind11 code is of less concern to me: if there is a memory leak there, it should be considered a pybind11 bug, and we can reasonably assume it will be caught and resolved by the pybind11 users and developers. All we have to do is keep the package up to date. For now, we only need to take care of the lines that involve the raw C Python interface. I should think about how to include a test for memory leaks in test_ccore.py; that should be good enough for the current stage of development.
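A first version of that test could just watch reference counts across repeated calls - if any of the raw C API code mismatches Py_INCREF/Py_DECREF, the count of the object passed in will drift. Something like this, where `func` and `arg` are placeholders for whichever ccore call and argument we want to exercise:

```python
import sys

def assert_refcount_stable(func, arg, ncalls=100):
    # getrefcount adds one temporary reference of its own, but it does so
    # for both measurements, so any difference points at the wrapped call
    baseline = sys.getrefcount(arg)
    for _ in range(ncalls):
        func(arg)
    assert sys.getrefcount(arg) == baseline, "reference count of arg drifted"
```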
I knew this one was coming, and we finally encountered it. As any experienced reader knows, one of the dark sides of any C++ bindings to Python is memory management. The two languages treat that problem very differently, so all bindings have to cleanly separate C++ memory management from Python memory management.
I encountered a memory management problem in the code below that I eventually fixed, but it illuminated some issues we should be conscious of and work on as we progress in this project.
This block of code is a miniseed data file reader driven by documents created by indexing the file earlier. I'll just dump the full code here and then describe the problem I'm seeing:
I'm trying to eat up and process these rather large miniseed files I produced from obspy's downloader. Each file is about a gig in size. When I run this function in a simple loop like this:
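(In outline - the names other than load_one_ensemble are placeholders for the real index query and arguments:)

```python
for doc in ensemble_index:        # placeholder for the cursor of index documents
    ens = load_one_ensemble(doc)  # each call returns a TimeSeriesEnsemble of roughly a gigabyte
    # ... process ens ...
```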
I can watch it eat up memory with the system monitor because each chunk is so huge. It seems the large block of memory in `ens` is not being released. I am pretty sure this is happening because `ens` is just a pointer to a block of memory defined, in this case, by a TimeSeriesEnsemble returned by the load_one_ensemble function. I may not understand Python garbage collection correctly, but it doesn't work here quite the way I thought. After a minor change where I added
del ens
at the bottom of this loop, the memory problem went away. There are two points I think follow from this: