Closed rotemdan closed 6 years ago
So far (in my own build) I've applied the flag to the outputs of the methods:
- compute_from_file and compute_from_data in cmfcc_py.c (cmfcc submodule)
- read_audio_data in cwave_py.c (cwave submodule)
And it seems to help. (These are all the calls to PyArray_SimpleNewFromData that I could find in the codebase)
There are also various calls to methods like:
- PyArray_SimpleNew
- PyArray_ContiguousFromAny
I'm not sure whether these require this modification.
This means I can only give a pull request for the ones I've found so far. I don't think I can cover everything without being more familiar with the code.
Thank you for your feedback and contribution.
I seem to remember that, when I developed the C extensions, I ran them through Valgrind, so I am a bit surprised by the appearance of this issue. On the other hand, I think almost all applications using aeneas run only a few executions before the host process dies, so the memory leak may simply never have surfaced.
Before merging your patch, I need to read a bit more about numpy memory management. I plan to do that in the upcoming weekend.
Best regards,
Alberto Pettarin
On 05/03/2018 05:38 PM, Rotem Dan wrote:
So far (in my own build) I've applied the flag to the outputs of the methods:
- |compute_from_file| and |compute_from_data| in |cmfcc_py.c| (|cmfcc| submodule)
- |read_audio_data| in |cwave_py.c| (|cwave| submodule)
And it seems to help. (These are all the calls to |PyArray_SimpleNewFromData| that I could find in the codebase)
There are also various calls to methods like:
- |PyArray_SimpleNew|
- |PyArray_ContiguousFromAny|
- ..
I'm not sure whether these require this modification.
This means I can only give a pull request for the ones I've found so far. I don't think I can cover everything without being more familiar with the code.
I merged your patch, thank you for your contribution.
Now I remember that I ran only the pure C code under Valgrind, not the Python wrappers, because of the many false positives that CPython generates when run under Valgrind (and the lack of time to actually craft a suitable white/black list).
The following causes the process' memory usage to grow indefinitely (calls to del and gc.collect() have no effect):

I believe the reason is that the underlying C arrays are not freed, as the OWNDATA flag is not explicitly enabled on the returned ndarray objects:

Based on what I read, applying:

on each of the returned ndarrays should fix the issue. Based on this Stack Overflow question, it does not seem possible to do this from Python (the flag is not writable); it has to be done within the C code itself (also: second Stack Overflow question).
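
As a hedged illustration of the ownership flag discussed above (this is not code from the issue or from aeneas), the Python sketch below shows how OWNDATA looks from the Python side. It assumes NumPy is installed. The bytearray is only a stand-in for a C buffer that would be wrapped with PyArray_SimpleNewFromData, and all variable names are made up for the example. On the C side, the NumPy C API call commonly used to set the flag is PyArray_ENABLEFLAGS(arr, NPY_ARRAY_OWNDATA), which is presumably what the patch applies.

```python
import numpy as np

# An array allocated by NumPy itself owns its buffer, so NumPy frees it.
owned = np.empty(4, dtype=np.float64)

# An array wrapping memory that NumPy did not allocate does not own it.
# The bytearray stands in for an externally allocated C buffer (assumption).
external = bytearray(4 * 8)
wrapped = np.frombuffer(external, dtype=np.float64)

print(owned.flags.owndata)    # True
print(wrapped.flags.owndata)  # False

# The flag is read-only from Python: trying to set it raises an exception.
try:
    wrapped.flags.owndata = True
    flag_is_writable = True
except Exception:
    flag_is_writable = False

print(flag_is_writable)       # False
```

Setting OWNDATA in C is only safe when the wrapped buffer was allocated in a way compatible with how NumPy releases array data (typically malloc), which is an assumption to verify against the C code in question.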
More info from the official numpy documentation: