Closed. reds-heig closed this 4 years ago.
About this PR: as observed in the benchmarking, too much memory is used on the Python side when the code calls the ripser implementation.

To give more details: if I measure the memory consumed only by ripser when computing the torus dataset, I obtain around 8 GB of RAM used at maximum. But if I do the same with our `ripser_interface` without the collapser, I run out of memory on a machine with 32 GB of RAM. With the collapser, around 20 GB is used at maximum during the computation.

After some investigation, this is because some data is kept alive:
- `X`: the input data
- `dm`: also kept alive in the context, but we don't require it for the computation inside of `ripser`
- `dperm2all`: should this value really be returned from the Python `ripser`? (we had a discussion on this with @ulupo)

We need to think of a way to manage this better, because it can become a limitation when working on "big" datasets. Off the top of my head, I can think of the following solutions:
- set these variables to `None` in order to allow the Python garbage collector to free the memory
- only free them when a flag such as `clean_memory=True` is set

I don't think there's a "best" way of achieving this, but let's discuss it.
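A minimal sketch of the second idea, combining both proposals. The function name, the `clean_memory` flag, and the brute-force distance computation are all hypothetical stand-ins for the real `ripser_interface` code:

```python
import gc
import numpy as np

def ripser_interface_sketch(X, clean_memory=True):
    # Hypothetical flag: when True, drop intermediate arrays so the
    # Python garbage collector can reclaim them. The distance matrix
    # below stands in for `dm` in the real interface.
    dm = np.sqrt(((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1))
    result = float(dm.max())  # stand-in for the call into the C++ bindings
    if clean_memory:
        dm = None     # release the only reference to the big intermediate
        gc.collect()  # optional: force an immediate collection
    return result
```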
@ulupo, do you think it could be a good idea to integrate into this PR a first version of a heuristic that automatically decides whether or not to use the collapser?
> After some investigation, this is because some data is kept alive:
> - `X`: the input data
> - `dm`: also kept alive in the context, but not required for the computation inside of `ripser`
> - `dperm2all`: should this value really be returned from the Python `ripser`? (we had a discussion on this with @ulupo)
I think this issue only affects the case in which `metric != 'precomputed'`. If `metric == 'precomputed'` and `n_perm=0` (the default), then both `dm` and `dperm2all` are simply references to `X` (no new data created -- but please check if I'm wrong here), and we can never delete `X` from memory, as it is user input which the user might need for other tasks.

So the question is what to do with `dm` and `dperm2all` when `metric != 'precomputed'`. Starting with `dm`: we probably don't have to worry in the sparse case (if the input is sparse, it is probably lightweight too), so we only need to worry about the dense case. We could run `del dm` immediately before the call to `DRFDM` in line 300. About `dperm2all`, I'm happy for it to be `None` when `n_perm = 0` for now (so you can just change line 240 to `dperm2all = None`).
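A sketch of both suggestions together. The function name and the `probe` weak reference are mine, and the sum stands in for the work done by `DRFDM`; the point is that on CPython, dropping the last reference to the dense intermediate frees it immediately:

```python
import weakref
import numpy as np

def compute_sketch(X, n_perm=0):
    # Dense case: dm is a large intermediate no longer needed once the
    # core computation has consumed it, mirroring the suggested
    # `del dm` just before the call into DRFDM.
    dm = np.abs(X[:, None] - X[None, :])
    probe = weakref.ref(dm)      # lets us observe when the array is freed
    total = float(dm.sum())      # stand-in for the heavy DRFDM call
    del dm                       # drop the reference before returning
    assert probe() is None       # on CPython, the intermediate is gone now
    dperm2all = None if n_perm == 0 else X  # suggested: None when n_perm == 0
    return total, dperm2all
```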
> @ulupo, do you think it could be a good idea to integrate into this PR a first version of a heuristic that automatically decides whether or not to use the collapser?
Sure, if you have one, why not. But this PR needs to be merged quite soon (by the end of next week at the very latest).
> I think this issue only affects the case in which `metric != 'precomputed'`. If `metric == 'precomputed'` and `n_perm=0` (the default), then both `dm` and `dperm2all` are simply references to `X` (no new data created -- but please check if I'm wrong here), and we can never delete `X` from memory, as it is user input which the user might need for other tasks.
You're absolutely right!
> So the question is what to do with `dm` and `dperm2all` when `metric != 'precomputed'`. Starting with `dm`: we probably don't have to worry in the sparse case (if the input is sparse, it is probably lightweight too), so we only need to worry about the dense case. We could run `del dm` immediately before the call to `DRFDM` in line 300. About `dperm2all`, I'm happy for it to be `None` when `n_perm = 0` for now (so you can just change line 240 to `dperm2all = None`).
I'll run some tests to see if it does free the memory.
> Sure, if you have one, why not. But this PR needs to be merged quite soon (by the end of next week at the very latest).
I'll try to make a detailed benchmark with the dataset o3_1024 (1024 points).
What I would like to see is whether we can observe something like a threshold, e.g. `dim == 2 && nb_points > 300`, above which we use the collapser. Does that make sense to you, or would you take a different approach?
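Such a heuristic could be as simple as the following sketch. The function name and the default thresholds are hypothetical; the real cutoffs would come from the benchmarking described above:

```python
def should_use_collapser(dim, nb_points, dim_threshold=2, points_threshold=300):
    # Hypothetical heuristic from the discussion: enable the edge
    # collapser only when both the homology dimension and the number
    # of points cross illustrative thresholds (dim == 2 && nb_points > 300).
    return dim >= dim_threshold and nb_points > points_threshold
```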
Well, right now I'm really too busy with other tasks, unfortunately.
IMO, we could already merge this PR, and hopefully in the not-too-distant future I would add the missing parts:
Julián
@MonkeyBreaker, I understand.
Concerning

> Manage better the memory in the Python part in some cases
If you like, I can merge this PR and then quickly open another with the changes I suggested in my previous messages. They should be easy to implement pretty much exactly as I said then.
Sure, it's a good idea!
Reference issues/PRs
None
Types of changes
Description
Benchmarking showed that with big datasets (e.g. a torus point cloud of 50,000 points), the copy performed when calling the bindings becomes costly in terms of performance (runtime and memory).
To prevent this, the bindings were updated to accept `numpy` matrices/arrays by reference.
Some clean-up and improvements were also done inside `ripser_interface`.
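The C++ side of the bindings is not shown here, but the Python-side contract can be sketched: an array that is already C-contiguous with the expected dtype can be forwarded without any copy, while other layouts force a conversion. The arrays and shapes below are illustrative only:

```python
import numpy as np

X = np.random.rand(1000, 3)                # float64, C-contiguous point cloud
X_passed = np.ascontiguousarray(X, dtype=np.float64)
assert X_passed is X                       # same object: no copy is made

F = np.asfortranarray(X)                   # Fortran order breaks the contract
F_passed = np.ascontiguousarray(F, dtype=np.float64)
assert F_passed is not F                   # here a conversion copy is unavoidable
```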
Screenshots (if appropriate)
Any other comments?
Checklist
- Ran `flake8` to check my Python changes.
- Ran `pytest` to check this on the Python tests.