actuarialopensource / benchmarks

Some performance tests for actuarial applications

Manual calculations for sizes of allocated objects in memory complexity experiments #49

Closed. MatthewCaseres closed this issue 11 months ago.

MatthewCaseres commented 11 months ago

The idea is that if GC makes measuring the memory usage of a process unreliable, we can do our best to add up the size of each object inside the program itself and report back something like the sum of (array elements) * (element size) for each array in a cache, as sketched below.
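For illustration, a minimal sketch of that manual accounting in Julia, assuming the arrays live in some cache dictionary (the cache name and contents here are hypothetical, not taken from the benchmarks):

```julia
# Rough sketch: sum (number of elements) * (element size) over every
# array stored in a hypothetical cache. This counts only the element
# data, not array headers or the Dict itself.
manual_bytes(cache) = sum(length(A) * sizeof(eltype(A)) for A in values(cache))

cache = Dict(:survival => rand(1200), :discount => rand(1200))
manual_bytes(cache)  # 2 * 1200 * 8 = 19200 bytes of array data
```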

Julia has Base.summarysize. Is this perhaps a stronger argument than measuring the allocations?
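For comparison, here is what calling Base.summarysize on the same hypothetical cache looks like. It walks everything reachable from the object, so it also includes the Dict, its keys, and the array headers rather than just the element bytes:

```julia
# Reports the total size of everything reachable from cache, so the
# result is somewhat larger than the 19200 bytes counted above.
Base.summarysize(cache)
```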

serenity4 commented 11 months ago

I'm not sure this is a good idea. Manually computing memory usage requires intrusive code changes and would probably be biased toward what we aim to compute. Base.summarysize is slightly imprecise: AFAIK it over-estimates the results a bit, and it wouldn't take intermediate objects into account unless we sprinkle it everywhere in the algorithms.

At least, measuring allocations gives a conservative upper bound, and we can look via the profiler at where memory gets allocated and for what. The strongest argument IMHO, though, is that it is simpler to do and less error-prone, even if we may get over-estimations. It can be a bit noisy, for sure, but estimating memory complexity is not an easy task for a memory-managed dynamic language. I think the data points we currently have should be enough. What do you think?
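To make the allocation-measurement option concrete, here is a hedged Julia sketch (the workload function is made up for illustration, not taken from the benchmarks). @allocated reports the total bytes allocated while evaluating an expression, intermediate temporaries included, which is why it behaves as a conservative upper bound:

```julia
# Hypothetical workload: each call allocates the rand vector, the
# broadcasted temporary, and the cumsum result; @allocated counts all of them.
workload(n) = cumsum(rand(n) .* 0.96)

workload(10)  # warm-up call so compilation-time allocations are not counted
bytes = @allocated workload(1_000_000)
println("allocated about $bytes bytes")
```

Running the same code under the profiler (or Profile.Allocs on recent Julia versions) can then show where those allocations happen, which matches the point above about inspecting allocations rather than instrumenting the code by hand.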

MatthewCaseres commented 11 months ago

I'm just trying to find an approach that I think has a good chance of working in Python, which is the motivation for this issue.

Thinking through what you said about the error-proneness of the approach I suggested, this issue can be closed. The topic is reopened in a ticket where I try to explain what more I want from this repo in terms of big-O analysis.