miranov25 / RootInteractive

Memory consumption optimizations #356

Closed pl0xz0rz closed 3 months ago

pl0xz0rz commented 3 months ago

Relates to #355

This should reduce the memory used for ND histogram bin caching when most, but not all, points are selected (32 bits saved per selected point per histogram). It also reduces the memory used for intersection filters by storing actual packed bitmasks instead of a 32-bit integer for each point.
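To illustrate the intersection-filter part, here is a minimal sketch of a packed bitmask, written in Python for brevity. It is not the code from this pull request: the helper names `pack_mask`, `is_selected` and `intersect` are hypothetical, and the two example filters are invented for the demonstration.

```python
def pack_mask(selected):
    """Pack a sequence of booleans into a bytearray, one bit per point."""
    selected = list(selected)
    packed = bytearray((len(selected) + 7) // 8)
    for i, flag in enumerate(selected):
        if flag:
            packed[i >> 3] |= 1 << (i & 7)  # byte i // 8, bit i % 8
    return packed


def is_selected(packed, i):
    """Return True if point i is set in the packed mask."""
    return bool(packed[i >> 3] & (1 << (i & 7)))


def intersect(a, b):
    """Intersect two packed masks of equal length with a bytewise AND."""
    return bytearray(x & y for x, y in zip(a, b))


n_points = 500_000  # same order of magnitude as the test in this PR
mask_a = pack_mask(i % 2 == 0 for i in range(n_points))  # hypothetical filter A
mask_b = pack_mask(i % 3 == 0 for i in range(n_points))  # hypothetical filter B
both = intersect(mask_a, mask_b)

bytes_as_int32 = 4 * n_points  # one 32-bit integer per point
bytes_packed = len(both)       # one bit per point
print(bytes_as_int32, bytes_packed)
```

With one bit per point instead of 32, each filter mask takes 1/32 of the memory, and the intersection of several filters reduces to a bitwise AND over the packed bytes.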

To be added: further optimizations of this kind to reduce memory consumption.

In the test with multi weights (5e5 points, 4 columns):

Before: 256 MB

After: 235 MB

After using a regular expression to remove unused variables from funCustom: 215 MB

Columns are no longer a bottleneck in this test, but they should still be a bottleneck in a realistic use case with 100+ columns.
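The regular-expression step mentioned above could look roughly like the following sketch. It is an assumption about the general idea, not the actual funCustom handling: the function `used_variables` and the column names in the usage lines are hypothetical.

```python
import re

# Matches identifier-like tokens such as "dEdx" or "weightA".
IDENTIFIER = re.compile(r"[A-Za-z_]\w*")


def used_variables(func_source, candidates):
    """Return the candidate column names that appear as identifiers in func_source."""
    tokens = set(IDENTIFIER.findall(func_source))
    return [name for name in candidates if name in tokens]


# Hypothetical custom function body and column list.
source = "return dEdx * weightA + 1"
columns = ["dEdx", "weightA", "weightB", "dE"]
needed = used_variables(source, columns)
print(needed)
```

Because the match is on whole identifiers, a column named `dE` is not mistaken for a use of `dEdx`. A purely textual scan like this also picks up names that only appear inside string literals or comments, so it can keep a column that is not really needed, and it cannot see names that are accessed dynamically.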

However, computing histograms is now slower. For 5e5 points: before 57 ms, after 100 ms.

miranov25 commented 3 months ago

Tests were OK, but in a realistic use case we see that all columns are cached, which leaves all of them expanded. @pl0xz0rz, see my realistic test: /lustre/alice/users/miranov/NOTES/alice-tpc-notes2/JIRA/ATO-650/perfScanSecITSW.html As a consequence, a file that is 260 MB when compressed grows to 2.8 GB of memory after reading and loading the custom function. In the console we can see that all columns were expanded, while only a subset should be used.

miranov25 commented 3 months ago

Hi @pl0xz0rz,

Thanks for the update!

I've been testing memory consumption in a realistic scenario: dEdx calibration. For the compressed file "perfdEdx.html" (161 MB, last modified June 24, 2024, 10:44 AM), Chrome reports a significant improvement – memory usage is down to 450 MB, compared to the previous ~1.5 GB for similar files.

Potential Memory Usage Report:

I believe we can combine this information with the recent caching and uncaching changes (addressing user-defined cache columns can be a separate task) to create a comprehensive memory usage report.

What do you think? Should I merge now, with further development to follow in the next pull request?

miranov25 commented 3 months ago

Merging. Further development for #355 and #358 will follow in the next pull request.