cositools / cosipy

The COSI high-level data analysis tools
Apache License 2.0
3 stars 16 forks source link

Refactored combine_unbinned_data to use memory-mapped files #190

Open krishnatejavedula opened 2 weeks ago

krishnatejavedula commented 2 weeks ago

Cloese issue #119

codecov[bot] commented 2 weeks ago

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 36.98%. Comparing base (65706f7) to head (a695d35).

:exclamation: Current head a695d35 differs from pull request most recent head 53cb5e0

Please upload reports for the commit 53cb5e0 to get more accurate results.

Files Coverage Δ
cosipy/data_io/UnBinnedData.py 95.37% <100.00%> (+0.17%) :arrow_up:
ckarwin commented 1 week ago

Hi @krishnatejavedula I looked over your changes and ran some test in order to profile the memory usage. I tested combining the crab+crab, as well as the crab+total_background from DC2. The memory usage vs. time are shown below:

memory_profile_1

memory_profile_2

These were obtained using memory_profiler, described here. As an additional sanity check, I also got the peak RAM usage directly from the linux command line (see method 4 here), which was consistent with the results shown here.

So it looks like your changes do not really improve things very much. Do you want to check this on your machine to make sure I'm not missing something? One possible thing to try is as follows:

  1. The purpose of the memory map is to utilize the disk memory instead of the RAM. But I don't think this is happening. Would it help to flush the memory maps, as described here, and then manually delete them from memory?

If the improvement is in fact minimal, as it appears, I think it would be better to not implement these changes. Specifically, the use of memory maps is more complicated than the previous method, and this has a higher risk of potentially causing problems down the road.