testdockerwcsim / XTriggerApplication

Fix memory leaks #55

Closed tdealtry closed 4 years ago

tdealtry commented 4 years ago

WCSim/WCSim#286 fixes a big memory leak in WCSimReader. This PR fixes a big memory leak in DataOut, hopefully making things stable enough to close #51. It also does a bit more fiddling to remove a couple of other leaks.

Using the same input file (generated after WCSim/WCSim#286: 10 events of 20 MeV electrons), the results are identical before and after the change.

After this, using htop to view the memory usage on my laptop, it goes up by about 2% over 10000 events (chaining the same file 1000 times), rather than 50%.
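As a side note, the htop check above can also be done from inside the process. The following is a minimal sketch, not part of this PR: `PeakRss` and `PeakRssGrowth` are hypothetical helper names, and the only real API used is the POSIX `getrusage` call. It reports how much the peak resident set size grew while a per-event callback ran.

```cpp
#include <sys/resource.h>  // getrusage (POSIX)

#include <cassert>

// Peak resident set size of the current process.
// Units are platform-dependent: kilobytes on Linux, bytes on macOS.
long PeakRss() {
  struct rusage usage {};
  const int rc = getrusage(RUSAGE_SELF, &usage);
  assert(rc == 0);  // getrusage returns 0 on success
  (void)rc;         // avoid an unused-variable warning when NDEBUG is set
  return usage.ru_maxrss;
}

// Hypothetical helper: calls process_event(i) for i in [0, n_events)
// and returns the growth in peak RSS over the whole loop.
// The peak never decreases, so the result is always >= 0.
template <typename F>
long PeakRssGrowth(F&& process_event, int n_events) {
  const long before = PeakRss();
  for (int i = 0; i < n_events; ++i) {
    process_event(i);
  }
  return PeakRss() - before;
}
```

Comparing the growth for, say, 1000 and 10000 events gives a rough leak indicator: a value that keeps scaling with the event count suggests memory is not being released between events.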

brichards64 commented 4 years ago

Does this mean there are still some minor leaks?

tdealtry commented 4 years ago

Potentially, maybe. I've been trying to find time to look more today. Will try again tomorrow... Also, @ast0815 gets a seg fault, so this is not ready to go yet.

brichards64 commented 4 years ago

The seg fault is probably related to the comment I made on the code: your TFiles are deleted when closed, so it's probably a double-free seg fault.

ast0815 commented 4 years ago

My segfault does not occur in the Finalise though:

[2]: ********************************************************
**** Executing toolchain 1 times ****
********************************************************

[3]: DEBUG: Event 409 of 28273
[3]: DEBUG: Current event is event 408 from WCSim file wcsim.root Tree offset is 0
[3]: DEBUG: 1st digit time before shifting: 3.50593e+07 ns
[3]: DEBUG: Digit 0 has T 0, Q 1.2007 on PMT 2414
[3]: DEBUG: Digit 1 has T 4.4, Q 0.988853 on PMT 528
[3]: DEBUG: Digit 2 has T 16.4, Q 1.85935 on PMT 2729
[3]: DEBUG: Digit 3 has T 24.8, Q 0.678261 on PMT 14954
[3]: DEBUG: Digit 4 has T 22, Q 0.838663 on PMT 8646
[3]: DEBUG: Digit 5 has T 29.6, Q 1.05024 on PMT 1709
[3]: DEBUG: Digit 6 has T 32.8, Q 1.03981 on PMT 1725
[3]: DEBUG: Digit 7 has T 30.4, Q 1.04993 on PMT 16040
[3]: DEBUG: Digit 8 has T 35.6, Q 1.39811 on PMT 16206
[3]: DEBUG: Digit 9 has T 49.6, Q 0.990435 on PMT 14321
[3]: DEBUG: Saved information on 995 digits
[3]: DEBUG: Preparing 1 ID samples
[4]: DEBUG: Sorting sample
[3]: DEBUG: Preparing 0 OD samples
[3]:  qqq Number of data samples 1
[3]: DEBUG: NHits::AlgNDigits(). Number of entries in input digit collection: 995
[2]: INFO: Found 2 NDigit trigger(s) from ID
[3]: DEBUG: DataOut::Execute Starting
[2]: INFO: Have 2 triggers to save times:
[2]: INFO:      [3.50609e+07 ns, 3.50622e+07 ns] 3.50613e+07 ns with type NDigits extra info 386
[2]: INFO:      [3.50633e+07 ns, 3.50646e+07 ns] 3.50637e+07 ns with type NDigits extra info 112

...

===========================================================
#5  0x00007f6b8b5adf14 in WCSimRootEventHeader::Set (this=0x10, i=408, r=0, d=35063670, s=2) at include/WCSimRootEvent.hh:170
#6  0x00007f6b8b5a9ab9 in WCSimRootTrigger::SetHeader (this=0x0, i=408, run=0, date=35063670, subevent=2) at src/WCSimRootEvent.cc:249
#7  0x00007f6b909a68de in DataOut::CreateSubEvents (this=0x1000520, WCSimEvent=0x1db4d70) at UserTools/Factory/../DataOut/DataOut.cpp:208
#8  0x00007f6b909a63b0 in DataOut::Execute (this=0x1000520) at UserTools/Factory/../DataOut/DataOut.cpp:136
#9  0x00007f6b90745fd2 in ToolChain::Execute (this=0x7ffde762dac0, repeates=1) at ToolDAQ/ToolDAQFramework/src/ToolChain/ToolChain.cpp:298
#10 0x00007f6b90743e7b in ToolChain::ToolChain (this=0x7ffde762dac0, configfile=<unreadable bytes>...<Address 0x7ffde7632000 out of bounds>) at ToolDAQ/ToolDAQFramework/src/ToolChain/ToolChain.cpp:81
#11 0x0000000000400ec4 in main (argc=2, argv=0x7ffde762e348) at src/main.cpp:11
===========================================================
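
Reading the backtrace: frame #6 shows `WCSimRootTrigger::SetHeader` called with `this=0x0`, and frame #5 shows the header's `Set` called with `this=0x10`, which is consistent with a member sitting 0x10 bytes into a null trigger object. The sketch below illustrates a defensive check for that pattern. It is only an illustration under stated assumptions: `Header`, `Trigger`, and `SetHeaderChecked` are hypothetical stand-ins, not the WCSim classes, and this is not the fix that was eventually committed.

```cpp
#include <cassert>
#include <cstdio>

// Hypothetical stand-in for WCSimRootEventHeader.
struct Header {
  int event = 0, run = 0, date = 0, subevent = 0;
  void Set(int i, int r, int d, int s) {
    event = i;
    run = r;
    date = d;
    subevent = s;
  }
};

// Hypothetical stand-in for WCSimRootTrigger.
struct Trigger {
  Header header;
  void SetHeader(int i, int run, int date, int subevent) {
    header.Set(i, run, date, subevent);
  }
};

// Returns false (and logs) instead of dereferencing a null trigger.
bool SetHeaderChecked(Trigger* trigger, int i, int run, int date,
                      int subevent) {
  assert(subevent >= 0);  // precondition assumed for this sketch
  if (trigger == nullptr) {
    std::fprintf(stderr, "No trigger object for sub-event %d of event %d\n",
                 subevent, i);
    return false;
  }
  trigger->SetHeader(i, run, date, subevent);
  return true;
}
```

A check like this turns the crash into a reportable error, but the underlying question remains why no trigger object exists for the requested sub-event.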

ast0815 commented 4 years ago

The segfault only happens when I enable NHits. With only WCSimReader, PrepareSubSamples, DataOut, ReconDataOut, and ReconReset enabled it runs through the whole supernova file. And the memory requirement stays small, so at least that worked. ;)

ast0815 commented 4 years ago

And it works again when I comment out the lines where NHits actually stores the triggers:

https://github.com/HKDAQ/TriggerApplication/blob/43ae90397b373fc5c38cc6f6b5d33116062aa71d/UserTools/nhits/nhits.cpp#L146-L150

tdealtry commented 4 years ago

Works for me using my leak WCSim branch and this debug TriggerApplication branch. Tried with both root5 and root6. That's for 20 MeV electrons. For a file with sub-events (500 MeV mu-) it seg faults on the first event with a sub-event (both root5 and root6). So that explains why I didn't catch it before. Will try to fix it now...

tdealtry commented 4 years ago

Ok... so I fixed the seg fault with a WCSim commit. Found that there is a big leak, though, when using events with multiple sub-events. Will try to fix it now...

ast0815 commented 4 years ago

By sub-events, do you mean multiple triggers per SubSample?

tdealtry commented 4 years ago

Yep, that's right. Sorry, "sub-events" is the WCSim phrasing.

Anyway, I thought I had a fix, but was running on the wrong files... 500 MeV muons still leak badly. My plan is now to bring forward the work of building our own WCSimRootEvent objects in DataOut (rather than hacking at the ones we take from WCSimReader). This is something we needed to do eventually anyway (e.g. when using RBU input). Not sure if I'll manage to complete it today.
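
To illustrate the idea of building new output objects rather than modifying the reader's, here is a rough sketch. Everything in it is an assumption made for illustration: `Digit`, `TriggerWindow`, `InputEvent`, `OutputEvent`, and `BuildOutputEvent` are hypothetical names, not the WCSim or TriggerApplication classes, and selecting digits by trigger window is just one plausible way to fill the output. The point is the ownership model: the input is read-only, and the output is a fresh object freed automatically when its owner goes out of scope.

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical plain-data stand-ins for the real event classes.
struct Digit {
  double time_ns;
  double charge;
  int pmt;
};
struct TriggerWindow {
  double start_ns;
  double end_ns;
};
struct InputEvent {  // owned by the reader, never modified here
  std::vector<Digit> digits;
};
struct OutputTrigger {  // one per trigger window ("sub-event")
  int subevent;
  std::vector<Digit> digits;
};
struct OutputEvent {
  std::vector<OutputTrigger> triggers;
};

// Builds a brand-new output event from a read-only input event.
// The caller owns the result through unique_ptr, so it is released
// without any manual delete.
std::unique_ptr<OutputEvent> BuildOutputEvent(
    const InputEvent& in, const std::vector<TriggerWindow>& windows) {
  auto out = std::make_unique<OutputEvent>();
  int subevent = 0;
  for (const TriggerWindow& w : windows) {
    assert(w.start_ns <= w.end_ns);
    OutputTrigger trig{subevent++, {}};
    // Copy only the digits that fall inside this trigger window.
    for (const Digit& d : in.digits) {
      if (d.time_ns >= w.start_ns && d.time_ns <= w.end_ns) {
        trig.digits.push_back(d);
      }
    }
    out->triggers.push_back(std::move(trig));
  }
  return out;
}
```

Because the input is taken by const reference, the reader's event cannot be altered by the output stage, which avoids the shared-ownership ambiguity that makes leaks and double frees hard to track down.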

ast0815 commented 4 years ago

It seems like the leak in TriggerApplication is fixed now (at least fixed enough for my purposes = not so many events with multiple triggers), but now I see a massive leak in WCSim. A full SN simulation with >20k events occupied over 20 GB towards the end.

tdealtry commented 4 years ago

Closing this in favour of #60