Closed tontyoutoure closed 1 year ago
after a little digging it seems to me that the problem is around GateActorManager.cc:304
. The comment there said:
// Remove attached SD by replacing with a deactivated clone SD
However, it clearly did not remove the attached sd. Instead, every hit is processed twice, once by the original CrystalSD, once by the mutifuctional SD.
also, I checked the volume's sd before and after "replacing", volume->GetLogicalVolume()->GetSensitiveDetector()
remains the same address.
edit: Hits are not processed twice. Just singles count are doubled in the summary.
At the time of PR #354 , this seems to be a smart solution. Now it has become the issue itself. Hopefully @cpommranz and @dsarrut can look into this. I'm happy to provide a barebone replication of this issue by then. Without your people's help, I will have to dig deeper to be comfortable enough to provide a fix.
Yes, there is also memory leak. The hero seems to turn into dragon now.
ouch ;) Thanks for your investigation! We should look at this. However, to be fair, we are committed to opengate version 10 right now (python version), and have very little bandwidth to dig into this., so it may take a bit of time.
@dsarrut Yes I suppose, looking forward to the python ver! I will see if I can fix it myself. However some (desynchronized) Q/A will be much appreciated.
Sur, I can try to help no problem!
Well, after a afternoon of digging, I confirmed that, when one tries to tie an actor to a volume that contains a CrystalSD, the following will happen:
G4SDManager
(not sure what it does), but the original SD is still there and tied to the logical volume. Later the original SD is put into GateMultiSensitiveDetector
and count hits there.GateSinglesDigitizer
, which is added to GateDigitizerMgr
. The summary output will iterate list of singles digitizer to get output list, thus two hit line&two single line for each crystal sd.Deleting the cloned SD will resolve the summary issue. However the memory leak continues, as long as an actor is tied to a CrystalSD. I think address this issue in the documentation and remove the cloned SD (it did not fix the mem leak anyway) could be a temporary fix.
@dsarrut ok my question is, how did Gate recollect the hit collection (G4HCofThisEvent
) normally? It seems (through valgrind) that they are the primary cause of memory leak. Normally Geant4 will ask each SD to handle their own recollection through SD::EndOfEvent(HCE)
, but GateCrystalSD
did not implement this method, I suppose you guys handle the memory recollect somewhere else.
well, the cloned SD do fix the memory leak of HCE, however it introduces some other memory leak, and the previous metiond summary bug:
==1659032== 23,986,136 (23,980,896 direct, 5,240 indirect) bytes in 999,204 blocks are definitely lost in loss record 22,380 of 22,380
==1659032== at 0x483BE63: operator new(unsigned long) (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==1659032== by 0x382E49: G4TDigiCollection<GateDigi>::G4TDigiCollection(G4String, G4String) (G4TDigiCollection.hh:151)
==1659032== by 0x42F998: GateDigitizerInitializationModule::Digitize() (GateDigitizerInitializationModule.cc:54)
==1659032== by 0x435C28: GateDigitizerMgr::RunDigitizers() (GateDigitizerMgr.cc:433)
==1659032== by 0x388474: GateAnalysis::RecordEndOfEvent(G4Event const*) (GateAnalysis.cc:481)
==1659032== by 0x5BC072: GateOutputMgr::RecordEndOfEvent(G4Event const*) (GateOutputMgr.cc:202)
==1659032== by 0x3723B9: GateEventAction::EndOfEventAction(G4Event const*) (GateActions.cc:204)
==1659032== by 0x59A21CC: G4EventManager::DoProcessing(G4Event*) (in /home/likun/Software/Geant4/install/lib/libG4event.so)
==1659032== by 0x58FD756: G4RunManager::DoEventLoop(int, char const*, int) (in /home/likun/Software/Geant4/install/lib/libG4run.so)
==1659032== by 0x58FB2E1: G4RunManager::BeamOn(int, char const*, int) (in /home/likun/Software/Geant4/install/lib/libG4run.so)
==1659032== by 0x7714C5: GateApplicationMgr::StartDAQ() (GateApplicationMgr.cc:411)
==1659032== by 0x776D9E: GateApplicationMgrMessenger::SetNewValue(G4UIcommand*, G4String) (GateApplicationMgrMessenger.cc:187)
BTW, valgrind log for memory leak without cloned SD is:
==1609725== 24,003,792 bytes in 1,000,158 blocks are definitely lost in loss record 22,276 of 22,277
==1609725== at 0x483BE63: operator new(unsigned long) (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==1609725== by 0x35CCD7: G4THitsCollection<GateHit>::G4THitsCollection(G4String, G4String) (G4THitsCollection.hh:152)
==1609725== by 0x40839A: GateCrystalSD::Initialize(G4HCofThisEvent*) (GateCrystalSD.cc:108)
==1609725== by 0x764AD6C: G4SDStructure::Initialize(G4HCofThisEvent*) (in /home/likun/Software/Geant4/install/lib/libG4digits_hits.so)
==1609725== by 0x76490DE: G4SDManager::PrepareNewEvent() (in /home/likun/Software/Geant4/install/lib/libG4digits_hits.so)
==1609725== by 0x59A1FAA: G4EventManager::DoProcessing(G4Event*) (in /home/likun/Software/Geant4/install/lib/libG4event.so)
==1609725== by 0x58FD756: G4RunManager::DoEventLoop(int, char const*, int) (in /home/likun/Software/Geant4/install/lib/libG4run.so)
==1609725== by 0x58FB2E1: G4RunManager::BeamOn(int, char const*, int) (in /home/likun/Software/Geant4/install/lib/libG4run.so)
==1609725== by 0x771437: GateApplicationMgr::StartDAQ() (GateApplicationMgr.cc:411)
==1609725== by 0x776D10: GateApplicationMgrMessenger::SetNewValue(G4UIcommand*, G4String) (GateApplicationMgrMessenger.cc:187)
==1609725== by 0x7CC7159: G4UIcommand::DoIt(G4String) (in /home/likun/Software/Geant4/install/lib/libG4intercoms.so)
==1609725== by 0x7CE5E4A: G4UImanager::ApplyCommand(char const*) (in /home/likun/Software/Geant4/install/lib/libG4intercoms.so)
==1609725==
==1609725== 31,004,650 bytes in 1,000,150 blocks are definitely lost in loss record 22,277 of 22,277
==1609725== at 0x483BE63: operator new(unsigned long) (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==1609725== by 0x1A82929E: std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_assign(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) (in /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.28)
==1609725== by 0x7653B1D: G4VHitsCollection::G4VHitsCollection(G4String, G4String) (in /home/likun/Software/Geant4/install/lib/libG4digits_hits.so)
==1609725== by 0x765374D: G4HitsCollection::G4HitsCollection(G4String, G4String) (in /home/likun/Software/Geant4/install/lib/libG4digits_hits.so)
==1609725== by 0x35CCA3: G4THitsCollection<GateHit>::G4THitsCollection(G4String, G4String) (G4THitsCollection.hh:150)
==1609725== by 0x40839A: GateCrystalSD::Initialize(G4HCofThisEvent*) (GateCrystalSD.cc:108)
==1609725== by 0x764AD6C: G4SDStructure::Initialize(G4HCofThisEvent*) (in /home/likun/Software/Geant4/install/lib/libG4digits_hits.so)
==1609725== by 0x76490DE: G4SDManager::PrepareNewEvent() (in /home/likun/Software/Geant4/install/lib/libG4digits_hits.so)
==1609725== by 0x59A1FAA: G4EventManager::DoProcessing(G4Event*) (in /home/likun/Software/Geant4/install/lib/libG4event.so)
==1609725== by 0x58FD756: G4RunManager::DoEventLoop(int, char const*, int) (in /home/likun/Software/Geant4/install/lib/libG4run.so)
==1609725== by 0x58FB2E1: G4RunManager::BeamOn(int, char const*, int) (in /home/likun/Software/Geant4/install/lib/libG4run.so)
==1609725== by 0x771437: GateApplicationMgr::StartDAQ() (GateApplicationMgr.cc:411)
Thanks a lot for the report and the investigation @tontyoutoure! If I remember correctly, the cloned SD was only used to remove the original SD from the G4SDManager
, before the original SD is added to the GateMultiSensitiveDetector
, as you described in 1). Unfortunately, there is currently no better way to remove an SD from the SD manager, so this workaround is necessary, and it is the recommended way according to the Geant4 devs (see #354). The cloned SD is deactivated right away and should not be used further. In 2) you describe, that the cloned SD is used later, which might be part of the problem. I will have a closer look.
I had a quick look in the recent git history of the G4SDManager
and didn't find anything that could explain the observed behavior, so I assume that the problem is indeed caused somewhere in the Gate code.
@cpommranz Gate recently add something called Digitizer Manager and I think that should be what introduces the problem. I wonder if it's possible to create a dummy SD class without digitizer bounding to it, and use it to replace the original SD.
Meanwhile of course, the new memory leak should also be addressed, since the cloned SD just messes up the summary.
The empty SD solution works like a charm, fix all the problems (summary, mem leak). Hope @dsarrut can allocate some time and merge it soon.
Great, thank you very much both of you! The only test that does not pass is test7 but this is expected, this test will be removed soon. Thanks for the effort!
Describe the bug Attach actor to "world" causing summary to malfuntion.
Desktop (please complete the following information):
Minimal example Pretty easy to replicate, just attach an actor to world.
Expected behavior that summary works fine
Additional context I will try to fix it and submit a PR.