key4hep / k4MarlinWrapper

GaudifyMarlinProcessors
Apache License 2.0
2 stars 19 forks source link

Memory leak in LCIO to EDM4hep conversion #96

Closed tmadlener closed 1 year ago

tmadlener commented 1 year ago

Disclaimer: This is again one of those things where I am not sure whether the real issue is actually here. It is possible that the actual problem is in k4FWCore or in k4LCIOReader, which are obviously also involved here.

When using an LCIO2EDM4hepTool attached to a MarlinProcessorWrapper there is a quite significant memory leak (apart from the one fixed in #95). I have attached a plot here, which is produced with the minimal reproducer further below:

mem_leak

The two curves, represent a run with and without actually storing the converted EDM4hep collections to file. Ignoring the kink from 10 to 1000 events in the "with output" case, the memory leak is somewhere in the area of slightly below 100 kb / event, for this reproducer.

The reproducer is a sort of ad-hoc "standalone" LCIO to EDM4hep converter. The input file is an ILD DST file in this case:

from Gaudi.Configuration import *

from Configurables import LcioEvent, MarlinProcessorWrapper
from Configurables import PodioOutput, Lcio2EDM4hepTool
from Configurables import k4DataSvc

podioEvtSvc = k4DataSvc("EventDataSvc")

edmConvTool = Lcio2EDM4hepTool("LCIO2EDM4hep")
edmConvTool.convertAll = False
edmConvTool.collNameMapping = {
    "MCParticlesSkimmed": "MCParticlesSkimmed",
    "MarlinTrkTracks": "MarlinTrkTracks",
    "PandoraClusters": "PandoraClusters",
    "PandoraPFOs": "PandoraPFOs",
}

algList = []

read = LcioEvent()
read.OutputLevel = DEBUG
read.Files = [
    "rv02-02-01.sv02-02-01.mILD_l5_o2_v02.E250-SetA.I402004.Pe2e2h.eR.pL.n000.d_dstm_15673_0.slcio"
]
algList.append(read)

edm4hepOutput = PodioOutput(
    "EDM4hepOutput",
    filename="rv02-02-01.sv02-02-01.mILD_l5_o2_v02.E250-SetA.I402004.Pe2e2h.eR.pL.n000.d_dstm_15673_0.root",
)
edm4hepOutput.outputCommands = ["keep *"]

MyStatusmonitor = MarlinProcessorWrapper("MyStatusmonitor")
MyStatusmonitor.OutputLevel = INFO
MyStatusmonitor.ProcessorType = "Statusmonitor"
MyStatusmonitor.Parameters = {"HowOften": ["1"]}
MyStatusmonitor.Lcio2EDM4hepTool = edmConvTool  # Not attaching the converter fixes the leak!

algList.append(MyStatusmonitor)
algList.append(edm4hepOutput)  # remove for "no output" case above

from Configurables import ApplicationMgr

ApplicationMgr(
    TopAlg=algList, EvtSel="NONE", EvtMax=2000, ExtSvc=[podioEvtSvc], OutputLevel=INFO
)
andresailer commented 1 year ago

any chance you can run valgrind on top of that?

tmadlener commented 1 year ago

Still have to try that. Haven't had the time yet.

tmadlener commented 1 year ago

valgrind --tool=memcheck --leak-check=full doesn't seem to find anything. I haven't yet been able to build things with address sanitizer, because the build step fails, because something seems to actually be run, leading to the following problem

==1325==ASan runtime does not come first in initial library list; you should either link runtime to your application or manually preload it with LD_PRELOAD.

Could it be that something just accumulates things and then cleans up at the very end?

andresailer commented 1 year ago

Is the LCIO table of contents (?) just growing until the file is closed and it can be written?

Edit: Wait, there is no LCIO output file?

tmadlener commented 1 year ago

Edit: Wait, there is no LCIO output file?

Correct. In principle, not even the EDM4hep output is necessary to have the issue. Simply attaching the converter to a wrapped processor is enough. Adding an LCIO output seems to not change anything at all, apart from a slightly higher memory baseline.

tmadlener commented 1 year ago

Is the LCIO table of contents (?) just growing until the file is closed and it can be written?

There seems to be one likely culprit for that: https://github.com/key4hep/k4MarlinWrapper/blob/6f98cf955f1df521e9a7009874bdbb4d15381a71/k4MarlinWrapper/k4MarlinWrapper/converters/Lcio2EDM4hep.h#L34

However, I cannot find where it is even populated.

andresailer commented 1 year ago

I think that m_dataHandlesMap is populated in k4LCIOReader? But it shouldn't grow once all the collection names are in there.

andresailer commented 1 year ago

Could you try the massif tool? https://valgrind.org/info/tools.html#massif

tmadlener commented 1 year ago

Either, I cannot run valgrind with massif, or that still reports nothing

tmadlener commented 1 year ago

Maybe related to key4hep/k4FWCore#82 ?

jmcarcell commented 1 year ago

With the inclusion of the new converter (https://github.com/key4hep/k4MarlinWrapper/pull/114) this issue isn't relevant anymore