key4hep / EDM4hep

Generic event data model for HEP collider experiments
https://cern.ch/edm4hep
Apache License 2.0

podio-dump takes very long and prints errors #312

Open Victor-Schwan opened 1 month ago

Victor-Schwan commented 1 month ago

key4hep version: nightlies from 2024-06-06, but the problem also occurred in other versions from the last month

Frame categories in this file:
Name                    Entries
metadata                1
events                  3
configuration_metadata  1
################################### events: 0 ####################################
Error in : data member with index 0 is not found in class tuple<vector,default_delete<vector > >
Error in : Cannot find data member # 0 of class tuple<vector,default_delete<vector > > for parent edm4hep::MCRecoTrackerAssociationCollection!
Error in : data member with index 1 is not found in class tuple<vector,default_delete<vector > >
Error in : Cannot find data member # 1 of class tuple<vector,default_delete<vector > > for parent edm4hep::MCRecoTrackerAssociationCollection!
Collections:
Name                   ValueType                 Size  ID
ClupatraTracks         edm4hep::Track            1     264d1493
ClupatraTrackSegments  edm4hep::Track            0     c80f6454
DebugHits              edm4hep::TrackerHitPlane  0     c498ee0d


- Problem: in addition to the errors shown above, the dump takes about 90 seconds
- Goal: fix the issue

tmadlener commented 1 month ago

Are the 90 seconds repeatable? I.e. does it still happen the second time around? I get quite a bit of delay the first time due to the cvmfs cache population, but it gets down to (a still very slow) roughly 20 seconds on subsequent tries.
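To separate the first (cold cache) run from the subsequent ones, the same command can be timed a few times in a row. This is only a minimal sketch; the helper name `time_runs` is made up for illustration and is not part of podio or key4hep.

```python
import subprocess
import time


def time_runs(cmd, repeats=3):
    """Run cmd `repeats` times; return the wall-clock duration of each run in seconds."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        # Discard all output; only the elapsed time is of interest here.
        subprocess.run(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            check=False,  # still measure runs that exit with an error
        )
        durations.append(time.perf_counter() - start)
    return durations
```

Calling it as `time_runs(["podio-dump", "<your file>"])` returns one duration per run; if the first entry is much larger than the rest, the extra time is most likely cache population rather than the dump itself.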

jmcarcell commented 1 month ago

On lxplus I imagine... One way of helping with this is not to build with debug symbols; I think the difference is noticeable.

tmadlener commented 1 month ago

Has the file been written with the exact same version of EDM4hep that was used for reading? While I see the same error when running podio-dump on it with both the nightlies and my local stack, I cannot reproduce it if I produce a file and then dump it again within one stack, using this script:

from podio import root_io, frame
from edm4hep import (
    TrackerHitPlaneCollection,
    MCRecoTrackerAssociationCollection,
    SimTrackerHitCollection,
)

# Three (default-initialized) reconstructed tracker hits
hits = TrackerHitPlaneCollection()
for _ in range(3):
    hits.create()

# Three (default-initialized) simulated tracker hits
simhits = SimTrackerHitCollection()
for _ in range(3):
    simhits.create()

# Associate each sim hit with a rec hit, in reversed order
rels = MCRecoTrackerAssociationCollection()
for i in range(3):
    rel = rels.create()
    rel.setWeight(i)
    rel.setSim(simhits[i])
    rel.setRec(hits[2 - i])

# Put all three collections into one event frame
event = frame.Frame()
event.put(hits, "hits")
event.put(simhits, "simhits")
event.put(rels, "relations")

# Write the frame into the "events" category
writer = root_io.Writer("tracker_hit_repro.edm4hep.root")
writer.write_frame(event, "events")

which I think should be a minimal reproducer (judging from the contents of the errors). But this just works for me:

$ podio-dump tracker_hit_repro.edm4hep.root 
input file: tracker_hit_repro.edm4hep.root

datamodel model definitions stored in this file: edm4hep

Frame categories in this file:
Name      Entries
------  ---------
events          1
################################### events: 0 ####################################
Collections:
Name       ValueType                            Size  ID
---------  ---------------------------------  ------  --------
hits       edm4hep::TrackerHitPlane                3  fb2c5a48
relations  edm4hep::MCRecoTrackerAssociation       3  8eb44c8d
simhits    edm4hep::SimTrackerHit                  3  25874f6b

Parameters:
Name    Type    Elements
------  ------  ----------

Victor-Schwan commented 1 month ago

Using today's key4hep nightlies, I created two EDM4hep files, one with my steering scripts and one with your minimal reproducer script, and read both with podio-dump. The error does not occur with the file from the minimal reproducer, but it still occurs with the file from my scripts. As I understand it, this rules out a mismatch between EDM4hep versions as the cause.

tmadlener commented 1 month ago

Thanks for testing. In that case it indeed looks like something that is not yet caught by the minimal reproducer.
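One possible way to narrow this down is to compare which collection types appear in the failing file but not in the reproducer. The sketch below parses the text output of podio-dump; it assumes the table layout shown in this thread (a "Collections:" line, a header row, a dashed separator, then four whitespace-separated columns per collection). The function names are made up for illustration and are not part of podio.

```python
def collection_types(dump_text):
    """Return {collection name: value type} parsed from podio-dump text output."""
    types = {}
    in_table = False
    for line in dump_text.splitlines():
        stripped = line.strip()
        if stripped == "Collections:":
            in_table = True
            continue
        if not in_table:
            continue
        if not stripped:
            in_table = False  # a blank line ends the collections table
            continue
        fields = stripped.split()
        # Skip the header row and the dashed separator row.
        if fields[0] == "Name" or set(stripped) <= {"-", " "}:
            continue
        if len(fields) == 4:  # assumed columns: Name, ValueType, Size, ID
            name, value_type, _size, _coll_id = fields
            types[name] = value_type
    return types


def types_only_in_first(dump_a, dump_b):
    """Sorted value types that occur in dump_a but not in dump_b."""
    types_a = set(collection_types(dump_a).values())
    types_b = set(collection_types(dump_b).values())
    return sorted(types_a - types_b)
```

Feeding it the saved dump of the failing file and the dump of the reproducer file gives the list of value types that the reproducer does not cover yet, which are candidates for extending the reproducer.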