AIDASoft / DD4hep

Detector Description Toolkit for High Energy Physics
http://dd4hep.cern.ch
GNU Lesser General Public License v3.0
50 stars 99 forks source link

Propagate cross section to the simulation #988

Closed obylund closed 1 year ago

obylund commented 2 years ago

It would be super useful if the cross section from the hepmc file could be propagated through dd4hep and stored in the resulting simulated file that you get when running npsim. At the moment I can not access the event-by-event cross section from the simulated file.

andresailer commented 2 years ago

Hi @obylund

a) can you provide us with a hepmc file that contains this cross-section info?

b) npsim, do you mean ddsim? Or what is npsim?

c-dilks commented 2 years ago

npsim is a modified ddsim, with some extra settings to control Cherenkov physics

wdconinc commented 2 years ago

@obylund Let me try to turn this into a report that the DD4hep developers will be able to debug without needing any EIC tools.

Steps to reproduce

We can use a publicly available hepmc file from the HepMC project itself (which should be well-formed):

wget --output-document example.hepmc https://gitlab.cern.ch/hepmc/HepMC3/-/raw/master/examples/RootIOExample/example.hepmc3

(Note: at some point we should add hepmc3 as an allowed extension for ddsim input files.)

There are several W lines in the input file (the first one is before the first event, so can be ignored):

jug_xl> wdconinc@menelaos:~$ grep -r ^W example.hepmc | head -n 10
W 0
W 1.0000000000000000000000e+00
W 1.0000000000000000000000e+00
W 1.0000000000000000000000e+00
W 1.0000000000000000000000e+00
W 1.0000000000000000000000e+00
W 1.0000000000000000000000e+00
W 1.0000000000000000000000e+00
W 1.0000000000000000000000e+00
W 1.0000000000000000000000e+00

We can use the standard SiD.xml geometry description curated by the DD4hep project:

ddsim --compactFile $DD4hepINSTALL/DDDetectors/compact/SiD.xml --inputFiles example.hepmc --outputFile example.edm4hep.root -N 10

Now, in the output file, we can look at the EventHeader.weight field:

jug_xl> wdconinc@menelaos:~$ root -l -q example.edm4hep.root -e 'events->Scan("EventHeader.weight")'
Attaching file example.edm4hep.root as _file0...
(TFile *) 0x55e0f04b13b0
***********************************
*    Row   * Instance * EventHead *
***********************************
*        0 *        0 *         0 *
*        1 *        0 *         0 *
*        2 *        0 *         0 *
*        3 *        0 *         0 *
*        4 *        0 *         0 *
*        5 *        0 *         0 *
*        6 *        0 *         0 *
*        7 *        0 *         0 *
*        8 *        0 *         0 *
*        9 *        0 *         0 *
***********************************
(long long) 10

Likewise, there are several attribute lines in the input hepmc3 file, which we would like to find in the output file evt_metadata tree (just copying the first event's preamble):

E 0 7 12
U GEV MM
W 1.0000000000000000000000e+00
A 0 GenCrossSection 2.64422551e+03 2.64422551e+03 -1 -1
A 0 GenPdfInfo 11 -11 9.97420767e-01 9.99999975e-01 9.18812775e+01 1.56824725e+01 2.82148362e+06 0 0
A 0 alphaQCD 0.129844
A 0 alphaQED 0.007818181
A 0 event_scale 91.88128
A 0 mpi 0
A 0 signal_process_id 221
A 0 signal_process_vertex 0

These A records should probably end up in either the intMap, floatMap, or stringMap entries, but...

jug_xl> wdconinc@menelaos:~$ root -l example.edm4hep.root
map<string,vector<int>> intMap
map<string,vector<float>> floatMap
map<string,vector<string>> stringMap
evt_metadata->SetBranchAddress("_intMap",&intMap)
evt_metadata->SetBranchAddress("_floatMap",&floatMap)
evt_metadata->SetBranchAddress("_stringMap",&stringMap)
evt_metadata->GetEntry(0)
intMap.size()
floatMap.size()
stringMap.size()
evt_metadata->Scan("@_intMap.size()")
evt_metadata->Scan("@_floatMap.size()")
evt_metadata->Scan("@_stringMap.size()")

which just returns zero sizes for the metadata maps.

Expected results:

We would like to find the event weights with values 1.0 in the event header, and we would like to find hepmc3 attributes propagated to the event metadata collections.

Actual results:

Events header weights are not filled from the input hepmc file weights, nor are the attributes propagated.

wdconinc commented 2 years ago

If I were to take a stab at this, https://github.com/AIDASoft/DD4hep/pull/321 would be my starting point.

wdconinc commented 2 years ago

Actually, https://github.com/AIDASoft/DD4hep/pull/636 rewrote most of that :-)

andresailer commented 2 years ago

Ok, similar to what is in the other hepmc3 file I got for the other issue https://github.com/AIDASoft/DD4hep/blob/55b82943bb48a40a30fac5fd96798a5dd2eb8457/DDTest/inputFiles/Pythia_output.hepmc#L5-L262

How should one store this?

A 0 GenCrossSection 2.64422551e+03 2.64422551e+03 -1 -1
A 0 GenPdfInfo 11 -11 9.97420767e-01 9.99999975e-01 9.18812775e+01 1.56824725e+01 2.82148362e+06 0 0

As a string with all the numbers, or as different entries for floats? GenCrossSection_0 ... GenCrosSection_3 etc.

wdconinc commented 2 years ago

Can we turn this into a set of square bracketed indexed vector entries, "GenCrossSection[0]"? That's less likely to be used by a primary key because [ ] are special characters in hepmc3 files.

andresailer commented 2 years ago

Yes, sounds good with the brackets

andresailer commented 2 years ago

everything is already implemented, at least for something to show up in the edm4hep output file...

https://github.com/AIDASoft/DD4hep/blob/0bbecc6862ceb4afe95e73dd3ce6a42525f18cc6/DDG4/hepmc/HepMC3FileReader.cpp#L39-L59

https://github.com/AIDASoft/DD4hep/blob/0bbecc6862ceb4afe95e73dd3ce6a42525f18cc6/DDG4/edm4hep/Geant4Output2EDM4hep.cpp#L56-L68

Except setValues doesn't seem to work... I need newer versions of podio/edm4hep in my devel stack, but using setValue() instead with a single entry makes something show up in the output file

wdconinc commented 2 years ago

everything is already implemented

That's what confused me too... So, the issue is in setValues()? That helps me drill down in it further too.

andresailer commented 2 years ago

Yeah, as far as I can tell, but was using a pretty old podio version. If you have a recent one, try changing to setValue with just some value and see if that gives you something...

tmadlener commented 2 years ago

To add to the discussion: setValues has been deprecated in AIDASoft/podio#262 but should still work and simply forward to the uniform setValue, see code.

wdconinc commented 2 years ago

Hopefully I'll find a few minutes to look into this soon :-)

obylund commented 2 years ago

Hi @wdconinc How is it going?

obylund commented 2 years ago

Hi @wdconinc Has there been any progress? Can I help? If yes, where do I get started and where is setValue defined?

wdconinc commented 2 years ago

I'm working on it. No results yet but other hepmc3 issues are being resolved as I go along...

obylund commented 1 year ago

Hello, @wdconinc What is the status now? Thanks.

wdconinc commented 1 year ago

Quick update (I had typed a long response here a week or so ago, but then my computer crashed and I haven't gotten it written up again yet).

andresailer commented 1 year ago

This was probably fixed in #1041 which also moved the EDM4hep output writer to use the podio frame

tmadlener commented 1 year ago

Brief update after #1041

If you read this into a Frame, you can access this information via podio::Frame::getParameter, e.g.

const auto& genXS = frame.getParameter<std::string>("GenCrossSection");

Can we turn this into a set of square bracketed indexed vector entries, "GenCrossSection[0]"? That's less likely to be used by a primary key because [ ] are special characters in hepmc3 files.

From what I can judge, this has not yet been done, right?

Also which of these values should actually end up in the EventHeader.weight?

MarkusFrankATcernch commented 1 year ago

Could someone please flash some insight how the event weight should be constructed from the event related header of HepMC ?

Thanks a lot!

wdconinc commented 1 year ago

As for weight, I think this should 'simply' be the number in the W field (in e.g. https://gitlab.cern.ch/hepmc/HepMC3/-/raw/master/examples/RootIOExample/example.hepmc3):

W 1.0000000000000000000000e+00

would translate into EventHeader.weight == 1 (this is a trivial example; default value for weight would be 1 as well).

OT note: I think we have a path identified to upgrading to 1.24 locally, so these recent developments are going to be easier to test soon.

tmadlener commented 1 year ago

OK. In that case it seems the hepmc3 reader in DDG4 does not properly read the weight yet. What I currently get int he output (via podio-dump) is the following:

$ podio-dump -d example.edm4hep.root
[...]

std::string parameters
Key                           Value 
--------------------------------------------------------------------------------
GenCrossSection               [2.64422551e+03 2.64422551e+03 -1 -1]
GenPdfInfo                    [11 -11 9.97420767e-01 9.99999975e-01 9.18812775e+01 1.56824725e+01 2.82148362e+06 0 0]
alphaQCD                      [0.129844]
alphaQED                      [0.007818181]
event_scale                   [91.88128]
mpi                           [0]
signal_process_id             [221]
signal_process_vertex         [0]

The problem seems to be that the HepMC3 reader only fills the attributes but the event weight is not one of them, and needs to be retrieved from the HepMC3::GenEvent via a different API:

https://gitlab.cern.ch/hepmc/HepMC3/-/blob/master/include/HepMC3/GenEvent.h#L261 vs https://gitlab.cern.ch/hepmc/HepMC3/-/blob/master/include/HepMC3/GenEvent.h#L98

from a technical point of view this should be solvable, the question becomes what to do with multiple event weights? Should we only put the first one into the EventHeader, but store all of the available ones into the parameters?

wdconinc commented 1 year ago

What I currently get int he output (via podio-dump) is the following [...]

Confirming that we now see the cross section propagated from hepmc3 into output EDM4hep files with dd4hep-1.24.

As for what to store in the EventHeader weight, I would also think the first weight is best and all of them in the parameters.