key4hep / k4MarlinWrapper

GaudifyMarlinProcessors
Apache License 2.0

Issues when running CLIC reconstruction on EDM4Hep input #110

Closed Zehvogel closed 1 year ago

Zehvogel commented 1 year ago

TL;DR: Are the event and run numbers not set/transferred during the edm4hep to lcio conversion?

We received a bug report via email from Emmanuel Perez (original message below) about inconsistencies in the CLIC reconstruction between using Marlin+LCIO, Gaudi+LCIO and Gaudi+EDM4Hep. He noticed a shift in the track pulls when using the edm4hep input, mainly in tanLambda, z0 and d0. I was able to reproduce this following his description.

Digging into this, we noticed that there must already be an issue during digitisation, as the hit position smearing is non-Gaussian.

with edm4hep input:

with slcio input:

which is weird as this part of the code looks to be independent from the input:

void DDPlanarDigiProcessor::init() { 
  //...
  _rng = gsl_rng_alloc(gsl_rng_ranlxs2);
  Global::EVENTSEEDER->registerProcessor(this);
  //...
}

void DDPlanarDigiProcessor::processEvent( LCEvent * evt ) { 

  gsl_rng_set( _rng, Global::EVENTSEEDER->getSeed(this) ) ;   
  //...

    //later in a loop over the hits:
    double uSmear  = gsl_ran_gaussian( _rng, resU ) ;
    _h[hu]->Fill(  uSmear / resU ) ; 
}

However, the seeds are refreshed for every event, and that refresh depends on the event. As far as I can tell the event and run numbers are not set in the edm4hep to lcio conversion?

void ProcessorEventSeeder::refreshSeeds( LCEvent * evt ) {
  //...
  // get hashed seed using jenkins_hash
  unsigned int seed = 0 ; // initial state
  unsigned int eventNumber = evt->getEventNumber() ;
  unsigned int runNumber = evt->getRunNumber() ;

  unsigned char * c = (unsigned char *) &eventNumber ;
  seed = jenkins_hash( c, sizeof eventNumber, seed) ;

  c = (unsigned char *) &runNumber ;
  seed = jenkins_hash( c, sizeof runNumber, seed) ;

  c = (unsigned char *) &_global_seed ;
  seed = jenkins_hash( c, sizeof _global_seed, seed) ;
  //...
  srand( seed );
  //...
}
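To illustrate why unset event and run numbers would matter here, below is a minimal, self-contained sketch of the seed chaining. It is not the Marlin implementation: `oneAtATimeHash` is Bob Jenkins' one-at-a-time hash used as a stand-in (Marlin's `jenkins_hash` may be a different variant), and `eventSeed` is a hypothetical helper that only mirrors the chaining order shown above.

```cpp
#include <cstddef>
#include <cstdint>

// Stand-in hash: Bob Jenkins' one-at-a-time hash, with the running state
// passed in so that calls can be chained.
// ASSUMPTION: Marlin's own jenkins_hash may be a different variant.
std::uint32_t oneAtATimeHash(const unsigned char* key, std::size_t length,
                             std::uint32_t state) {
  std::uint32_t hash = state;
  for (std::size_t i = 0; i < length; ++i) {
    hash += key[i];
    hash += hash << 10;
    hash ^= hash >> 6;
  }
  hash += hash << 3;
  hash ^= hash >> 11;
  hash += hash << 15;
  return hash;
}

// Hypothetical helper: chains event number, run number and global seed in the
// same order as the refreshSeeds excerpt above.
std::uint32_t eventSeed(unsigned int eventNumber, unsigned int runNumber,
                        unsigned int globalSeed) {
  std::uint32_t seed = 0;  // initial state
  seed = oneAtATimeHash(reinterpret_cast<const unsigned char*>(&eventNumber),
                        sizeof eventNumber, seed);
  seed = oneAtATimeHash(reinterpret_cast<const unsigned char*>(&runNumber),
                        sizeof runNumber, seed);
  seed = oneAtATimeHash(reinterpret_cast<const unsigned char*>(&globalSeed),
                        sizeof globalSeed, seed);
  return seed;
}
```

If every converted event reports the same event and run number, `eventSeed` returns the same value for all of them, so each event restarts the random sequence from the same point. That would be one way to end up with a non-Gaussian smearing distribution.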

I will check that next...

Dear Thomas and André,

Thanks for setting up the nice doc for the k4MarlinWrappers : https://github.com/key4hep/k4MarlinWrapper/tree/master/doc/starterkit/k4MarlinWrapperCLIC

I see issues with the "edm4hep workflow" - produce an edm4hep file with ddsim, and reconstruct it with Gaudi via the wrappers.

I have simulated 500 Z -> mumu events (at 91 GeV), with ddsim, with the CLIC_o3_v14.xml model. I ran ddsim twice, first to produce an slcio file, and second, to produce an edm4hep file. Then, I have run the reconstruction, following the instructions. I have reconstructed the slcio file both with Marlin and with Gaudi. The reconstruction runs the "TrackChecker" module (*), which produces histograms of the track quantities, in particular the pulls of the track parameters. The full set of commands is given below (**).

The attached pdf shows these pulls of the track parameters, for the three workflows. The corresponding root files are on lxplus in ~eperez/public/for_reco

The pulls are in general a bit large, even in the "historical" workflow (slcio + Marlin), especially for the omega parameter, but maybe that's OK ? What worries me are the shifts that are observed in the plots coming from the edm4hep workflow, which are most pronounced in tanLambda, d0 and in particular z0. Since these shifts are not seen when I reconstruct the slcio events with gaudi, I assume that the problem lies in the edm4hep to lcio conversion.

Do you have an idea of what is going wrong?

Thanks a lot and cheers, E.

(*) for the edm4hep workflow: I have had to update the collection name from "MCParticle" to "MCParticles" in the definition of MyTrackChecker (and of MyClicEfficiencyCalculator), in the example that you indicate in the k4MarlinWrapper repository, test/gaudi_opts/clicRec_e4h_input.py

(**) source /cvmfs/sw.hsf.org/key4hep/setup.sh

edm4hep workflow:
ddsim --compactFile $LCGEO/CLIC/compact/CLIC_o3_v14/CLIC_o3_v14.xml --outputFile Zmumu_edm4hep.root --steeringFile clic_steer.py --inputFiles /eos/experiment/fcc/ee/generation/hepmc/p8_ee_Zmumu_ecm91/events_noVtxSmear.hepmc --numberOfEvents 500
k4run clicRec_e4h_input.py

slcio + marlin: same ddsim command but with --outputFile Zmumu.slcio
Marlin clicReconstruction.xml --InitDD4hep.DD4hepXMLFile=$LCGEO/CLIC/compact/CLIC_o3_v14/CLIC_o3_v14.xml --global.LCIOInputFiles=Zmumu.slcio --global.MaxRecordNumber=500

slcio + gaudi:
k4run clicReconstruction.py

The various .py and .xml files I used are on lxplus, in ~eperez/public/for_reco

tmadlener commented 1 year ago

I can tell the event and run numbers are not set in the edm4hep to lcio conversion?

That is correct. I am also not entirely sure that this information is actually available from edm4hep currently. There is an edm4hep::EventHeader, but I am not sure it is filled consistently. DDG4 does fill it, but I am not sure how many others do.
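One possible way to handle a header that is not always filled can be sketched as follows. This is only an illustration under stated assumptions, not the converter's actual code: `HeaderInfo`, `EventIds` and `resolveEventIds` are hypothetical stand-ins and do not use the edm4hep or LCIO APIs. The idea is to take the numbers from the header when one exists and otherwise fall back to a running event index, so that consecutive events still get distinct event numbers.

```cpp
// Hypothetical stand-in for the event and run number carried by an event header.
struct HeaderInfo {
  int eventNumber;
  int runNumber;
};

// Hypothetical result type: the numbers to set on the converted event.
struct EventIds {
  int eventNumber;
  int runNumber;
};

// Use the header values when a header is available; otherwise fall back to the
// running event index and run number 0.
// ASSUMPTION: run number 0 as a fallback is an arbitrary choice for this sketch.
EventIds resolveEventIds(const HeaderInfo* header, int fallbackEventIndex) {
  if (header != nullptr) {
    return {header->eventNumber, header->runNumber};
  }
  return {fallbackEventIndex, 0};
}
```

With a fallback like this, the per-event seed would still change from event to event even for input files where no header was written.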