HEP-FCC / FCCAnalyses

Common analysis framework for the Future Circular Collider
https://hep-fcc.github.io/FCCAnalyses/

RDF crash when running against December 2023 version of package #401

Open asciandr opened 1 month ago

asciandr commented 1 month ago

Dear developers,

when running the following script:

/afs/cern.ch/work/s/selvaggi/public/analysis_inference_IDEA_240_andrea_12_09_2024_FSR_studies_IDEA_truthPID_7labels.py

as:

fccanalysis run /afs/cern.ch/work/s/selvaggi/private/FCCSW-ee/FCCAnalyses_winter2023/examples/FCCee/weaver/analysis_inference_IDEA_240_andrea_12_09_2024_FSR_studies_IDEA_truthPID_7labels.py --output /eos/experiment/fcc/ee/jet_flavour_tagging/winter2023/models//IDEA_240_andrea_12_09_2024/FSR_studies_IDEA_truthPID_7labels/test_data_240/wzp6_ee_nunuH_Hss_ecm240/events.root --files-list /eos/experiment/fcc/ee/generation/DelphesEvents/winter2023/IDEA//wzp6_ee_nunuH_Hss_ecm240/events_088558627.root --nevents 20000

against this version of the package: https://github.com/asciandr/FCCAnalyses/tree/truthPID_5labels_addlowerPCutOnPhotonPFCand

which is commit 045eff902c83567ca566ace92fdb8676c38b642e plus a few changes to compute and store the truth PID for PF candidates, we get the following crash:

RDataFrame::Run: event loop was interrupted
RDataFrame::Run: event loop was interrupted
RDataFrame::Run: event loop was interrupted
----> ERROR: During the execution of the analysis file exception occurred:
             Template method resolution failed:
               none of the 3 overloaded methods succeeded. Full details:
               ROOT::RDF::RResultPtr<ROOT::RDF::RInterface<ROOT::Detail::RDF::RLoopManager,void> > ROOT::RDF::RInterface<ROOT::Detail::RDF::RRange<ROOT::Detail::RDF::RLoopManager>,void>::Snapshot(basic_string_view<char,char_traits<char> > treename, basic_string_view<char,char_traits<char> > filename, initializer_list<string> columnList, const ROOT::RDF::RSnapshotOptions& options = ROOT::RDF::RSnapshotOptions()) =>
                 TypeError: could not convert argument 3
               ROOT::RDF::RResultPtr<ROOT::RDF::RInterface<ROOT::Detail::RDF::RLoopManager,void> > ROOT::RDF::RInterface<ROOT::Detail::RDF::RRange<ROOT::Detail::RDF::RLoopManager>,void>::Snapshot(basic_string_view<char,char_traits<char> > treename, basic_string_view<char,char_traits<char> > filename, const vector<string>& columnList, const ROOT::RDF::RSnapshotOptions& options = ROOT::RDF::RSnapshotOptions()) =>
                 out_of_range: RVecN
               ROOT::RDF::RResultPtr<ROOT::RDF::RInterface<ROOT::Detail::RDF::RLoopManager,void> > ROOT::RDF::RInterface<ROOT::Detail::RDF::RRange<ROOT::Detail::RDF::RLoopManager>,void>::Snapshot(basic_string_view<char,char_traits<char> > treename, basic_string_view<char,char_traits<char> > filename, basic_string_view<char,char_traits<char> > columnNameRegexp = "", const ROOT::RDF::RSnapshotOptions& options = ROOT::RDF::RSnapshotOptions()) =>
                 TypeError: could not convert argument 3
               ROOT::RDF::RResultPtr<ROOT::RDF::RInterface<ROOT::Detail::RDF::RLoopManager,void> > ROOT::RDF::RInterface<ROOT::Detail::RDF::RRange<ROOT::Detail::RDF::RLoopManager>,void>::Snapshot(basic_string_view<char,char_traits<char> > treename, basic_string_view<char,char_traits<char> > filename, const vector<string>& columnList, const ROOT::RDF::RSnapshotOptions& options = ROOT::RDF::RSnapshotOptions()) =>
                 out_of_range: RVecN
               Failed to instantiate "Snapshot(std::string,std::string,std::vector<string>*)"
               ROOT::RDF::RResultPtr<ROOT::RDF::RInterface<ROOT::Detail::RDF::RLoopManager,void> > ROOT::RDF::RInterface<ROOT::Detail::RDF::RRange<ROOT::Detail::RDF::RLoopManager>,void>::Snapshot(basic_string_view<char,char_traits<char> > treename, basic_string_view<char,char_traits<char> > filename, const vector<string>& columnList, const ROOT::RDF::RSnapshotOptions& options = ROOT::RDF::RSnapshotOptions()) =>
                 out_of_range: RVecN

This error occurs only in a few rare events, but it causes the code to crash. Could you please help with, or advise on, how to fix the issue?

Thanks in advance, Andrea, George, Iza and Michele
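
For illustration only: the out_of_range: RVecN lines in the log above are consistent with a bounds-checked element access on a ROOT RVec failing for an index that lies outside the vector. Whether that happens in the truth PID code of the branch is not confirmed anywhere in this thread. The sketch below, in plain Python with no ROOT dependency, shows the general guard pattern for a per-candidate lookup. The function name, the argument names and the sentinel value are hypothetical and are not taken from FCCAnalyses or from the branch.

```python
from typing import List, Sequence

# Hypothetical sentinel meaning "no truth particle could be matched".
UNMATCHED_PID = 0


def truth_pid_for_candidates(mc_index: Sequence[int],
                             mc_pdg: Sequence[int]) -> List[int]:
    """Return one truth PID per PF candidate, guarding every index lookup.

    mc_index: for each PF candidate, the index of its matched MC particle
              (assumed to be possibly negative or too large in rare events).
    mc_pdg:   PDG codes of the MC particles in the same event.
    """
    pids = []
    for idx in mc_index:
        # Check the index explicitly instead of relying on the container.
        # In Python a negative index would silently wrap around, and an
        # index past the end would raise, so both cases are rejected here.
        if 0 <= idx < len(mc_pdg):
            pids.append(mc_pdg[idx])
        else:
            pids.append(UNMATCHED_PID)
    return pids


if __name__ == "__main__":
    # Two valid matches, one negative index and one index past the end.
    print(truth_pid_for_candidates([0, 2, -1, 7], [11, 22, 321]))
```

With a guard of this kind, a rare event with an invalid index yields the sentinel value instead of interrupting the event loop. Counting how often the sentinel appears would also show how rare such events are.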

asciandr commented 1 month ago

@kjvbrt @selvaggi

kjvbrt commented 1 month ago

Hi @asciandr, which stack are you using? I'm not able to reproduce your error. Instead, I'm getting a different one:

...
----> INFO: Creating dataframe object from files:
            - root://eospublic.cern.ch//eos/experiment/fcc/ee/generation/DelphesEvents/winter2023/IDEA//wzp6_ee_nunuH_Hss_ecm240/events_088558627.root

----> INFO: Number of local events: 8,052
----> INFO: Output file path:
            outputs/inference/events.root
ERROR: pfcand_truthPID variables was not defined.

(The error is the same for the latest release and for the 2024-03-10 release.)

Also, how are you compiling your local version? What is the output of which fccanalysis?

asciandr commented 1 month ago

Hi @kjvbrt, thanks a lot for the prompt reply! You should be able to reproduce the issue by setting up the winter2023 stack and recompiling against my branch. @selvaggi's command above then leads to the crash reported above. Could you please confirm that you can reproduce the issue? Thanks a lot for your help!

kjvbrt commented 1 month ago

Hi @asciandr, I'm still not able to reproduce your error. It is not possible to build your version of FCCAnalyses in the /cvmfs/sw.hsf.org/spackages6/key4hep-stack/2022-12-23/x86_64-centos7-gcc11.2.0-opt/ll3gi/setup.sh (winter2023) stack on lxplus9.

Can you post the list of commands you use to compile FCCAnalyses and tell me which OS version you are using? It would also be great if you could provide the output of the which fccanalysis command. :)
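
As a convenience, the environment details requested above could be gathered in one go with a small script such as the following sketch. It uses only the Python standard library. The function name and the choice of fields are assumptions made for illustration and are not part of FCCAnalyses. It does not record the build commands, which still have to be listed by hand.

```python
import platform
import shutil
import sys


def collect_environment() -> dict:
    """Collect basic information that helps to reproduce a build problem."""
    return {
        # Equivalent of `which fccanalysis`; None if it is not on PATH.
        "fccanalysis": shutil.which("fccanalysis"),
        # Operating system and kernel description.
        "os": platform.platform(),
        # CPU architecture, for example x86_64.
        "machine": platform.machine(),
        # Version of the Python interpreter running this script.
        "python": sys.version.split()[0],
    }


if __name__ == "__main__":
    for key, value in collect_environment().items():
        print(f"{key}: {value}")
```

Running the script after sourcing the stack setup and the local FCCAnalyses setup, and pasting its output into the issue, shows which fccanalysis executable is actually picked up.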