umd-lhcb / MuonBDTPid

Muon PID with a uboost BDT (in ROOT 5). Also includes code for PID efficiency studies.

Segmentation fault with Lucia's script #4

Closed yipengsun closed 3 years ago

yipengsun commented 3 years ago

With basically no adaptation, I got the following error with Castelao/v10:

audiSequencer/SeqMatchLbLcMu_Pbar, GaudiSequencer/SeqMatchB_Jpsi_MuP, GaudiSequencer/SeqMatchLam0LL_VHPT_Pbar, GaudiSequencer/SeqMatchDSt_PiM, GaudiSequencer/SeqMatchLam0LL_P, GaudiSequencer/SeqMatchB_Jpsi_EM, GaudiSequencer/SeqMatchKSLL_PiM, GaudiSequencer/SeqMatchKSLL_PiP, GaudiSequencer/SeqMatchLam0LL_VHPT_P, GaudiSequencer/SeqMatchB_Jpsi_EP, GaudiSequencer/SeqMatchLam0LL_HPT_P, GaudiSequencer/SeqMatchLam0LL_Pbar, GaudiSequencer/SeqMatchJpsinopt_MuP, GaudiSequencer/SeqMatchJpsinopt_MuM, GaudiSequencer/SeqMatchDSt_KP, GaudiSequencer/SeqMatchLam0LL_P_isMuon, GaudiSequencer/SeqMatchJpsi_MuP, GaudiSequencer/SeqMatchDsPhi_KP, GaudiSequencer/SeqMatchJpsi_MuM, GaudiSequencer/SeqMatchB_Jpsi_MuM, GaudiSequencer/SeqMatchLam0LL_Pbar_isMuon, GaudiSequencer/SeqMatchDSt_KM, GaudiSequencer/SeqMatchDsPhi_KM, DecayTreeTuple/DSt_PiPTuple, DecayTreeTuple/LbLcMu_PTuple, DecayTreeTuple/Lam0LL_HPT_PbarTuple, DecayTreeTuple/LbLcMu_PbarTuple, DecayTreeTuple/B_Jpsi_MuPTuple, DecayTreeTuple/Lam0LL_VHPT_PbarTuple, DecayTreeTuple/DSt_PiMTuple, DecayTreeTuple/Lam0LL_PTuple, DecayTreeTuple/B_Jpsi_EMTuple, DecayTreeTuple/KSLL_PiMTuple, DecayTreeTuple/KSLL_PiPTuple, DecayTreeTuple/Lam0LL_VHPT_PTuple, DecayTreeTuple/B_Jpsi_EPTuple, DecayTreeTuple/Lam0LL_HPT_PTuple, DecayTreeTuple/Lam0LL_PbarTuple, DecayTreeTuple/Jpsinopt_MuPTuple, DecayTreeTuple/Jpsinopt_MuMTuple, DecayTreeTuple/DSt_KPTuple, DecayTreeTuple/Lam0LL_P_isMuonTuple, DecayTreeTuple/Jpsi_MuPTuple, DecayTreeTuple/DsPhi_KPTuple, DecayTreeTuple/Jpsi_MuMTuple, DecayTreeTuple/B_Jpsi_MuMTuple, DecayTreeTuple/Lam0LL_Pbar_isMuonTuple, DecayTreeTuple/DSt_KMTuple, DecayTreeTuple/DsPhi_KMTuple
fs_stdMuons                                                    INFO Member list: LoKi::VoidFilter/SELECT:Phys/StdAllNoPIDsMuons
In file included from LoKiNumbersDict dictionary payload:60:
In file included from /opt/lhcb/lhcb/LHCB/LHCB_v50r6/InstallArea/x86_64-centos7-gcc9-opt/include/LoKi/LoKiNumbers_dct.h:23:
In file included from /opt/lhcb/lhcb/LHCB/LHCB_v50r6/InstallArea/x86_64-centos7-gcc9-opt/include/LoKi/BasicFunctors.h:19:
In file included from /opt/lhcb/lhcb/LHCB/LHCB_v50r6/InstallArea/x86_64-centos7-gcc9-opt/include/LoKi/Functor.h:27:
In file included from /opt/lhcb/lhcb/LHCB/LHCB_v50r6/InstallArea/x86_64-centos7-gcc9-opt/include/LoKi/AuxFunBase.h:31:
In file included from /opt/lhcb/lhcb/LHCB/LHCB_v50r6/InstallArea/x86_64-centos7-gcc9-opt/include/LoKi/ILoKiSvc.h:19:
In file included from /opt/lhcb/lhcb/GAUDI/GAUDI_v32r2/InstallArea/x86_64-centos7-gcc9-opt/include/GaudiKernel/IIncidentListener.h:6:
In file included from /opt/lhcb/lhcb/GAUDI/GAUDI_v32r2/InstallArea/x86_64-centos7-gcc9-opt/include/GaudiKernel/Incident.h:5:
/opt/lhcb/lhcb/GAUDI/GAUDI_v32r2/InstallArea/x86_64-centos7-gcc9-opt/include/GaudiKernel/EventContext.h:5:10: fatal error: 'any' file not found
#include <any>
         ^~~~~

 *** Break *** segmentation violation
yipengsun commented 3 years ago

Here's the problematic file. Indeed it tries to include <any>.
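
For context, `<any>` is a C++17 standard library header, so this error suggests that whatever parsed the dictionary payload was not seeing a C++17-capable header set. A minimal sketch for pulling such missing-header diagnostics out of a long log follows; the function name and the regular expression are illustrative assumptions, not part of Castelao or Gaudi.

```python
import re

# Matches clang/cling-style diagnostics such as:
#   /path/to/EventContext.h:5:10: fatal error: 'any' file not found
MISSING_HEADER = re.compile(
    r"^(?P<file>\S+):(?P<line>\d+):(?P<col>\d+): fatal error: "
    r"'(?P<header>[^']+)' file not found"
)


def find_missing_headers(log_text):
    """Return (including_file, line_number, header) for each missing header."""
    hits = []
    for raw in log_text.splitlines():
        match = MISSING_HEADER.match(raw.strip())
        if match:
            hits.append(
                (match.group("file"), int(match.group("line")), match.group("header"))
            )
    return hits


# Two lines copied from the log above.
sample_log = (
    "In file included from /opt/lhcb/lhcb/GAUDI/GAUDI_v32r2/InstallArea/"
    "x86_64-centos7-gcc9-opt/include/GaudiKernel/Incident.h:5:\n"
    "/opt/lhcb/lhcb/GAUDI/GAUDI_v32r2/InstallArea/x86_64-centos7-gcc9-opt/"
    "include/GaudiKernel/EventContext.h:5:10: fatal error: 'any' file not found\n"
)

for path, line, header in find_missing_headers(sample_log):
    print("%s:%d needs <%s>" % (path, line, header))
```

The "In file included from" lines are deliberately ignored, so only the header that actually failed to resolve is reported.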

yipengsun commented 3 years ago

Well, the script gets another segmentation fault on lxplus:

run.py", start=start@entry=257, globals=globals@entry=0x7f33576c9168, locals=locals@entry=0x7f33576c9168, closeit=closeit@entry=1, flags=0x7ffc159f37cc) at /mnt/build/jenkins/workspace/lcg_release_tar/BUILDTYPE/Release/COMPILER/gcc9binutils/LABEL/centos7/build/externals/Python-2.7.16/src/Python/2.7.16/Python/pythonrun.c:1371
#49 0x00007f33572679ac in PyRun_SimpleFileExFlags (fp=fp@entry=0xe7c140, filename=0x7ffc159f5b13 "/cvmfs/lhcb.cern.ch/lib/lhcb/GAUDI/GAUDI_v32r2/InstallArea/x86_64-centos7-gcc9-opt/scripts/gaudirun.py", closeit=closeit@entry=1, flags=flags@entry=0x7ffc159f37cc) at /mnt/build/jenkins/workspace/lcg_release_tar/BUILDTYPE/Release/COMPILER/gcc9binutils/LABEL/centos7/build/externals/Python-2.7.16/src/Python/2.7.16/Python/pythonrun.c:957
#50 0x00007f335726805c in PyRun_AnyFileExFlags (fp=fp@entry=0xe7c140, filename=<optimized out>, closeit=closeit@entry=1, flags=flags@entry=0x7ffc159f37cc) at /mnt/build/jenkins/workspace/lcg_release_tar/BUILDTYPE/Release/COMPILER/gcc9binutils/LABEL/centos7/build/externals/Python-2.7.16/src/Python/2.7.16/Python/pythonrun.c:761
#51 0x00007f335727ac1f in Py_Main (argc=<optimized out>, argv=<optimized out>) at /mnt/build/jenkins/workspace/lcg_release_tar/BUILDTYPE/Release/COMPILER/gcc9binutils/LABEL/centos7/build/externals/Python-2.7.16/src/Python/2.7.16/Modules/main.c:641
#52 0x00007f335646f555 in __libc_start_main () from /lib64/libc.so.6
#53 0x00000000004006be in _start ()
yipengsun commented 3 years ago

The lxplus error seems to be due to a lack of sPlot table ntuples.

yipengsun commented 3 years ago

No. Still the same segmentation fault error.

yipengsun commented 3 years ago

Something in gaudirun.py causes a segmentation fault in glibc? This sounds terrible:

entry=1, flags=0x7ffc60d5e91c) at /mnt/build/jenkins/workspace/lcg_release_tar/BUILDTYPE/Release/COMPILER/gcc8binutils/LABEL/centos7/build/externals/Python-2.7.16/src/Python/2.7.16/Python/pythonrun.c:1371
#49 0x00007f6c8cdc1e23 in PyRun_SimpleFileExFlags (fp=fp@entry=0x1a7e2c0, filename=0x7ffc60d6097d "/cvmfs/lhcb.cern.ch/lib/lhcb/GAUDI/GAUDI_v32r2/InstallArea/x86_64+avx2+fma-centos7-gcc8-opt/scripts/gaudirun.py", closeit=closeit@entry=1, flags=flags@entry=0x7ffc60d5e91c) at /mnt/build/jenkins/workspace/lcg_release_tar/BUILDTYPE/Release/COMPILER/gcc8binutils/LABEL/centos7/build/externals/Python-2.7.16/src/Python/2.7.16/Python/pythonrun.c:957
#50 0x00007f6c8cdc2493 in PyRun_AnyFileExFlags (fp=fp@entry=0x1a7e2c0, filename=<optimized out>, closeit=closeit@entry=1, flags=flags@entry=0x7ffc60d5e91c) at /mnt/build/jenkins/workspace/lcg_release_tar/BUILDTYPE/Release/COMPILER/gcc8binutils/LABEL/centos7/build/externals/Python-2.7.16/src/Python/2.7.16/Python/pythonrun.c:761
#51 0x00007f6c8cdd5ada in Py_Main (argc=<optimized out>, argv=<optimized out>) at /mnt/build/jenkins/workspace/lcg_release_tar/BUILDTYPE/Release/COMPILER/gcc8binutils/LABEL/centos7/build/externals/Python-2.7.16/src/Python/2.7.16/Modules/main.c:641
#52 0x00007f6c8bfcb555 in __libc_start_main () from /lib64/libc.so.6
#53 0x00000000004006be in _start ()
yipengsun commented 3 years ago

I need to find a more basic Castelao script to run.

yipengsun commented 3 years ago

Svende gave me a very useful twiki, where the instructions on how to produce various types of PID ntuples are listed.

She also told me that a liaison pointed out a script that I should run, and here's the liaison's instruction:

I guess you can go through the production of PID tuples and comment out all the samples that you don't need in the config files. The list of tuples should be here

My debugging process

My initial understanding was that I should use that script, comment out the samples that we don't need, and feed in the Turbo dst input to produce ntuples. Initially the whole script ran, but no output was generated. That's when I asserted that "I just need to figure out how to feed in the input".

Indeed, at the end of that script, I see something like this:

#below for local test

#TupleFile = 'PID_modesL.root'
#DataType = '2012'
#InputType = 'MDST'
#Simulation = False
#Lumi = True
#EvtMax = -1
#Stream = 'PID'
#tesFormat = "/Event/<stream>/Phys/<line>/Particles"
#dv = DaVinci()
#dv.DataType   = DataType
#dv.InputType  = InputType
#dv.EvtMax = EvtMax
#dv.TupleFile = TupleFile

#if InputType == 'MDST':
#    rootInTes = "/Event/Strip"
##    uDstConf ( rootInTes )

#dv.Simulation = Simulation
#dv.Lumi       = Lumi

#tesFormat = tesFormat.replace('<stream>', Stream)
#dv.UserAlgorithms = parseConfiguration(tupleConfiguration, tesFormat)

And I tried to change it to something like this:


tesFormat = tesFormat.replace('<stream>', Stream)
dv.UserAlgorithms = parseConfiguration_Run2(tupleConfiguration, tesFormat, "<path_to_sweight>", None, None)

But I got an error saying that ApplySWeight doesn't accept sTableFile. At this point, I think that this script might not have been updated to work with the latest Castelao.

So I went back to the twiki and tried to follow the "Production of Ntuples and microDSTs (local test)" section (and the "Setup Instructions" section).

I realized that Castelao/v10r0 is for run 3, so we should use something for run 2, that is, the Castelao/v3r6 (latest v3) branch.

Then I followed the twiki verbatim, and still got a segmentation fault (I'll attach the log in a later post).

Proposed solution

I should contact the liaison myself and describe what we want to do (change the TupleToolPid option to verbose to add the isMuonTight branch for Greg/Phoebe's Muon BDT input) and the problem I have encountered. I want to make sure that what I did is correct and see if we have a bug in Castelao.

I assume Castelao/v3 must work with some version of gcc, but apparently not with the latest gcc9 on lxplus. Still, Castelao/v3r6 is officially provided on this gcc9 platform, so this might be a bug.
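
To confirm which compiler build a given run actually picked up, the platform tag can be read off the InstallArea paths in the log. The sketch below assumes the tag layout seen in the logs above (architecture with optional `+` extensions, OS, compiler, build type); the function name and regular expression are illustrative only.

```python
import re

# Platform tag as it appears in the logs above, e.g.
#   x86_64-centos7-gcc9-opt  or  x86_64+avx2+fma-centos7-gcc8-opt
PLATFORM = re.compile(
    r"InstallArea/(?P<platform>[\w+]+-\w+-(?P<compiler>gcc\d+|clang\d+)-\w+)/"
)


def platforms_in_log(log_text):
    """Return the set of (platform, compiler) pairs found in InstallArea paths."""
    return {
        (m.group("platform"), m.group("compiler"))
        for m in PLATFORM.finditer(log_text)
    }


# The two gaudirun.py paths from the stack traces in this thread.
sample_log = (
    "/cvmfs/lhcb.cern.ch/lib/lhcb/GAUDI/GAUDI_v32r2/InstallArea/"
    "x86_64-centos7-gcc9-opt/scripts/gaudirun.py\n"
    "/cvmfs/lhcb.cern.ch/lib/lhcb/GAUDI/GAUDI_v32r2/InstallArea/"
    "x86_64+avx2+fma-centos7-gcc8-opt/scripts/gaudirun.py\n"
)

for platform, compiler in sorted(platforms_in_log(sample_log)):
    print("%s -> %s" % (platform, compiler))
```

Running this over each log makes it easy to see whether a segfault was reproduced on more than one compiler build.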

WDYT @afernez @manuelfs @Svende

yipengsun commented 3 years ago

Here's Castelao log: castelao_2015_pid_tuple_prod.log

BTW, we should only need Muon calibration samples (as described in the ANA note in Appx. C), so we can comment out all PIDCalib samples except the J/Psi -> mu+ mu- samples. Do you agree with my assessment here?

manuelfs commented 3 years ago

Did you try running the script in Castelao v3? I think contacting the liaison makes sense in any case.

And no, we'll need samples for the other species too in order to find the misID rates.

Svende commented 3 years ago

Yipeng, I told you all last week in the meeting that it wasn't the liaison I contacted because of this problem but the former convener of the Run1-2 WG, Martino Borsato. (I also wrote the same to you on Slack on Sunday.) Both Anton and Martino are NOT liaisons, so I am not sure that they will answer your email. The WG liaisons are listed here: https://twiki.cern.ch/twiki/bin/view/LHCbPhysics/LHCbWGLiaisons. Back in December I told you which liaison to contact. Quoting my Slack message to you from back then: 'Hi Yipeng, I looked a bit into castelao, it seems that the maintainer is Carlos Vazquez Sierra from Nikhef, I guess you could contact him for instructions or contact the SL Run1-2 Performance Liaison: Veronica Kirsebom. Maybe she has more info on that but it could be that she will rather forward your email to the conveners: Vitalii & Michael'

Svende commented 3 years ago

I just chatted with Martino, he says: 'yes it's possible it is due to gcc, I remember I had to change platform at some point. It was a long time ago for me, maybe a better bet is contacting Vitalii.' So I guess it's best if you resend your email directly to the conveners, Vitalii & Michael.

yipengsun commented 3 years ago

I double-checked the instructions on the twiki with Castelao/v3r4 and gcc8, and it still segfaults.

I assume this is just because the twiki instructions are outdated, as the instructions sent by Vitalii work. No further work is planned on this.