JeffersonLab / halld_recon

Reconstruction for the GlueX Detector

Data corruption in 2019-11 data #387

Open sdobbs opened 4 years ago

sdobbs commented 4 years ago

In a recent calibration run, I found the following errors (the first one is the most concerning):

run 71757, file 001: JANA ERROR>>Slot from SSP/DIRC event header does not match slot from last block header (18 != 3)

run 71912, file 001: corrupted on cache, submitted

run 72188, file 001: segfault as listed below (currently trying to confirm that it's reproducible):

===========================================================
#6  0x00002b6f764b52ad in ?? () from /lib64/libstdc++.so.6
#7  0x00000000006e34dc in ~basic_string (this=0x2b6faef42fa0, __in_chrg=) at /usr/include/c++/4.8.2/bits/basic_string.h:539
#8  jana::JEventLoop::GetFromFactory (this=0x2b6fd00008c0, t=std::vector of length -647603552683825760, capacity 550456894365317400 = {...}, tag=0x1138e0d "", data_source=@0x2b6faef430b0: jana::JEventLoop::DATA_NOT_AVAILABLE, allow_deftag=) at /u/group/halld/Software/builds/Linux_CentOS7.7-x86_64-gcc4.8.5/jana/jana_0.7.9p1^ccdb166/Linux_CentOS7.7-x86_64-gcc4.8.5/include/JANA/JEventLoop.h:456
#9  0x0000000000000001 in ?? ()
#10 0x00000000006e3734 in jana::JEventLoop::Get (this=0x2b6faef43040, this@entry=0x2b6fd00008c0, t=std::vector of length 0, capacity 0, tag=0x2b6faef43080 "8003vvo+", tag@entry=0x1138e0d "", allow_deftag=allow_deftag@entry=true) at /u/group/halld/Software/builds/Linux_CentOS7.7-x86_64-gcc4.8.5/jana/jana_0.7.9p1^ccdb166/Linux_CentOS7.7-x86_64-gcc4.8.5/include/JANA/JEventLoop.h:309
#11 0x00000000006fa1e5 in jana::JEventLoop::GetSingle (this=this@entry=0x2b6fd00008c0, t=@0x2b6faef431b0: 0x0, tag=tag@entry=0x1138e0d "", exception_if_not_one=exception_if_not_one@entry=true) at /u/group/halld/Software/builds/Linux_CentOS7.7-x86_64-gcc4.8.5/jana/jana_0.7.9p1^ccdb166/Linux_CentOS7.7-x86_64-gcc4.8.5/include/JANA/JEventLoop.h:239
#12 0x0000000000772717 in DHistogramAction_EventVertex::Perform_Action (this=0x3b8e6f8, locEventLoop=0x2b6fd00008c0, locParticleCombo=) at libraries/ANALYSIS/DHistogramActions_Independent.cc:2770
#13 0x00002b6f90fa07ce in operator() (locEventLoop=0x2b6fd00008c0, this=0x3b8e6f8) at libraries/ANALYSIS/DAnalysisAction.h:125
#14 DEventProcessor_monitoring_hists::evnt (this=0x3b8cb30, locEventLoop=0x2b6fd00008c0, eventnumber=) at plugins/Analysis/monitoring_hists/DEventProcessor_monitoring_hists.cc:130
#15 0x00000000010d8efa in jana::JEventLoop::OneEvent (this=this@entry=0x2b6fd00008c0) at src/JANA/JEventLoop.cc:693
#16 0x00000000010d9e94 in jana::JEventLoop::Loop (this=this@entry=0x2b6fd00008c0) at src/JANA/JEventLoop.cc:496
#17 0x00000000010b114a in LaunchThread (arg=0x7ffe98bb8b70) at src/JANA/JApplication.cc:1382
#18 0x00002b6f76041e65 in start_thread () from /lib64/libpthread.so.0
#19 0x00002b6f76d7788d in clone () from /lib64/libc.so.6
===========================================================
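To see quickly which frame is the first one inside the reconstruction sources, a backtrace like the one above can be scanned with a small script. This is only a sketch: the names `Frame`, `parse_frames`, and `first_frame_under` are made up for illustration, it assumes each frame sits on a single line, and it treats paths starting with `libraries/` or `plugins/` as reconstruction code, which is an assumption based on the paths in this trace.

```python
import re
from typing import List, NamedTuple, Optional


class Frame(NamedTuple):
    number: int
    function: str
    file: str
    line: int


# Matches frames that end in "at <file>:<line>". The leading "#" and the
# "0x... in" address are optional because both can be lost when pasting.
FRAME_RE = re.compile(
    r"^#?(?P<num>\d+)\s+"
    r"(?:0x[0-9a-fA-F]+\s+in\s+)?"
    r"(?P<func>\S+)\s+\("
    r".*\)\s+at\s+(?P<file>\S+):(?P<line>\d+)$"
)


def parse_frames(trace: str) -> List[Frame]:
    """Return every frame that carries source file and line information."""
    frames = []
    for raw in trace.splitlines():
        m = FRAME_RE.match(raw.strip())
        if m:
            frames.append(
                Frame(int(m["num"]), m["func"], m["file"], int(m["line"]))
            )
    return frames


def first_frame_under(
    frames: List[Frame], prefixes=("libraries/", "plugins/")
) -> Optional[Frame]:
    """Return the innermost frame whose file path starts with a given prefix."""
    for frame in frames:
        if frame.file.startswith(prefixes):
            return frame
    return None


TRACE = """\
#6  0x00002b6f764b52ad in ?? () from /lib64/libstdc++.so.6
#7  0x00000000006e34dc in ~basic_string (this=0x2b6faef42fa0, __in_chrg=) at /usr/include/c++/4.8.2/bits/basic_string.h:539
#12 0x0000000000772717 in DHistogramAction_EventVertex::Perform_Action (this=0x3b8e6f8, locEventLoop=0x2b6fd00008c0, locParticleCombo=) at libraries/ANALYSIS/DHistogramActions_Independent.cc:2770
#13 0x00002b6f90fa07ce in operator() (locEventLoop=0x2b6fd00008c0, this=0x3b8e6f8) at libraries/ANALYSIS/DAnalysisAction.h:125
"""

hit = first_frame_under(parse_frames(TRACE))
if hit is not None:
    print(f"#{hit.number} {hit.function} at {hit.file}:{hit.line}")
```

For this trace the script points at frame 12, `DHistogramAction_EventVertex::Perform_Action`, the innermost frame in the analysis library before the call goes into JANA.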

zihlmann commented 4 years ago

Which event is that? With hd_dump I see no issues in the first 30 events.

zihlmann commented 4 years ago

hdview2 crashes because it looks for TOF tables in the wrong directory (TOF instead of TOF2)!

JANA >>In DTOFHit_factory, loading constants...
TOF: USE WALK CORRECTION TYPE 4
Error [1270]: in [MySQLDataProvider::GetAssignmentShort(int, const string&, time_t, const string&)] No data was selected. Table '/TOF/timewalk_parms_5PAR' for run='71757', timestampt='0' and variation='default'
JANA >>Error loading TOF/timewalk_parms_5PAR !
Error [1270]: in [MySQLDataProvider::GetAssignmentShort(int, const string&, time_t, const string&)] No data was selected. Table '/TOF/timing_offsets_5PAR' for run='71757', timestampt='0' and variation='default'
JANA >>Error loading TOF/timing_offsets_5PAR !
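When a log contains many of these messages, the missing tables can be pulled out with a regular expression. The sketch below is illustrative only: `MissingTable` and `find_missing_tables` are hypothetical names, and the pattern is written against the exact wording of the two messages above (including the `timestampt` spelling), so it may not match other CCDB versions.

```python
import re
from typing import List, NamedTuple


class MissingTable(NamedTuple):
    table: str
    run: int
    variation: str


# Pattern copied from the wording of the error messages in this thread.
MISSING_RE = re.compile(
    r"No data was selected\. Table '(?P<table>[^']+)' "
    r"for run='(?P<run>\d+)', timestampt='[^']*' "
    r"and variation='(?P<variation>[^']*)'"
)


def find_missing_tables(log: str) -> List[MissingTable]:
    """Collect (table, run, variation) for every 'No data was selected' error."""
    return [
        MissingTable(m["table"], int(m["run"]), m["variation"])
        for m in MISSING_RE.finditer(log)
    ]


LOG = (
    "Error [1270]: in [MySQLDataProvider::GetAssignmentShort(int, const string&, "
    "time_t, const string&)] No data was selected. Table '/TOF/timewalk_parms_5PAR' "
    "for run='71757', timestampt='0' and variation='default' "
    "JANA >>Error loading TOF/timewalk_parms_5PAR ! "
    "Error [1270]: in [MySQLDataProvider::GetAssignmentShort(int, const string&, "
    "time_t, const string&)] No data was selected. Table '/TOF/timing_offsets_5PAR' "
    "for run='71757', timestampt='0' and variation='default'"
)

for entry in find_missing_tables(LOG):
    print(entry.run, entry.variation, entry.table)
```

Because the pattern is applied with `finditer` over the whole text, it works whether the messages arrive one per line or run together on a single line.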

sdobbs commented 4 years ago

It’s pretty far in. I still need to extract the actual event number.

zihlmann commented 4 years ago

Something changed with CCDB.

On Friday I ran run 70969 and got this output at the start: TOF: USE WALK CORRECTION TYPE 4

Just now, doing it again, I get: TOF: USE WALK CORRECTION TYPE 3

This is just wrong! I always use: CCDB_CONNECTION=mysql://ccdb_user@hallddb.jlab.org/ccdb
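A change like this is easy to miss in a long log, so one option is to compare the reported walk correction type between two saved logs of the same run. The following is a minimal sketch under that assumption; `walk_correction_type` and `same_walk_correction` are hypothetical helper names, and the pattern relies only on the startup line quoted above.

```python
import re
from typing import Optional

# Startup line as printed in the logs quoted in this thread.
WALK_RE = re.compile(r"TOF: USE WALK CORRECTION TYPE (\d+)")


def walk_correction_type(log: str) -> Optional[int]:
    """Return the first walk correction type reported in a log, or None."""
    m = WALK_RE.search(log)
    return int(m.group(1)) if m else None


def same_walk_correction(log_a: str, log_b: str) -> bool:
    """True only if both logs report a type and the two types agree."""
    a = walk_correction_type(log_a)
    b = walk_correction_type(log_b)
    return a is not None and a == b


friday = "JANA >>In DTOFHit_factory, loading constants... TOF: USE WALK CORRECTION TYPE 4"
today = "JANA >>In DTOFHit_factory, loading constants... TOF: USE WALK CORRECTION TYPE 3"

if not same_walk_correction(friday, today):
    print(
        "walk correction type changed:",
        walk_correction_type(friday),
        "->",
        walk_correction_type(today),
    )
```

Treating a missing line as a mismatch is a deliberate choice here: a log that never prints the type is as suspicious as one that prints a different value.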

On 6/1/20 6:46 PM, Sean Dobbs wrote:

It’s pretty far in. Still need to extract the actual number

On Mon, Jun 1, 2020 at 6:38 PM zihlmann notifications@github.com wrote:

which event is that? with hd_dump I see not issues in the first 30 events.

On 6/1/20 5:55 PM, Sean Dobbs wrote:

In a recent calibration run, I found the following errors: (first one is most concerning)

run 71757, file 001: JANA ERROR>>Slot from SSP/DIRC event header does not match slot from last block header (18 != 3)

run 71912, file 001: corrupted on cache, submitted

run 72188, file 001: segfault as listed below (currently trying to confirm that it's reproducable):

===========================================================

6

<

https://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_JeffersonLab_halld-5Frecon_pull_6&d=DwMCaQ&c=CJqEzB1piLOyyvZjb8YUQw&r=Hy7ijcc6pcMoP-QxZxtQH4-vodW_VGkrA9xiBc7InXk&m=Fcd7j2QmJNiskPIoDQEGRDidzsXwxW_vgOp1xSIYky4&s=y3IGoZbn07FW_IPyC6Pa0c9Leo7WrMwyiVmSasWw6V8&e=>

0x00002b6f764b52ad in ?? () from /lib64/libstdc++.so.6

7

<

https://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_JeffersonLab_halld-5Frecon_pull_7&d=DwMCaQ&c=CJqEzB1piLOyyvZjb8YUQw&r=Hy7ijcc6pcMoP-QxZxtQH4-vodW_VGkrA9xiBc7InXk&m=Fcd7j2QmJNiskPIoDQEGRDidzsXwxW_vgOp1xSIYky4&s=gukNVwRpY8h4zjfd8W8czGXN_idtJ2Y_rRD6ED9do-Y&e=>

0x00000000006e34dc in ~basic_string (this=0x2b6faef42fa0, __in_chrg=) at /usr/include/c++/4.8.2/bits/basic_string.h:539

8

<

https://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_JeffersonLab_halld-5Frecon_pull_8&d=DwMCaQ&c=CJqEzB1piLOyyvZjb8YUQw&r=Hy7ijcc6pcMoP-QxZxtQH4-vodW_VGkrA9xiBc7InXk&m=Fcd7j2QmJNiskPIoDQEGRDidzsXwxW_vgOp1xSIYky4&s=nttn0K8fpwaIX7d5dxh1R_WaFvYU9CX2u7DOMtGyps4&e=>

jana::JEventLoop::GetFromFactory (this=0x2b6fd00008c0, t=std::vector of length -647603552683825760, capacity 550456894365317400 = {...}, tag=0x1138e0d "", data_source= 0x2b6faef430b0: jana::JEventLoop::DATA_NOT_AVAILABLE, allow_deftag=) at

/u/group/halld/Software/builds/Linux_CentOS7.7-x86_64-gcc4.8.5/jana/jana_0.7.9p1^ccdb166/Linux_CentOS7.7-x86_64-gcc4.8.5/include/JANA/JEventLoop.h:456

9

<

https://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_JeffersonLab_halld-5Frecon_pull_9&d=DwMCaQ&c=CJqEzB1piLOyyvZjb8YUQw&r=Hy7ijcc6pcMoP-QxZxtQH4-vodW_VGkrA9xiBc7InXk&m=Fcd7j2QmJNiskPIoDQEGRDidzsXwxW_vgOp1xSIYky4&s=-0tFxJFVOBdAFwjGgLa7kAkKFxFIWWkILHlwGnydGZo&e=>

0x0000000000000001 in ?? ()

10

<

https://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_JeffersonLab_halld-5Frecon_pull_10&d=DwMCaQ&c=CJqEzB1piLOyyvZjb8YUQw&r=Hy7ijcc6pcMoP-QxZxtQH4-vodW_VGkrA9xiBc7InXk&m=Fcd7j2QmJNiskPIoDQEGRDidzsXwxW_vgOp1xSIYky4&s=MClg1N0R4oBxChEkUMFCNSlpZy8X_tmRoHTRFjhqCxs&e=>

0x00000000006e3734 in jana::JEventLoop::Get (this=0x2b6faef43040, this entry=0x2b6fd00008c0, t=std::vector of length 0, capacity 0, tag=0x2b6faef43080 "8003vvo+", tag entry=0x1138e0d "", allow_deftag=allow_deftag entry=true) at

/u/group/halld/Software/builds/Linux_CentOS7.7-x86_64-gcc4.8.5/jana/jana_0.7.9p1^ccdb166/Linux_CentOS7.7-x86_64-gcc4.8.5/include/JANA/JEventLoop.h:309

11

<

0x00000000006fa1e5 in jana::JEventLoop::GetSingle (this=this@entry=0x2b6fd00008c0, t=@0x2b6faef431b0: 0x0, tag=tag@entry=0x1138e0d "", exception_if_not_one=exception_if_not_one@entry=true) at /u/group/halld/Software/builds/Linux_CentOS7.7-x86_64-gcc4.8.5/jana/jana_0.7.9p1^ccdb166/Linux_CentOS7.7-x86_64-gcc4.8.5/include/JANA/JEventLoop.h:239

12 0x0000000000772717 in DHistogramAction_EventVertex::Perform_Action (this=0x3b8e6f8, locEventLoop=0x2b6fd00008c0, locParticleCombo=) at libraries/ANALYSIS/DHistogramActions_Independent.cc:2770

13 0x00002b6f90fa07ce in operator() (locEventLoop=0x2b6fd00008c0, this=0x3b8e6f8) at libraries/ANALYSIS/DAnalysisAction.h:125

14 DEventProcessor_monitoring_hists::evnt (this=0x3b8cb30, locEventLoop=0x2b6fd00008c0, eventnumber=) at plugins/Analysis/monitoring_hists/DEventProcessor_monitoring_hists.cc:130

15 0x00000000010d8efa in jana::JEventLoop::OneEvent (this=this@entry=0x2b6fd00008c0) at src/JANA/JEventLoop.cc:693

16 0x00000000010d9e94 in jana::JEventLoop::Loop (this=this@entry=0x2b6fd00008c0) at src/JANA/JEventLoop.cc:496

17 0x00000000010b114a in LaunchThread (arg=0x7ffe98bb8b70) at src/JANA/JApplication.cc:1382

18 0x00002b6f76041e65 in start_thread () from /lib64/libpthread.so.0

19 0x00002b6f76d7788d in clone () from /lib64/libc.so.6

===========================================================

sdobbs commented 4 years ago

Which event is that? With hd_dump I see no issues in the first 30 events.

It's within the first 250 events, though it doesn't seem to actually cause the crash.

sdobbs commented 4 years ago

Beni, I don't see a problem with the walk correction type, but do you see the following lines when running hdview2:

JANA >>Generated via: JCalibration using CCDB for MySQL and SQLite databases
JANA >>Run:9999
JANA >>URL: mysql://ccdb_user@hallddb.jlab.org/ccdb
JANA >>context: default
JANA >>comment: Default constants for analyzing data
JANA >>Creating DGeometry:
JANA >> Run requested:9999 found:9999
JANA >> Run validity range: 9999-9999
JANA >> URL="ccdb:///GEOMETRY/main_HDDS.xml" context="default"
JANA >> Type="JGeometryXML"
JANA >>Found 25 material maps in calib. DB
JANA >>Read in 25 material maps for run 9999 containing 76153 grid points total

zihlmann commented 4 years ago

Yes, see the output below; I still get the wrong TOF directory!

JANA >>Opening source "/cache/halld/RunPeriod-2019-11/rawdata/Run070969/hd_rawdata_070969_002.evio" of type: EVIOpp  - Reads EVIO formatted data from file or ET system
JANA >>Created JCalibration object of type: JCalibrationCCDB
JANA >>Generated via: JCalibration using CCDB for MySQL and SQLite databases
JANA >>Run:9999
JANA >>URL: mysql://ccdb_user@hallddb.jlab.org/ccdb
JANA >>context: default
JANA >>comment: Default constants for analyzing data
loading VERSION 3
JANA >>Creating DGeometry:
JANA >>  Run requested:9999  found:9999
JANA >>  Run validity range: 9999-9999
JANA >>  URL="ccdb:///GEOMETRY/main_HDDS.xml" context="default"
JANA >>  Type="JGeometryXML"
JANA >>Found 25 material maps in calib. DB
JANA >>Read in 25 material maps for run 9999 containing 76153 grid points total
src/JANA/JGeometryXML.cc:348 Node or attribute not found for xpath "//composition[@name='forwardTOF_bottom3']/mposY[@volume='FTOL']/@ncopy".
src/JANA/JGeometryXML.cc:348 Node or attribute not found for xpath "//composition[@name='forwardTOF_top3']/mposY[@volume='FTOL']/@ncopy".
src/JANA/JGeometryXML.cc:348 Node or attribute not found for xpath "//composition[@name='forwardTOF_bottom3']/mposY[@volume='FTOL']/@ncopy".
src/JANA/JGeometryXML.cc:348 Node or attribute not found for xpath "//composition[@name='forwardTOF_top3']/mposY[@volume='FTOL']/@ncopy".
Reading preferences from "/home/zihlmann/.hdview2" ...
JANA >>Created JCalibration object of type: JCalibrationCCDB
JANA >>Generated via: JCalibration using CCDB for MySQL and SQLite databases
JANA >>Run:70969
JANA >>URL: mysql://ccdb_user@hallddb.jlab.org/ccdb
JANA >>context: default
JANA >>comment: Default constants for analyzing data
JANA >>Reading Magnetic field map from Magnets/Solenoid/solenoid_1350A_poisson_20160222 ...
Nx=221 Ny=1 Nz=701 )  at 0x3c9a990
Reading fine-mesh B-field data from /group/halld/www/halldweb/html/resources/Magnets/Solenoid/finemeshes/solenoid_1350A_poisson_20160222
rmin: 0 rmax: 88.5 dr: 0.1 zmin: 0 zmax: 600 dz: 0.1
Number of points in z = 6000
Number of points in r = 885
JANA >>154921 entries found (Created Magnetic field map of type DMagneticFieldMapFineMesh
JANA >>Creating DTranslationTable for run 70969
JANA >>Reading translation table from calib DB: Translation/DAQ2detector ...
JANA >>39752 channels defined in translation table
----------- New Event 0  (run 70969) -------------
JANA >>In DSCHit_factory, loading constants...
src/JANA/JGeometryXML.cc:348 Node or attribute not found for xpath "//composition[@name='forwardTOF_bottom3']/mposY[@volume='FTOL']/@ncopy".
src/JANA/JGeometryXML.cc:348 Node or attribute not found for xpath "//composition[@name='forwardTOF_top3']/mposY[@volume='FTOL']/@ncopy".
src/JANA/JGeometryXML.cc:348 Node or attribute not found for xpath "//composition[@name='forwardTOF_bottom3']/mposY[@volume='FTOL']/@ncopy".
src/JANA/JGeometryXML.cc:348 Node or attribute not found for xpath "//composition[@name='forwardTOF_top3']/mposY[@volume='FTOL']/@ncopy".
Created TGeoManager :0xb434a80
src/JANA/JGeometryXML.cc:348 Node or attribute not found for xpath "//section/composition/posXYZ[@volume='DIRC']/@X_Y_Z".
libraries/HDGEOMETRY/DGeometry.cc:1843 Unable to retrieve DIRC position.
src/JANA/JGeometryXML.cc:348 Node or attribute not found for xpath "//section/composition/posXYZ[@volume='TRDGEM']/@X_Y_Z".
libraries/HDGEOMETRY/DGeometry.cc:1869 Unable to retrieve TRD position.
JANA >> Beam spot: x=0.200005 y=-0.017351 z=65 dx/dz=-0.000684869 dy/dz=0.000356484
JANA >>vertex constraint:
JANA >>In DTOFHit_factory, loading constants...
TOF: USE WALK CORRECTION TYPE 3
JANA >>Reading DIRC LUT TTree from /group/halld/www/halldweb/html/resources//DIRC/LUT/lut_production_2020_v01.root ...
JANA >>In DCDCHit_factory, loading constants...
JANA >>In DFDCHit_factory, loading constants...
src/JANA/JGeometryXML.cc:348 Node or attribute not found for xpath "//section/composition/posXYZ[@volume='DIRC']/@X_Y_Z".
libraries/HDGEOMETRY/DGeometry.cc:1843 Unable to retrieve DIRC position.
JANA >>In DBCALHit_factory, loading constants...
JANA >>In DFCALHit_factory, loading constants...
libraries/HDGEOMETRY/DGeometry.cc:1812 FCAL position: (x,y,z)=(0.529,-0.002,624.906)
Error [1270]: in [MySQLDataProvider::GetAssignmentShort(int, const string&, time_t, const string&)] No data was selected. Table '/FCAL/energy_dependence_correction_vs_ring' for run='70969', timestampt='0' and variation='default'

aaust commented 4 years ago

Look at the event number. It's 9999!

sdobbs commented 4 years ago

The run number, but yeah, this is a known issue with how certain parts of the application are initialized. I'm working on a workaround...

sdobbs commented 4 years ago

Note that you can specify a particular run number, like (in this case) -PRUNNUMBER=70969
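To make the run-number override less error-prone, here is a minimal Python sketch that reads the run number out of a raw data file name and builds the matching `-PRUNNUMBER=` argument. The naming pattern `hd_rawdata_<run>_<file>.evio` is assumed from the paths quoted in this thread, and the helper names are hypothetical.

```python
import re
from pathlib import Path

# Assumed naming convention, taken from the paths quoted in this thread:
# hd_rawdata_<6-digit run>_<3-digit file>.evio
RAWDATA_NAME = re.compile(r"hd_rawdata_(\d{6})_(\d{3})\.evio$")


def run_number_from_path(path):
    """Return the run number encoded in a raw data file name as an int."""
    match = RAWDATA_NAME.search(Path(path).name)
    if match is None:
        raise ValueError(f"not a recognised raw data file name: {path}")
    return int(match.group(1))


def command_with_run_number(program, path):
    """Build an argument list that pins the run number explicitly."""
    run = run_number_from_path(path)
    return [program, f"-PRUNNUMBER={run}", str(path)]
```

For the file discussed above, `command_with_run_number("hdview2", ".../hd_rawdata_070969_002.evio")` yields an argument list containing `-PRUNNUMBER=70969`, which can be passed to `subprocess.run`.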

zihlmann commented 4 years ago

This 9999 stuff was not present when I ran on Friday. Today I updated to the latest code and now I get this 9999 crap.

sdobbs commented 4 years ago

For run 71757, file 001, after ~1.5k events I get the following stack trace (clearly not 100% reproducible):

===========================================================

6 DEVIOWorkerThread::ParseSSPBank (this=this@entry=0x7f3734004820, rocid=rocid@entry=92, iptr=@0x7f37317f32f8: 0x7f36bc00e324, iend=iend@entry=0x7f36bc016684) at /usr/include/c++/4.8.2/bits/stl_vector.h:734

7 0x0000000000f7b67c in DEVIOWorkerThread::ParseDataBank (this=this@entry=0x7f3734004820, iptr=@0x7f37317f32f8: 0x7f36bc00e324, iend=iend@entry=0x7f36bc016684) at libraries/DAQ/DEVIOWorkerThread.cc:1009

8 0x0000000000f7b917 in DEVIOWorkerThread::ParsePhysicsBank (this=this@entry=0x7f3734004820, iptr=@0x7f37317f32f8: 0x7f36bc00e324, iend=iend@entry=0x7f36bc089af8) at libraries/DAQ/DEVIOWorkerThread.cc:757

9 0x0000000000f7d1e0 in DEVIOWorkerThread::ParseBank (this=this@entry=0x7f3734004820) at libraries/DAQ/DEVIOWorkerThread.cc:352

10 0x0000000000f7e098 in DEVIOWorkerThread::MakeEvents (this=this@entry=0x7f3734004820) at libraries/DAQ/DEVIOWorkerThread.cc:271

11 0x0000000000f7f74a in DEVIOWorkerThread::Run (this=0x7f3734004820) at libraries/DAQ/DEVIOWorkerThread.cc:111

12 0x00007f37547f8070 in ?? () from /lib64/libstdc++.so.6

13 0x00007f3754c55e65 in start_thread () from /lib64/libpthread.so.0

14 0x00007f3753f5b88d in clone () from /lib64/libc.so.6

===========================================================

zihlmann commented 4 years ago

Ignore my comments above; they are not related to the issue at hand. I do see the same error message when using hd_root with the monitoring_hists plugin, but hd_root does not crash.

Here is the error message, as already shown at the start of the issue:

JANA ERROR>>Slot from SSP/DIRC event header does not match slot from last block header (18 != 3)
ERROR: request for channel number of inactive block! row 28 column 28

However, hd_root does not crash after the message and continues past 1.5k events using the monitoring_hists plugin. This is with file /cache/halld/RunPeriod-2019-11/rawdata/Run071757/hd_rawdata_071757_001.evio

hd_root was compiled from the latest branch content on GitHub. It is currently running past 2.8k events with no crash.

sdobbs commented 4 years ago

Yeah, I tried four times interactively and got the crash shown above once.
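Since the crash is intermittent, it may help to quantify how often it happens. The following is a minimal Python sketch (not part of halld_recon) that runs an arbitrary command several times and tallies how each run ended. The function names are hypothetical, and the mapping of negative return codes to signals relies on the POSIX behaviour of `subprocess`.

```python
import signal
import subprocess
import sys
from collections import Counter


def classify(returncode):
    """Name the outcome of one run from its exit status."""
    if returncode == 0:
        return "ok"
    if returncode < 0:
        # On POSIX, subprocess reports death by signal N as -N.
        try:
            return signal.Signals(-returncode).name
        except ValueError:
            return f"signal {-returncode}"
    return f"exit {returncode}"


def repeat_runs(command, attempts):
    """Run `command` `attempts` times and tally the outcomes."""
    outcomes = Counter()
    for _ in range(attempts):
        result = subprocess.run(
            command,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        outcomes[classify(result.returncode)] += 1
    return outcomes
```

A call such as `repeat_runs(cmd, 4)`, where `cmd` is the argument list of the job being tested, would report something like one `SIGSEGV` and three `ok` for the behaviour described above.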

jrstevenjlab commented 4 years ago

Looking back at the monitoring launches: this SSP message appeared for the same file during the ver01, ver08 and ver12 monitoring launches. In all three cases the monitoring histograms were produced and from a first look I don't see problems with the DIRC data in that file or the file directly after it (i.e. hits are correlated in time with other detectors/tracks and the photon yield is roughly as expected).

zihlmann commented 4 years ago

Running 5 times single-threaded and 5 times multi-threaded (6 threads), I do NOT see a crash; it always runs way past 1.5k events.

The two errors

JANA ERROR>>Slot from SSP/DIRC event header does not match slot from last block header (18 != 3)
ERROR: request for channel number of inactive block! row 28 column 28

are NOT from the same event. The first error is related to event 177 or thereabouts, while the second error is related to event 1136 or a little later. The actual event numbers are most likely a little higher, because the reader is always a little ahead of the analyzer.
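One way to pin down roughly where each message appears is to pair every error line in a saved log with the most recent progress counters. This is a minimal Python sketch, assuming the ticker format "N events processed (M events read)" seen in output quoted in this thread; the function name is hypothetical, and the counters only bracket the event, since the reader runs ahead of the analyzer.

```python
import re

# Ticker format assumed from output quoted in this thread, e.g.
# "358.0 events processed (368.0 events read) 30.0Hz (avg.: 21.0Hz)"
PROGRESS = re.compile(
    r"(\d+(?:\.\d+)?) events processed \((\d+(?:\.\d+)?) events read\)"
)


def locate_errors(log_text):
    """Pair each line containing 'ERROR' with the latest progress counters.

    Returns a list of (message, processed, read) tuples. The counters are
    None if no progress line was seen before the error.
    """
    processed = read = None
    found = []
    # The ticker is usually redrawn with carriage returns, so split on both.
    for line in re.split(r"[\r\n]+", log_text):
        progress = PROGRESS.search(line)
        if progress:
            processed = int(float(progress.group(1)))
            read = int(float(progress.group(2)))
        elif "ERROR" in line:
            found.append((line.strip(), processed, read))
    return found
```

Each returned tuple gives the message together with the "processed" and "read" counts last printed before it, which is usually enough to choose a starting point for a closer look with hd_dump.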

sdobbs commented 4 years ago

It's good to hear that it doesn't look like there's data corruption - probably it's just some bug in the parser?

I don't keep logs for each calibration launch, but it looks like this run fails more often than not. A typical set of plugins is: HLDetectorTiming,monitoring_hists,PSPair_online,RF_online,TAGM_TW,CDC_amp,CDC_TimeToDistance,CDC_dedx

In any case, I'll mark this as low priority.

zihlmann commented 4 years ago

I finally see the crash too. However, I see two different ways it crashes:

1) it just dies with "segmentation fault" and no other output
2) it crashes with lots of info (note: twice it was at event 336, this time it is 358):

JANA ERROR>> didn't sleep full 0.5 seconds!d) 16.0Hz (avg.: 20.2Hz)
358.0 events processed (368.0 events read) 30.0Hz (avg.: 21.0Hz)

===========================================================
There was a crash. This is the entire stack trace of all threads:

Thread 12 (Thread 0x7f0e9770a700 (LWP 137181)):

0 0x00007f0ecfaba9f5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0

1 0x00000000010be680 in jana::JApplication::EventBufferThread (this=this@entry=0x7ffda16a9440) at src/JANA/JApplication.cc:742

2 0x00000000010be949 in LaunchEventBufferThread (arg=0x7ffda16a9440) at src/JANA/JApplication.cc:666

3 0x00007f0ecfab6e65 in start_thread () from /lib64/libpthread.so.0

4 ......

sdobbs commented 4 years ago

From monitoring version 14: Run 71561, file 004

libraries/DAQ/DEVIOWorkerThread.cc:357 Unknown outer EVIO bank tag: c4a7
JANA ERROR>>
JANA ERROR>>?JException: code = 0 text = WARNING: unknown bank type (0x98)
JANA ERROR>>
JANA ERROR>>File: libraries/DAQ/swap_bank.cc, Line: 91
JANA ERROR>>
JANA ERROR>>?JException: code = 0 text = WARNING: unknown bank type (0x91)
JANA ERROR>>
JANA ERROR>>File: libraries/DAQ/swap_bank.cc, Line: 91
Exit code: -1
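When comparing monitoring launches, it can be useful to count which unknown bank types and outer tags show up in a log. The following is a minimal Python sketch that matches only the two message formats quoted above; the function name is hypothetical and nothing here interprets the values.

```python
import re
from collections import Counter

# Patterns copied from the two message formats quoted above.
BANK_TYPE = re.compile(r"unknown bank type \((0x[0-9a-fA-F]+)\)")
OUTER_TAG = re.compile(r"Unknown outer EVIO bank tag: ([0-9a-fA-F]+)")


def tally_bank_errors(log_text):
    """Count unknown bank types and unknown outer bank tags in a log.

    Returns two Counters keyed by the integer value of the hex field.
    """
    types = Counter(int(value, 16) for value in BANK_TYPE.findall(log_text))
    tags = Counter(int(value, 16) for value in OUTER_TAG.findall(log_text))
    return types, tags
```

Applied to the excerpt above, this reports one occurrence each of bank types 0x98 and 0x91 and one unknown outer tag 0xc4a7.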