Open DraTeots opened 1 year ago
It looks there is NO non-deterministic behavior. The problem is that DBG macro uses cerr, which output is not synchronized with cout. So if I replace DBG macro in evio parser with:
std::cout << "Unknown outer EVIO bank tag: " << std::hex << tag << std::dec << std::endl;
std::cout<<"event_len="<<event_len<<" event_head="<<event_head<<endl;
I now see those messages strictly once per event in the beginning of each event (when EVIO parsing happens)
[2023-03-11 01:27:21.962] [SingleEvioEventFileSource] [debug] SingleEvioEventFileSource::GetEvent
[2023-03-11 01:27:21.964] [SingleEvioEventFileSource] [debug] Block 9 read_ok!
Unknown outer EVIO bank tag: 0
event_len=0 event_head=0
[2023-03-11 01:27:21.965] [SingleEvioEventFileSource] [trace] Parsed block had 1 events:
[2023-03-11 01:27:21.965] [SingleEvioEventFileSource] [trace] Event #0 in block with event-number=9:
[2023-03-11 01:27:21.965] [SingleEvioEventFileSource] [trace] Factory = Df125Config NumObjects = 1
[2023-03-11 01:27:21.965] [SingleEvioEventFileSource] [trace] Factory = DCODAROCInfo NumObjects = 1
[2023-03-11 01:27:21.965] [SingleEvioEventFileSource] [trace] Factory = Df125FDCPulse NumObjects = 5
[2023-03-11 01:27:21.965] [SingleEvioEventFileSource] [trace] Factory = Df125TriggerTime NumObjects = 5
[2023-03-11 01:27:21.965] [SingleEvioEventFileSource] [trace] Factory = DGEMSRSWindowRawData NumObjects = 1280
I do not have your print_evio plugin so I can not print the data, but I have no "Unknown outer EVIO bank tag: 0" with "CDAQfile".
[hdtrdops@gluon200 JANA4ML4FPGA]$ ./install/bin/jana4ml4fpga -Pplugins=CDAQfile ../DATA/hd_rawdata_002635_000.evio -Pjana:nevents=10
[INFO] Status: 10 events processed 327.6 Hz (327.6 Hz avg)
[INFO] All workers have stopped.
[INFO] Merging threads ...
[INFO] Event processing ended.
[INFO] JArrowProcessingController: Final Report
Thread team size [count]: 1
Total uptime [s]: 0.03052
Uptime delta [s]: 0.03052
Completed events [count]: 10
Inst throughput [Hz]: 328
Avg throughput [Hz]: 328
Sequential bottleneck [Hz]: 390
Parallel bottleneck [Hz]: 2.75e+06
Efficiency [0..1]: 0.84
+--------------------------+------------+--------+-----+---------+-------+--------+---------+-------------+
| Name | Status | Type | Par | Threads | Chunk | Thresh | Pending | Completed |
+--------------------------+------------+--------+-----+---------+-------+--------+---------+-------------+
| sources | Finished | Source | F | 0 | 40 | - | - | 10 |
| processors | Finished | Sink | T | 0 | 1 | 80 | 0 | 10 |
+--------------------------+------------+--------+-----+---------+-------+--------+---------+-------------+
+--------------------------+-------------+--------------+----------------+--------------+----------------+
| Name | Avg latency | Inst latency | Queue latency | Queue visits | Queue overhead |
| | [ms/event] | [ms/event] | [ms/visit] | [count] | [0..1] |
+--------------------------+-------------+--------------+----------------+--------------+----------------+
| sources | 2.56 | 2.03 | 0.000681 | 11 | 0.000292 |
| processors | 0.000364 | 0.00015 | 0.000564 | 50 | 0.886 |
+--------------------------+-------------+--------------+----------------+--------------+----------------+
+----+----------------------+-------------+------------+-----------+----------------+------------------+
| ID | Last arrow name | Useful time | Retry time | Idle time | Scheduler time | Scheduler visits |
| | | [ms] | [ms] | [ms] | [ms] | [count] |
+----+----------------------+-------------+------------+-----------+----------------+------------------+
| 0 | idle | 0.184 | 0 | 0 | 0.00035 | 21 |
+----+----------------------+-------------+------------+-----------+----------------+------------------+
[2023-03-11 01:39:07.904] [CDAQEVIOFileSource] [info] Closing EVIO file ../DATA/hd_rawdata_002635_000.evio
EVIO Statistics for ../DATA/hd_rawdata_002635_000.evio :
------------------------
Nblocks: 10
Nevents: 10
Nerrors: 0
Nbad_blocks: 0
Nbad_events: 0
"Unknown outer EVIO bank tag: 0" happens inside
auto events = parser.ParseEVIOBlockedEvent(block, event);
so it doesn't matter which plugin is used further in processing chain. But I commited print_evio in process_physics_events branch
My run setup is
-Pjana:nevents=10 is important as what I will write about happens on event #10
It prints:
What is important here, when I launch it again and again - the same parameters, same terminal, same command one run after another it prints this message randomly 1, 2 or 3 times (!!!)
Which suggests there is something very non deterministic. Which highly probably suggests some memory corruption, like accessing some stale buffer or something similar.