art-daq / artdaq_daqinterface

DAQInterface should double-check its assumptions about what a process's logfile is #26

Closed eflumerf closed 2 years ago

eflumerf commented 2 years ago

This issue has been migrated from https://cdcvs.fnal.gov/redmine/issues/23318 (FNAL account required). Originally created by @jcfreeman2 on 2019-09-24 18:00:23


An interesting issue came up today. On sbn-daq01, in the run record for run 1705 (/daq/run_records/1705), the logfiles listed in the metadata.txt file were all wrong. On closer inspection, it appears that the artdaq processes produced no logfiles at all for that run. While that is its own issue, it also highlighted an area for improvement in DAQInterface: DAQInterface assumes artdaq processes always successfully produce logfiles, so it simply takes the most recent logfile in the relevant directory (/--) to be the logfile for the process in question. Perhaps it could double-check that the timestamp on the logfile makes sense; in the case of today's run 1705, it wound up pointing to logfiles dated from July.

eflumerf commented 2 years ago

Comment by @jcfreeman2 on 2019-09-24 20:54:16


After discussion at today's specially scheduled 2 PM meeting, it was generally agreed that if DAQInterface determines that a logfile from an artdaq process is not available, the proper response is to refuse to run.

eflumerf commented 2 years ago

Comment by @jcfreeman2 on 2019-09-24 20:56:55


Also: the reason logfiles weren't available for the run this morning on SBND is that the subdirectories to which artdaq writes its root files weren't ones to which the executor had write access. Eric pointed out at today's meeting that it's nontrivial for artdaq itself to figure this out, which is why determining if this problem exists is kicked up to the DAQInterface level.

eflumerf commented 2 years ago

Comment by @jcfreeman2 on 2019-10-01 17:45:06


This issue is resolved with commit 0e141ddc03aa92e845e8a496f4db6e479e9a9327 at the head of bugfix/23318_require_logfiles. To determine the logfile for a given artdaq process, DAQInterface previously accessed bash via Popen and listed the most recent logfile in the expected subdirectory. Now, it also checks that (A) a logfile exists, and (B) the logfile's modification time is after the beginning of the boot transition. If either check fails, DAQInterface returns to the "stopped" state with an error message.
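
For readers without access to the branch, the two checks can be sketched roughly as follows. This is an illustration only, not the code from the commit: the names find_process_logfile, LogfileError, logdir and boot_start_time are hypothetical, and the sketch lists the directory with os.listdir rather than going through bash via Popen.

```python
import os


class LogfileError(RuntimeError):
    """Raised when no trustworthy logfile can be found (hypothetical name)."""


def find_process_logfile(logdir, boot_start_time):
    """Return the newest regular file in logdir, or raise LogfileError.

    logdir          -- directory the process is expected to log into (assumed)
    boot_start_time -- epoch seconds at which the boot transition began (assumed)
    """
    try:
        names = os.listdir(logdir)
    except OSError as exc:
        raise LogfileError("cannot list %s: %s" % (logdir, exc))

    candidates = [os.path.join(logdir, name) for name in names]
    candidates = [path for path in candidates if os.path.isfile(path)]

    # Check (A): at least one logfile must exist.
    if not candidates:
        raise LogfileError("no logfile found in %s" % logdir)

    newest = max(candidates, key=os.path.getmtime)

    # Check (B): the newest logfile must not predate the boot transition;
    # otherwise it is a leftover from some earlier run.
    if os.path.getmtime(newest) < boot_start_time:
        raise LogfileError(
            "newest logfile %s predates the boot transition" % newest
        )

    return newest
```

In this sketch the caller would record time.time() at the start of boot, call the function once per process, and on LogfileError report the message and go back to the "stopped" state, which is the behavior described above.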

eflumerf commented 2 years ago

Comment by @eflumerf on 2019-10-22 13:53:23


Code review. Tested by running: cd daqlogs; chown root:root .; chown -R root:root *