Closed knoepfel closed 2 years ago
Comment by @knoepfel on 2021-10-18 14:12:41
Ray, is this the exception you expect to see?
%MSG-s ArtException: PostEndJob 18-Oct-2021 09:12:21 CDT ModuleEndJob ---- EventProcessorFailure BEGIN EventProcessor: an exception occurred during current event processing ---- ScheduleExecutionFailure BEGIN Path: ProcessingStopped. ---- OtherArt BEGIN ---- DBHANDLE_NO_TID BEGIN DbHandle could not get TID (Table ID) from DbEngine for TrkAlignTracker at first use The above exception was thrown while processing module StrawHitReco/TTmakeSH run: 501 subRun: 0 event: 1 ---- DBHANDLE_NO_TID END ---- OtherArt END Exception going through path cprLowPSeedDeM_trigger ---- ScheduleExecutionFailure END ---- EventProcessorFailure END %MSG Art has completed and will exit with status 1.
Comment by @rlcee on 2021-10-18 14:40:44
Yes, that is the exception which was not being properly rethrown
Comment by @knoepfel on 2021-10-18 14:48:11
Thanks, Ray. The exception appears to be suppressed whenever a path successfully completes after the path with the exception executes. A minimal reproducer is:
physics: { producers: { intProducer: { module_type: IntProducer ivalue: 4 } failingProducer: { module_type: FailingProducer } } tp1: [intProducer] tp2: [failingProducer] }
where intProducer
just produces a number, and failingProducer
always throws an exception. Whenever tp2
completes first, the exception raised by tp1
is suppressed. Whenever tp1
is executed first, the exception from tp2
propagates to the top, ending the job with a return value of 1. The latter is the expected behavior, which should happen regardless of the path-execution order.
Investigating the cause.
Comment by @knoepfel on 2021-10-19 16:49:54
The problem is understood--the exception system was respecting an exception only if it was generated by the last path to be processed in the event (which is slightly different than I reported above). The following commit ensures that the exception is "sticky"--i.e. an exception generated by any path is rethrown by the framework unless an exception-handling action has been configured for the job.
Please let us know when you require a new version of art with this fix.
Implemented via https://github.com/art-framework-suite/art/commit/367881770cdc7a8712be6111c8378addf4293211.
Comment by @kutschke on 2021-10-19 18:34:30
Thanks Kyle. We will discuss our needs for art upgrades at the weekly meeting tomorrow and get back to you after that. Here is what I know so far.
The main reason to upgrade is to incorporate our request for a new root feature that is now in the branch v6-24-00-patches. The PR is: https://github.com/root-project/root/pull/9049 . Is this in the right place in the root repo structure for you to make a new root for us and build a new art based on that.
When you are ready to go ahead with that we would like you to incorporate two additional fixes, this one and art-framework-suite/fhicl-cpp#8
Do you have other work that is ready to include in a new art version?
I will get back to you tomorrow if it is just these 3 things or if there are additional requests.
@kutschke, a few items:
Once we understand the timescale of the ROOT solution better, we'll be able to determine when to upgrade to a new ROOT release. Stay tuned.
This issue has been migrated from https://cdcvs.fnal.gov/redmine/issues/26339 (FNAL account required) Originally created by @rlcee on 2021-10-12 19:57:14
When running a mu2e exe with many paths, one path threw an exception, but instead of the exe exiting, the path was ended for the event and the other paths continued. This seems to be the FailPath behavior, but we do not know how this came about. Our settings should default to always rethrow. Kyle has the recipe to reproduce the effect.
Additional details
From Ray: