Open mikkonie opened 1 year ago
One thing to add: we should also rethink the internal process where we always raise `FlowSubmitException` from the API instead of the actual exception type.
I've already had to come up with a workaround for detecting the exception type to avoid a lot of yak shaving, see #1847.
This is something I didn't really think of before, when taskflow was a separate component. We raise an exception every time a taskflow fails, whether it is an unexpected crash or a completely expected situation, like the checksum validation failing for a landing zone.
Because of this, error logs and Sentry get flooded by benign "exceptions" which are really completely OK situations. Sure, the landing zone still needs to go into `FAILED` state and the user needs to be notified, but these are not software failures to be logged as errors.

TBD: Best way to handle this? Simply ignoring the exceptions in Sentry is the obvious first step, but we might also reconsider when to raise these zone failures as errors and when not. And how to make that distinction.
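
A rough sketch of one way the distinction could be made, not a worked-out proposal. Everything here is an assumption for illustration: `ExpectedFlowFailure`, `handle_flow_failure()` and the local `FlowSubmitException` stand-in are hypothetical names, not the current code. The idea is that expected failures get their own exception subclass, are logged as warnings, and are dropped by a Sentry `before_send` hook, while anything else is still logged and reported as an error.

```python
import logging

logger = logging.getLogger(__name__)


class FlowSubmitException(Exception):
    """Stand-in for the existing exception raised from the taskflow API."""


class ExpectedFlowFailure(FlowSubmitException):
    """
    Hypothetical subclass for anticipated failures, e.g. a checksum
    validation failing for a landing zone. Not part of the current code.
    """


def handle_flow_failure(exc):
    """
    Log a failed flow at a level matching its severity and return that
    level. Setting the zone to FAILED and notifying the user would happen
    in both cases and is left out of this sketch.
    """
    if isinstance(exc, ExpectedFlowFailure):
        # Expected situation: visible in logs, but not an error
        logger.warning('Flow failed (expected): %s', exc)
        return logging.WARNING
    # Anything else is treated as a real software failure
    logger.error('Flow failed (unexpected): %s', exc, exc_info=exc)
    return logging.ERROR


def before_send(event, hint):
    """
    Sentry before_send hook: return None to drop the event, or the event
    itself to send it. Drops events caused by expected failures.
    """
    exc_info = hint.get('exc_info')
    if exc_info and isinstance(exc_info[1], ExpectedFlowFailure):
        return None
    return event
```

The hook would be passed in as `sentry_sdk.init(..., before_send=before_send)`. Subclassing `FlowSubmitException` keeps existing `except FlowSubmitException` blocks working, which is why I picked it over a separate exception hierarchy here, but that is a design choice still open for discussion.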
Comments are welcome, I will think of approaches myself.