Task Description

Error handling in vitrivr-engine currently has two major shortcomings.

The implementer of an Operator decides whether an error is handled gracefully (i.e., log and continue) or not (i.e., throw an exception). This leads to inconsistent behaviour across a pipeline.
If an error is handled gracefully, the caller of a pipeline has no way to access the error information, since most of the time errors are simply logged. This is not ideal in cases where vitrivr-engine is used as a library rather than as a local service.

I therefore propose three major changes to how errors should be handled:

If an error occurs, operators throw an ExtractionException. This exception reports on the error condition (retrievable, name of the operator and cause) and (optionally) wraps downstream exceptions. Throwing any other exception from within an Operator is considered a programming error. Operators must therefore catch the exceptions raised by the code they call and wrap them accordingly.
When configuring a pipeline, one can determine which error handling mode should be employed. Currently, I see two modes: CONTINUE and ABORT (we can of course discuss other modes). This will lead to the introduction of transparent error handling stages in the flow.
Regardless of the mode employed, a per-item summary of what went wrong should be recorded in some Context object. This Context can be accessed by the caller of a pipeline.
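
To make the discussion a bit more concrete, here is a rough sketch of how the three pieces could fit together. It is not the vitrivr-engine API: Retrievable is reduced to an id, a plain Kotlin Sequence stands in for the flow, and the names ErrorMode, IngestContext and guarded are made up for illustration. Only ExtractionException, CONTINUE, ABORT and the idea of a Context come from the proposal above.

```kotlin
import java.util.UUID

/** Simplified stand-in for the engine's Retrievable (assumption: an id is all we need here). */
data class Retrievable(val id: UUID = UUID.randomUUID())

/** The one exception type operators are expected to throw, as proposed above. */
class ExtractionException(
    val retrievable: Retrievable,
    val operatorName: String,
    message: String,
    cause: Throwable? = null // optionally wraps the original exception
) : Exception("[$operatorName] $message", cause)

/** The two modes named in the proposal (the enum name itself is hypothetical). */
enum class ErrorMode { CONTINUE, ABORT }

/** Hypothetical context object that collects a per-item error summary for the caller. */
class IngestContext {
    private val collected = mutableMapOf<UUID, MutableList<ExtractionException>>()

    /** Errors grouped by the id of the retrievable they belong to. */
    val errors: Map<UUID, List<ExtractionException>> get() = collected

    fun report(e: ExtractionException) {
        collected.getOrPut(e.retrievable.id) { mutableListOf() }.add(e)
    }
}

/**
 * Hypothetical error handling stage: applies [stage] to every item, records each
 * ExtractionException in [context] and then either drops the item (CONTINUE) or
 * rethrows (ABORT). Any other exception is not caught and propagates to the caller.
 */
fun Sequence<Retrievable>.guarded(
    mode: ErrorMode,
    context: IngestContext,
    stage: (Retrievable) -> Retrievable
): Sequence<Retrievable> = sequence {
    for (item in this@guarded) {
        val result = try {
            stage(item)
        } catch (e: ExtractionException) {
            context.report(e)
            if (mode == ErrorMode.ABORT) throw e
            null // CONTINUE: skip this item and keep going
        }
        if (result != null) yield(result)
    }
}
```

In this sketch the summary is recorded in both modes, so a caller that catches the exception in ABORT mode can still inspect the context afterwards. Because sequences are lazy, the exception surfaces at the point where the pipeline is consumed.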

In addition, it is worth discussing how handled errors should affect Retrievables. It might make sense to include error information at the Retrievable level as well.
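
One possible shape for error information at the Retrievable level, purely as a basis for discussion: the item carries a list of error records, and later stages can decide whether to skip items that already failed. Everything here is hypothetical (ErrorInfo, withError, failed, partitionByStatus), and Retrievable is again a simplified stand-in rather than the engine's actual type.

```kotlin
/** Hypothetical error record; the field names are illustrative only. */
data class ErrorInfo(val operatorName: String, val message: String)

/** Simplified stand-in for a Retrievable that can carry error records. */
data class Retrievable(val id: String, val errors: List<ErrorInfo> = emptyList()) {
    /** True once at least one error has been attached. */
    val failed: Boolean get() = errors.isNotEmpty()

    /** Returns a copy with [error] appended; the original instance is left unchanged. */
    fun withError(error: ErrorInfo): Retrievable = copy(errors = errors + error)
}

/** Splits items into (still healthy, already marked as failed). */
fun partitionByStatus(items: List<Retrievable>): Pair<List<Retrievable>, List<Retrievable>> =
    items.partition { !it.failed }
```

With something like this, CONTINUE would not have to drop a failed item. It could pass the item on with the error attached, so that the item and its error stay together in the output.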

Currently, this is a discussion issue. I'm open to ideas and input.

Dependencies

None

Boundary Conditions

This should be implemented in such a way that the error handling logic is injected transparently when pipelines are constructed, rather than requiring the operators to manipulate the flow.
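
A minimal sketch of what "injected transparently" could mean, under heavy assumptions: operators are modelled as simple per-item functions (the real operators work on flows), a Kotlin Sequence stands in for the flow, and PipelineBuilder, Operator.process and failures are hypothetical names. The point is only that operators are registered as they are and the builder wraps each one in an error handling stage at build time.

```kotlin
/** Hypothetical per-item operator; it knows nothing about error handling modes. */
fun interface Operator<T> {
    fun process(item: T): T
}

/** The two modes from the proposal (enum name is hypothetical). */
enum class ErrorMode { CONTINUE, ABORT }

/** Hypothetical builder: error handling is added in build(), not by the operators. */
class PipelineBuilder<T : Any>(private val mode: ErrorMode) {
    private val operators = mutableListOf<Pair<String, Operator<T>>>()

    /** Simple failure log in the form "operator: message", filled while the pipeline runs. */
    val failures = mutableListOf<String>()

    fun add(name: String, operator: Operator<T>): PipelineBuilder<T> {
        operators += name to operator
        return this
    }

    /** Chains all operators and places a guard around each of them. */
    fun build(): (Sequence<T>) -> Sequence<T> = { source ->
        operators.fold(source) { flow, (name, op) ->
            flow.mapNotNull { item -> guard(name, op, item) }
        }
    }

    // Catches Exception to keep the sketch short; under the proposal this
    // would catch ExtractionException only and let everything else propagate.
    private fun guard(name: String, op: Operator<T>, item: T): T? = try {
        op.process(item)
    } catch (e: Exception) {
        failures += "$name: ${e.message}"
        if (mode == ErrorMode.ABORT) throw e
        null // CONTINUE: drop the item
    }
}
```

A pipeline built with `PipelineBuilder<Int>(ErrorMode.CONTINUE).add("double") { it * 2 }` would then run the operator behind a guard without the operator being aware of it. Switching the mode only changes how the builder is constructed.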

vitrivr / vitrivr-engine

Error handling during ingest #104