Closed spenes closed 7 months ago
Thanks for the review @benjben!
I see a few problems with the approach you suggested. First, I think it is hard to reconstruct BadRow
from FailureEntity
s. As you know, there can be multiple FailureEntity
for one BadRow
. We need some kind of mechanism to map FailureEntity
s to their BadRow
. Other problem I see with that approach is that we would be starting to redefine BadRow
s as FailureEntity
in this case just to add a few extra bits. I am not sure it is good idea to duplicate BadRow
definitions in the Enrich codebase. Of course, other solution might be add those extra bit of information to original BadRow
definitions but I am not sure we want to go that route at this stage since that would make things more complicated.
I agree wrappers doesn't look pretty but this is kinda necessary workaround for this feature.
SchemaViolationWithExtraContext
contains only a few extra bits additional to SchemaViolation
to create FailureEntity
. This will be exactly what we will do even if define completely new FailureEntity
with structure almost same with SchemaViolation
itself.
I agree with your comment about BadRowWithFailureEntities
but don't we need both BadRow
and List[FailureEntity]
in the end ? We can postpone their creation with creating intermediate format that can be converted to both but we need to create both eventually.
As you know, there can be multiple FailureEntity for one BadRow. We need some kind of mechanism to map FailureEntitys to their BadRow
There are 3 steps that can produce a bad row:
Each of these steps would now return a List[FailureEntity]
instead of a BadRow
.
We know that all the FailureEntity
s for a step are for the same bad row, so we just need to pattern match on the first one and we'll know the type of bad row to use.
The left part of the IorT
would then be NonEmptyList[List[FailureEntity]]
.
With this approach it is all about the mapping function from FailureEntity
to FailureDetails
. We get rid of all the wrappers.
Other problem I see with that approach is that we would be starting to redefine BadRows as FailureEntity
No, we'd leave BadRow
intact. What do you think that we'd need to change it ?
Failure entities are started to be attached to incomplete events as derived contexts.
There are some cases that we need more information than in the bad rows to create failure entities therefore it isn't possible to create failure entities from bad rows directly. For this purpose, we created wrapper classes to attach extra information about failure entities.