enso-org / enso

Enso Analytics is a self-service data prep and analysis platform designed for data teams.
https://ensoanalytics.com
Apache License 2.0
7.38k stars 323 forks source link

Better Error Trapping in `map` #8110

Closed jdunkerley closed 11 months ago

jdunkerley commented 1 year ago

Currently if an error occurs when running a map function the whole process fails and it is difficult to know what record it failed on. This is a first ticket to add this kind of handling to functions like map. Others will want it as well.

To improve this want to add user controllable error handling:

On the map method we should add on_problems parameter. Unlike other methods, it should default to Error not Warning.

--- a/distribution/lib/Standard/Base/0.0.0-dev/src/Data/Vector.enso
+++ b/distribution/lib/Standard/Base/0.0.0-dev/src/Data/Vector.enso
@@ -578,7 +578,9 @@ type Vector a
              [1, 2, 3] . map +1
     map : (Any -> Any) -> Vector Any
     map self function =
-        Vector.new self.length i-> function (self.at i)
+        Vector.new self.length i->
+            result = function (self.at i)
+            if result.is_error then Error.throw (Map_Error.Value i result.catch) else result

     ## Applies a function to each element of the vector, returning the `Vector`
        that contains all results concatenated.
@@ -1290,3 +1292,15 @@ check_start_valid start length function =
     used_start = if start < 0 then start + length else start
     if used_start < 0 || used_start > length then Error.throw (Index_Out_Of_Bounds.Error start length+1) else
         function used_start
+
+## PRIVATE
+type Map_Error
+    Value index inner_error
+
+    ## PRIVATE
+    to_display_text : Text
+    to_display_text self = "Error at index " + self.index.to_text + ": " + self.inner_error.to_display_text
+
+    ## PRIVATE
+    is_a : Any -> Boolean
+    is_a self typ = if typ == Map_Error then True else self.inner_error.is_a typ

The Map_Error wraps the failure with the index allowing for easy discovery. The is_a workaround is not recommended. Instead a review of Error.catch and allowing it to convert the caught error to a custom type (e.g. Wrapped_Error) which can then be tested against cleanly. If catching a specific type, then we should unwrap it.

GregoryTravis commented 1 year ago

The is_a workaround is not recommended. Instead a review of Error.catch and allowing it to convert the caught error to a custom type (e.g. Wrapped_Error) which can then be tested against cleanly. If catching a specific type, then we should unwrap it.

The behavior we want is something like this?

bad_result = v.map (x-> Error.throw Foo_Error)
bad_result.is_a Map_Error . should_be_true
bad_result.is_a Foo_Error . should_be_false
bad_result.catch Foo_Error . is_a Foo_Error . should_be_true

In other words, if we are catching the specific error Foo_Error, and that error is raised in map and then wrapped in Map_Error, we watch to transparently unwrap the Map_Error and catch it as a Foo_Error.

On the other hand, if we don't catch Foo_Error and the error makes its way to the top, it is seen as a Map_Error, along with the extra information that that has.

enso-bot[bot] commented 1 year ago

Greg Travis reports a new STANDUP for today (2023-11-08):

Progress: reviews; make ide run; ci problems; tests for 8110 It should be finished by 2023-11-15.

Next Day: 8110

radeusgd commented 1 year ago

The is_a workaround is not recommended. Instead a review of Error.catch and allowing it to convert the caught error to a custom type (e.g. Wrapped_Error) which can then be tested against cleanly. If catching a specific type, then we should unwrap it.

The behavior we want is something like this?

bad_result = v.map (x-> Error.throw Foo_Error)
bad_result.is_a Map_Error . should_be_true
bad_result.is_a Foo_Error . should_be_false
bad_result.catch Foo_Error . is_a Foo_Error . should_be_true

In other words, if we are catching the specific error Foo_Error, and that error is raised in map and then wrapped in Map_Error, we watch to transparently unwrap the Map_Error and catch it as a Foo_Error.

On the other hand, if we don't catch Foo_Error and the error makes its way to the top, it is seen as a Map_Error, along with the extra information that that has.

Exactly 💯

We don't want to abuse the type hierarchy to somehow make Map_Error a subtype Foo_Error, but we want the is_a behaviour to be exactly as you have shown. The difference is that when we catch the error, we should have special logic that checks if the Map_Error can be converted to Foo_Error and if yes, we do the conversion and return the converted instance.


Additionally, we don't want Error.catch to know about Map_Error - to keep coupling low.

Instead, the proposed solution was to create a special Wrapped_Error type, like:

type Wrapped_Error
    Value original_error inner_error

to which we can provide conversions.

Then we can:

Wrapped_Error.from (that : Map_Error) = Wrapped_Error.Value that that.inner_error

and the Error.catch can do something like:

     catch self (error_type = Any) (handler = x->x) =
         self.catch_primitive error_value->
             case error_value.is_a error_type of
                 True -> handler error_value
-                False -> self
+                False ->
+                    # TODO this is a prototype - it wrongly handles edge cases if we had a Nothing wrapped as error which is not forbidden
+                    wrapped = Panic.catch No_Such_Conversion (Wrapped_Error.from error_value . inner_error) _->Nothing
+                    if wrapped.is_a error_type then handler error_value else self
enso-bot[bot] commented 1 year ago

Greg Travis reports a new STANDUP for today (2023-11-09):

Progress: got test suite passing, many questions left It should be finished by 2023-11-15.

Next Day: 8110

enso-bot[bot] commented 1 year ago

Greg Travis reports a new STANDUP for today (2023-11-10):

Progress: rewrote to use original java; handled nested wrapping; Error_Spec tests; many stack overflows It should be finished by 2023-11-15.

Next Day: get all tests passing

enso-bot[bot] commented 1 year ago

Greg Travis reports a new STANDUP for today (2023-11-13):

Progress: fixing tests; stack overflows It should be finished by 2023-11-15.

Next Day: get all tests passing

enso-bot[bot] commented 1 year ago

Greg Travis reports a new STANDUP for today (2023-11-14):

Progress: fixing tests; warning doubling It should be finished by 2023-11-15.

Next Day: get all tests passing

enso-bot[bot] commented 1 year ago

Greg Travis reports a new 🔴 DELAY for today (2023-11-15):

Summary: There is 1 days delay in implementation of the Better Error Trapping in map (#8110) task. It will cause 0 days delay for the delivery of this weekly plan.

Delay Cause: unexpectedly complicated

enso-bot[bot] commented 1 year ago

Greg Travis reports a new STANDUP for today (2023-11-15):

Progress: warning double with Humbert; misc It should be finished by 2023-11-16.

Next Day: warning doubling

enso-bot[bot] commented 1 year ago

Greg Travis reports a new 🔴 DELAY for today (2023-11-16):

Summary: There is 1 days delay in implementation of the Better Error Trapping in map (#8110) task. It will cause 0 days delay for the delivery of this weekly plan.

Delay Cause: nested wrappers

enso-bot[bot] commented 1 year ago

Greg Travis reports a new STANDUP for today (2023-11-16):

Progress: nested wrappers It should be finished by 2023-11-17.

Next Day: tests passing

enso-bot[bot] commented 12 months ago

Greg Travis reports a new 🔴 DELAY for today (2023-11-20):

Summary: There is 4 days delay in implementation of the Better Error Trapping in map (#8110) task. It will cause 0 days delay for the delivery of this weekly plan.

Delay Cause: nested wrappers

enso-bot[bot] commented 12 months ago

Greg Travis reports a new STANDUP for today (2023-11-20):

Progress: nested wrappers It should be finished by 2023-11-21.

Next Day: tests passing

enso-bot[bot] commented 12 months ago

Greg Travis reports a new 🔴 DELAY for today (2023-11-21):

Summary: There is 1 days delay in implementation of the Better Error Trapping in map (#8110) task. It will cause 0 days delay for the delivery of this weekly plan.

Delay Cause: nested wrappers

enso-bot[bot] commented 12 months ago

Greg Travis reports a new STANDUP for today (2023-11-21):

Progress: nested wrappers It should be finished by 2023-11-22.

Next Day: tests passing

enso-bot[bot] commented 12 months ago

Greg Travis reports a new 🔴 DELAY for today (2023-11-22):

Summary: There is 1 days delay in implementation of the Better Error Trapping in map (#8110) task. It will cause 5 days delay for the delivery of this weekly plan.

Delay Cause: nested wrappers

enso-bot[bot] commented 12 months ago

Greg Travis reports a new 🔴 DELAY for today (2023-11-22):

Summary: There is 4 days delay in implementation of the Better Error Trapping in map (#8110) task. It will cause 0 days delay for the delivery of this weekly plan.

Delay Cause: nested wrappers

enso-bot[bot] commented 12 months ago

Greg Travis reports a new STANDUP for today (2023-11-22):

Progress: nested wrappers It should be finished by 2023-11-27.

Next Day: tests passing

enso-bot[bot] commented 11 months ago

Greg Travis reports a new 🔴 DELAY for today (2023-11-27):

Summary: There is 1 days delay in implementation of the Better Error Trapping in map (#8110) task. It will cause 0 days delay for the delivery of this weekly plan.

Delay Cause: nested wrappers

enso-bot[bot] commented 11 months ago

Greg Travis reports a new STANDUP for today (2023-11-27):

Progress: cleanup, docs, tests It should be finished by 2023-11-28.

Next Day: benchmarks

enso-bot[bot] commented 11 months ago

Greg Travis reports a new 🔴 DELAY for today (2023-11-28):

Summary: There is 1 days delay in implementation of the Better Error Trapping in map (#8110) task. It will cause 0 days delay for the delivery of this weekly plan.

Delay Cause: review

enso-bot[bot] commented 11 months ago

Greg Travis reports a new STANDUP for today (2023-11-28):

Progress: benchmarks, review It should be finished by 2023-11-29.

Next Day: review