enso-org / enso

Enso Analytics is a self-service data prep and analysis platform designed for data teams.
https://ensoanalytics.com
Apache License 2.0
7.38k stars 323 forks source link

Support `on_problems=Problem_Behavior.Report_Warning` and `Map_Error` wrapping in `Vector.map` #8371

Closed GregoryTravis closed 10 months ago

GregoryTravis commented 12 months ago

Attaching Map_Error values to warnings generating during nested calls to Vector.map results in duplicate errors.

For example, if a warning is generated for x=3 here:

fun x = ... generates My_Error
[[1, 2], [3, 4]].map (a-> a.map fun)

The result will have both of these warnings:

Map_Error.Error 0 (Map_Error.Error 1 My_Error.Value)
Map_Error.Error 1 My_Error.Value

The problem stems from the way warnings are propagated in certain cases: values inside a Vector inherit warnings from their parents, and parents gather warnings from their children. In addition, transforming warnings with Warning.map_attached_warning sometimes result in both the original and transformed warnings being attached.

[] Determine whether to propagate warnings on vector construction, on read, or at query time (Warning.get_all) [] Determine the desired semantics of Map_Error; in particular, should the index in Map_Error be the position at which the error was generated, or the position the value has in whatever other vector it is contained within?

enso-bot[bot] commented 11 months ago

Greg Travis reports a new STANDUP for today (2023-11-29):

Progress: investigate warnings duplication It should be finished by 2023-12-08.

Next Day: more

enso-bot[bot] commented 11 months ago

Greg Travis reports a new STANDUP for today (2023-11-30):

Progress: investigate warnings duplication It should be finished by 2023-12-08.

Next Day: more

enso-bot[bot] commented 11 months ago

Greg Travis reports a new STANDUP for today (2023-12-04):

Progress: investigate warnings duplication It should be finished by 2023-12-08.

Next Day: more

enso-bot[bot] commented 11 months ago

Greg Travis reports a new STANDUP for today (2023-12-05):

Progress: implement wrap-on-get-all It should be finished by 2023-12-08.

Next Day: fix tests

GregoryTravis commented 11 months ago

I am currently exploring a design change to this.

The original design was that Map_Error is added only in calls to map, to indicate the index at which the error occurred.

This is complicated by the way we currently propagate warnings:

In summary, the original proposal gives a good error message for the common simple case of a non-nested map node, but as the warning propagates past that node to other nodes, it can become misleading.

I propose the following alternative: Map_Error wrappers are added when warnings are retrieved using Warning.get_all, and warnings are not propagated from elements to their containing vectors, or vice versa.

Advantages:

Disadvantages:

I'll address both disadvantages:

The Map_Error wrapper no longer indicates the index of the original problem

While it is useful to be able to know the original position of the problem, this only works well when inspecting the warnings on the map node itself. As the values propagate further down the graph, this information might become confusing. However, when a user sees warnings in the graph, it is natural to look at the earliest warning, which would be the original problem map node, with the index at which the error first occurred.

Elements do not inherit the warnings of their siblings

It's possible to come up with a case where a warning can only be understood by knowing about an error that occurred on an element's sibling further up the graph. However, unless the original map call is buried inside a node (rather than being the node itself), the original error will be visible at the original map node, with the proper index attached to it. In other words, this problem affects library writers more than IDE users.

cc @JaroslavTulach @radeusgd @jdunkerley

radeusgd commented 11 months ago
  1. I like the proposal of only keeping the Map_Error when looking directly at the result of a map and 'unwrapping' the underlying error if the warning is propagated further. 👍

  2. I'm not sure about

warnings are not propagated from elements to their containing vectors, or vice versa.

Maybe that's OK, but it means that in GUI, if I have a node that returns a pair [a, b] and both a and b have warnings; the node itself does not show the warnings because they are contained 'deeper' in the structure and do not 'float up'. IMO that is not very good because the warnings effectively become 'hidden'.

Do I understand your proposal correctly?

Do you think that this is OK? Maybe it is, because once we actually depend on the values - e.g. do [a, b].fold 0 (+) ===> (0+a+b) such resulting value would normally inherit both the warning from a and b.

Still, intuitively, I feel like the node returning [a, b] should display such warnings, because the warnings are most likely related not only to values a and b but the whole operation returning such a pair.

However, maybe we could achieve that with the new semantics still, just by adapting our APIs accordingly - all APIs that 'risk' hiding the warnings deep in the returned structure would need to 'uncover' it via a construct like a.if_not_error <| b.if_not_error <| [a, b] (with such construct the warnings are forced to 'float up' and be attached directly to the result).

Alternatively, maybe we need to consider displaying some more detailed info than just Warning.get_all in the GUI? All in all - I think the proposed solution is good - however we need to consider its implications and ensure cases like the one described above work well.

  1. This discussion suggests to me that we actually may want to distinguish warnings that have been 'added in the current node' from warnings that are 'propagated from previous issues attached to input values'. Maybe it should be possible to have Warning.get_only_immediate, or the structure returned by Warning.get_all should contain something like is_coming_from_input:Boolean (or other name)?

On one hand, we want to propagate the warning, unless explicitly cleared (by remove_warnings), in a 'poisonous' way - if an operation triggers a warning, all donwstream operations that depend on it should contain the warning to indicate that the results may be affected by the issue. On the other hand, the immediate warnings are surely more important for each node - maybe in the future we could show the inherited ones as more 'grayed out' and somehow make the immediate warnings stand out more? (Once we get a proper warning visualization...)

GregoryTravis commented 11 months ago

the node itself does not show the warnings

Actually, the vector node does show the warnings. When you call Warning.get_all() on a vector, it gathers all the warnings from the elements of the vector, and so on, recursively. (I believe there is caching and/or a flag to make this fast.)

That happens here.

My change would only require that the Map_Error wrapper is added during get_all() so that the warnings on the vector have the added index information.

I have already implemented this and am working on fixing test cases, but it looks like there aren't any real problems with it so far.

enso-bot[bot] commented 11 months ago

Greg Travis reports a new STANDUP for today (2023-12-06):

Progress: move _primitives to extensions; proposal for propagation redesign It should be finished by 2023-12-08.

Next Day: more discussion; fix tests

enso-bot[bot] commented 11 months ago

Greg Travis reports a new STANDUP for today (2023-12-07):

Progress: none; ci failures for 8110, aoc It should be finished by 2023-12-08.

Next Day: more discussion; fix tests

JaroslavTulach commented 11 months ago

My change would only require that the Map_Error wrapper is added during get_all() so that the warnings on the vector have the added index information.

Making sure that Warnings/errors are converted to Map_Error wrappers here is fine, I believe.

CCing @hubertp as the author of WarningsLibrary.

hubertp commented 11 months ago

The proposal looks sane. I had similar concerns as @radeusgd but it seem you are addressing them.

enso-bot[bot] commented 11 months ago

Greg Travis reports a new 🔴 DELAY for today (2023-12-08):

Summary: There is 3 days delay in implementation of the Support on_problems=Problem_Behavior.Report_Warning and Map_Error wrapping in Vector.map (#8371) task. It will cause 0 days delay for the delivery of this weekly plan.

Delay Cause: many unforseen complications

enso-bot[bot] commented 11 months ago

Greg Travis reports a new STANDUP for today (2023-12-08):

Progress: none; ci failures for 8110, aoc It should be finished by 2023-12-11.

Next Day: more discussion; fix tests

enso-bot[bot] commented 11 months ago

Greg Travis reports a new 🔴 DELAY for today (2023-12-11):

Summary: There is 1 days delay in implementation of the Support on_problems=Problem_Behavior.Report_Warning and Map_Error wrapping in Vector.map (#8371) task. It will cause 0 days delay for the delivery of this weekly plan.

Delay Cause: many unforseen complications

enso-bot[bot] commented 11 months ago

Greg Travis reports a new STANDUP for today (2023-12-11):

Progress: none; ci failures for 8110, aoc It should be finished by 2023-12-12.

Next Day: more discussion; fix tests

GregoryTravis commented 11 months ago

Showing the approach in the IDE:

map

The output vector shows a wrapped error, while the individual value shows an unwrapped error.

map

Vector construction

A value placed in a Vector by other means (not by map) shows the same behavior.

cons

Nested maps

Each level of nesting adds one layer of wrapping; each vector lookup shows the unwrapped (or partially unwrapped) value. (We are not actually unwrapping the error in this case, but rather not wrapping in the first place. Errors do not automatically propagate from vector to element or vice versa.)

nested

Column

The changes do not apply to Column. Since a Column is a regular value and is not treated specially by warnings code, it inherits all warnings from the Vector it is constructed from.

column

Report_Warning

In Report_Warning mode, the error is converted to a warning on Nothing.

err
enso-bot[bot] commented 11 months ago

Greg Travis reports a new 🔴 DELAY for today (2023-12-12):

Summary: There is 1 days delay in implementation of the Support on_problems=Problem_Behavior.Report_Warning and Map_Error wrapping in Vector.map (#8371) task. It will cause 0 days delay for the delivery of this weekly plan.

Delay Cause: many unforseen complications

enso-bot[bot] commented 11 months ago

Greg Travis reports a new STANDUP for today (2023-12-12):

Progress: fixing tests, debugging error propagation It should be finished by 2023-12-13.

Next Day: more discussion

enso-bot[bot] commented 11 months ago

Greg Travis reports a new 🔴 DELAY for today (2023-12-13):

Summary: There is 1 days delay in implementation of the Support on_problems=Problem_Behavior.Report_Warning and Map_Error wrapping in Vector.map (#8371) task. It will cause 0 days delay for the delivery of this weekly plan.

Delay Cause: many unforseen complications

enso-bot[bot] commented 11 months ago

Greg Travis reports a new STANDUP for today (2023-12-13):

Progress: Figure out why enso fails immediately with “Killed: 9”. Implemented get_all-based approach, but I’m getting a mysterious ‘unsupported specialization’ error in some very straightforward enso code. It should be finished by 2023-12-14.

Next Day: more discussion

enso-bot[bot] commented 11 months ago

Greg Travis reports a new 🔴 DELAY for today (2023-12-14):

Summary: There is 1 days delay in implementation of the Support on_problems=Problem_Behavior.Report_Warning and Map_Error wrapping in Vector.map (#8371) task. It will cause 0 days delay for the delivery of this weekly plan.

Delay Cause: many unforseen complications

enso-bot[bot] commented 11 months ago

Greg Travis reports a new STANDUP for today (2023-12-14):

Progress: Reported two possible warning-related bugs (WithWarnings with 0 warnings; cannot convert Column to its own type). Debugged warning self-attachment. Make hasWarnings() and getWarnings() behave consistently. Fixed warning wrapping. It should be finished by 2023-12-15.

Next Day: more discussion

enso-bot[bot] commented 11 months ago

Greg Travis reports a new 🔴 DELAY for today (2023-12-15):

Summary: There is 3 days delay in implementation of the Support on_problems=Problem_Behavior.Report_Warning and Map_Error wrapping in Vector.map (#8371) task. It will cause 0 days delay for the delivery of this weekly plan.

Delay Cause: many unforseen complications

enso-bot[bot] commented 11 months ago

Greg Travis reports a new STANDUP for today (2023-12-15):

Progress: Finished get_all approach, which fails for e.g. Column warning propagation. Now trying WarningsLibrary getElementWarnings approach. It should be finished by 2023-12-18.

Next Day: more discussion

enso-bot[bot] commented 11 months ago

Greg Travis reports a new 🔴 DELAY for today (2023-12-18):

Summary: There is 1 days delay in implementation of the Support on_problems=Problem_Behavior.Report_Warning and Map_Error wrapping in Vector.map (#8371) task. It will cause 0 days delay for the delivery of this weekly plan.

Delay Cause: many unforseen complications

enso-bot[bot] commented 11 months ago

Greg Travis reports a new STANDUP for today (2023-12-18):

Progress: Read vector elements without propagation; eliminate wrapped/unwrapped duplicates; find strange new test failures. It should be finished by 2023-12-19.

Next Day: more discussion

enso-bot[bot] commented 11 months ago

Greg Travis reports a new 🔴 DELAY for today (2023-12-19):

Summary: There is 1 days delay in implementation of the Support on_problems=Problem_Behavior.Report_Warning and Map_Error wrapping in Vector.map (#8371) task. It will cause 0 days delay for the delivery of this weekly plan.

Delay Cause: many unforseen complications

enso-bot[bot] commented 11 months ago

Greg Travis reports a new STANDUP for today (2023-12-19):

Progress: Read vector elements without propagation; investigate HostObject case; fix self-attached warnings; handle array proxy It should be finished by 2023-12-20.

Next Day: nested vectors

enso-bot[bot] commented 11 months ago

Greg Travis reports a new 🔴 DELAY for today (2023-12-20):

Summary: There is 1 days delay in implementation of the Support on_problems=Problem_Behavior.Report_Warning and Map_Error wrapping in Vector.map (#8371) task. It will cause 0 days delay for the delivery of this weekly plan.

Delay Cause: many unforseen complications

enso-bot[bot] commented 11 months ago

Greg Travis reports a new STANDUP for today (2023-12-20):

Progress: Rewrite yet again, combining paths into a flag and allowoing nested wrapping, get a few tests passing It should be finished by 2023-12-21.

Next Day: tests passing

enso-bot[bot] commented 11 months ago

Greg Travis reports a new 🔴 DELAY for today (2023-12-21):

Summary: There is 1 days delay in implementation of the Support on_problems=Problem_Behavior.Report_Warning and Map_Error wrapping in Vector.map (#8371) task. It will cause 0 days delay for the delivery of this weekly plan.

Delay Cause: many unforseen complications

enso-bot[bot] commented 11 months ago

Greg Travis reports a new STANDUP for today (2023-12-21):

Progress: Finish rewrite, all tests passing, fix nested map error It should be finished by 2023-12-22.

Next Day: literally anythingn else

GregoryTravis commented 11 months ago

Now, wrapped errors are attached to vectors, but not to individual elements.

Screen Shot 2023-12-22 at 1 00 34 PM
enso-bot[bot] commented 11 months ago

Greg Travis reports a new 🔴 DELAY for today (2023-12-22):

Summary: There is 11 days delay in implementation of the Support on_problems=Problem_Behavior.Report_Warning and Map_Error wrapping in Vector.map (#8371) task. It will cause 0 days delay for the delivery of this weekly plan.

Delay Cause: many unforseen complications

enso-bot[bot] commented 11 months ago

Greg Travis reports a new STANDUP for today (2023-12-22):

Progress: Cleanup, small bits It should be finished by 2024-01-02.

Next Day: literally anything else

enso-bot[bot] commented 10 months ago

Greg Travis reports a new 🔴 DELAY for today (2024-01-02):

Summary: There is 1 days delay in implementation of the Support on_problems=Problem_Behavior.Report_Warning and Map_Error wrapping in Vector.map (#8371) task. It will cause 0 days delay for the delivery of this weekly plan.

Delay Cause: many unforseen complications

enso-bot[bot] commented 10 months ago

Greg Travis reports a new STANDUP for today (2024-01-02):

Progress: Review; remove No_Wrap constructor; full docs; default test utils to unwrap_error=True It should be finished by 2024-01-03.

Next Day: benchmarks

enso-bot[bot] commented 10 months ago

Greg Travis reports a new 🔴 DELAY for today (2024-01-03):

Summary: There is 1 days delay in implementation of the Support on_problems=Problem_Behavior.Report_Warning and Map_Error wrapping in Vector.map (#8371) task. It will cause 0 days delay for the delivery of this weekly plan.

Delay Cause: many unforseen complications

enso-bot[bot] commented 10 months ago

Greg Travis reports a new STANDUP for today (2024-01-03):

Progress: Benchmarks It should be finished by 2024-01-04.

Next Day: join/null

enso-bot[bot] commented 10 months ago

Greg Travis reports a new 🔴 DELAY for today (2024-01-04):

Summary: There is 1 days delay in implementation of the Support on_problems=Problem_Behavior.Report_Warning and Map_Error wrapping in Vector.map (#8371) task. It will cause 0 days delay for the delivery of this weekly plan.

Delay Cause: many unforseen complications

enso-bot[bot] commented 10 months ago

Greg Travis reports a new STANDUP for today (2024-01-04):

Progress: Benchmarks It should be finished by 2024-01-05.

Next Day: join/null

enso-bot[bot] commented 10 months ago

Greg Travis reports a new 🔴 DELAY for today (2024-01-08):

Summary: There is 4 days delay in implementation of the Support on_problems=Problem_Behavior.Report_Warning and Map_Error wrapping in Vector.map (#8371) task. It will cause 0 days delay for the delivery of this weekly plan.

Delay Cause: many unforseen complications

enso-bot[bot] commented 10 months ago

Greg Travis reports a new STANDUP for today (2024-01-08):

Progress: Benchmarks It should be finished by 2024-01-09.

Next Day: probably more benchmarks

enso-bot[bot] commented 10 months ago

Greg Travis reports a new 🔴 DELAY for today (2024-01-15):

Summary: There is 1 days delay in implementation of the Support on_problems=Problem_Behavior.Report_Warning and Map_Error wrapping in Vector.map (#8371) task. It will cause 0 days delay for the delivery of this weekly plan.

Delay Cause: was out sick last week

enso-bot[bot] commented 10 months ago

Greg Travis reports a new 🔴 DELAY for today (2024-01-16):

Summary: There is 2 days delay in implementation of the Support on_problems=Problem_Behavior.Report_Warning and Map_Error wrapping in Vector.map (#8371) task. It will cause 0 days delay for the delivery of this weekly plan.

Delay Cause: task is also to fix inconsistencies