enso-org / enso

Hybrid visual and textual functional programming.
https://enso.org
Apache License 2.0
7.36k stars 324 forks source link

Dependency tracking between nodes is too coarse grained #11237

Open hubertp opened 1 week ago

hubertp commented 1 week ago

Consider a scenario described in the video below:

Screenshot from 2024-10-02 18-15-34

Yet, when deleting the node, we re-trigger execution of everything that has been be defined after the node in the code suggesting incorrect/too coarse grained dependency tracking info. Kazam_screencast_00013.webm

This issue has been reported in the past and claimed to be fixed, so marking as a potential regression.

hubertp commented 1 week ago

Self-contained project: New Project 2.enso-project.zip (remove .zip suffix before importing)

Working on a smaller reproducible case but so far this is only visible when including large files.

4e6 commented 6 days ago

The issue can be minified to the following example.

Example

main =
    x = 0
    y = 1
    y

To remove the first line x = 0 the gui uses the following edit:

TextEdit(Range(Position(1, 4), Position(2, 4)), "")

That removes characters x = 0\n

0|main =
 |
1|    x = 0
 |    ^^^^^^
2|    y = 1
 |^^^^
3|    y

Which touches the y on the second line that leads to invalidation of the whole y = 1 expression (and all the expressions depending on it)

Note

The engine cannot ignore y because it processes edits on the character level and not on the semantic level. And it cannot know that the y expression is not affected by the edit. The example can be

y = 1

Changed to

y = 12

The engine has to invalidate 1 expression, even though the edit only affects its boundary. Otherwise the y expression won't be recomputed.

Issue

The issue can be fixed by changing the text edit to only address the first line x = 0 instead of x = 0\n. @farmaazon can it be fixed on the gui side by changing the produced text edit?

hubertp commented 5 days ago

Interesting. I thought I had locally exactly that and couldn't see it. Glad to know that you managed to minimize it.

The issue can be fixed by changing the text edit to only address the first line x = 0 instead of x = 0\n . @farmaazon can it be fixed on the gui side by changing the produced text edit?

Wouldn't that mean that we are left with

main =

    y = 1
    y

as a result?

4e6 commented 5 days ago

You can remove a line with a line separator character, then there will be no gaps

farmaazon commented 4 days ago

Issue

The issue can be fixed by changing the text edit to only address the first line x = 0 instead of x = 0\n. @farmaazon can it be fixed on the gui side by changing the produced text edit?

It's not something trivial. Currently, we use an external lib for making fast and effective diffs. It has no option for "left-binding" of any changes. We could research for another library, or just reverse strings before making a diff (and then reverse all indexes).

The only question is: do we want it to be a "general rule"? For example, adding/removing element to list will make a diff [1<, 2>, 3] instead of [1, <2, >3] - are we ok with that? Probably yes, just making sure.

4e6 commented 4 days ago

Both cases are fine because they will result in vector invalidation. We don't care about individual vector elements because they are not cached (only the whole expression is).

kazcw commented 4 days ago

Left-binding diff would handle this case correctly, but no diff algorithm will always be right. A diff algorithm only has before/after information, and has to guess what really changed. You can see this in git line-based diffs--it's not uncommon for a line to be affected that wasn't logically part of the change, because using the line scored better to the diff algorithm.

We have a SourceDocument API that we use to sync AST updates to the source-code editor. Because it converts a tree change directly to a text change, it will always invalidate the correct nodes.

4e6 commented 1 day ago

Left-binding diff would handle this case correctly, but no diff algorithm will always be right.

I see, then we have to update our diff logic in the engine. I guess, I can implement special handling of line removal and fallback to the old algorithm in other cases.

kazcw commented 1 day ago

Left-binding diff would handle this case correctly, but no diff algorithm will always be right.

I see, then we have to update our diff logic in the engine. I guess, I can implement special handling of line removal and fallback to the old algorithm in other cases.

What I mean is SourceDocument is not a diff algorithm, it generates diffs from the AST changes directly. We should use it in the ydoc server to produce accurate diffs.