Strict Execution - Githubissues

dmadisetti commented 4 weeks ago

📝 Summary

Closes out #1142 and #1509

This makes the kernel more deterministic by deep copying between cell passes, and ensuring reference existence before execution. It also restricts the scope of execution variables. Additionally, it introduces mo.zero_copy and mo.shallow_copy, which indicate to the kernel not to copy values (e.g. for very expensive variables).

Previous behavior:

New Behavior:

With zero_copy

Missing:

[x] More unit tests (specifically ast / executor)
[ ] UI love for Error on missing references
[x] Documentation

🔍 Description of Changes

This ended up being a much bigger change than expected. I think it still needs some more unit tests, but wanted to get some eyes on it / see if it passes CI before doing anything more.

It's a pretty invasive change adding an "Executor" class and a copy module. Also had to get block level refs, so it touches AST. Also had to get state working, so had to tinker with that. Took some license with the structuring.

Execution is under "experimental" since I think this needs to be used a little before stability guarantees. currently it does not support script or direct cell runs.

📜 Reviewers

@akshayka OR @mscolnick

vercel[bot] commented 4 weeks ago

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name	Status	Preview	Comments	Updated (UTC)
marimo-docs	✅ Ready (Inspect)	Visit Preview	💬 Add feedback	Jun 16, 2024 6:07pm
marimo-storybook	✅ Ready (Inspect)	Visit Preview	💬 Add feedback	Jun 16, 2024 6:07pm

dmadisetti commented 4 weeks ago

??? Unsure why __future__ isn't working for 3.8, will investigate later

akshayka commented 4 weeks ago

@dmadisetti This is a very interesting PR, and the implementation is quite clean.

I have one concern, though — this change will put marimo in a situation where some notebooks only work when execution mode is strict, whereas others will only work when execution mode is relaxed. This would make sharing marimo notebooks more difficult, since users would need to somehow know which runtime configuration to use.

For this reason I'm not fully convinced that the benefits are large enough to warrant the complexity. But it's possible I am missing something ... Beyond the matplotlib example, how do you envision this feature being used?

An alternative we could consider: perhaps this feature could be a good motivating candidate for the extension system we talked about.

I know we discussed this feature in #1142, but I was under the impression that that discussion was for app.run() and exports, not for the core notebook experience.

dmadisetti commented 4 weeks ago

All relaxed notebooks should be able to run strict notebooks, but not all strict notebooks will be able to run relaxed ones (maybe there's an example I'm not seeing?).

In the cases where notebook behavior is different, in the "relaxed" case- it's indeterminant (For instance the plot example, I'm not sure why $x^3$ ran before $x^2$). Strict mode can still force "relaxed" behavior with zero_copy, and a strict mode user should take that into account if writing a notebook for a general audience. Being opt-in, comes with the understanding that the notebook must be written more carefully

I think this has come from some of my frustrations that marimo seems so close- and then I'll get an unexpected mutation- at which point it's just easier to restart the kernel. This solves that issue.

I've been trying to write some of my code with this kernel, and it's been interesting. I'm catching bugs I don't think I normally would with marimo- I even found a marimo bug with regard to exceptions #1571. It's just a step up wrt reproducibility

But yeah, it is a little invasive, but I don't think there's much overhead to normal operations. If it's too different, it could be the basis for tokarip

akshayka commented 3 weeks ago

Thanks for explaining; the step up in reproducibility makes sense, and it's cool you were able to find a bug in marimo using this.

There are cases in which I've used mutations when making UIs. There was no indeterminism there (though of course the notebook was stateful), but those notebooks were written before we had mo.state.

The RAM overhead could be very large when working with numerical data, but I suppose since it's opt-in the user knows what they're signing up for.

This is definitely useful ... Let me think a little bit more about whether this goes in marimo core (perhaps it could), or if it's the basis for tokarip?

dmadisetti commented 3 weeks ago

For UI and state, there's a check that makes them automatically "no copy". but it won't catch cases like:

states = [State(), State()]

since these are nested in a data structure.

So 1) the strict kernel could reimplement deepcopy to do the type check or 2) stay as is, and much more simply, we can make State, UI Elements, and virtual files not deepcopyable to catch these edge cases, which seems correct beyond this PR (I don't think copy falls under their intended behavior as is—for example, I found that the ref counter with virtual files breaks if the file variable is copied).

However, you can still get the same behavior with:

states = mo.copy.zero_copy([State(), State()])
# or even states = mo.copy.shallow_copy([State(), State()]) in this case

marimo._ast objects are also whitelisted. LMK if you can think of any others.

As is, I would not recommend making this as anything but experimental until it has had some actual use. I keep finding weird cases like "calling a lambda defined in another cell but using a private variable in scope and another variable in defined in a 3rd cell" will fail (this actually passes now, but just as an example).

Tests are passing after adding strict_kernel to all_kernels, and I have a local commit that adds the runtime/ state tests to be run with strict_kernel (there's one failing case based on required_refs resolution that I want to fix before I push)

Re memory overhead: This will never be more than x2 (the reference copy and the working copy), with the idle state being the same amount of memory. But this is also why I added zero_copy, since some data is too big to reasonably be copied every reference.

But yes, this could feasibly be extracted to something like tokarip, if the visitor changes and execution register stay

I don't know this is obviously a little opinionated, maybe we should find some willing beta testers

dmadisetti commented 3 weeks ago

Hmm. Weird. Failing with timeout on 3.11 which is also suspiciously slow with trivial changes too: https://github.com/marimo-team/marimo/actions/runs/9437984531/job/25994902883?pr=1583

Edit: The other overhead looks reasonable given I added a 100+ tests? But even those are basically instaneous testing directly. Maybe just got unlucky

akshayka commented 3 weeks ago

Failing with timeout on 3.11 which is also suspiciously slow with trivial changes too:

Timeout is okay, 3.11 runs code coverage, so sometimes times out.

dmadisetti commented 3 weeks ago

Quickly ran through the tutorials. They all work! Except this by design

dmadisetti commented 3 weeks ago

Cool! Good comments. I hope my documentation isn't too overwrought. Today is the first day I used this branch without actively tinkering with it so

Notes from usage:

[x] NameError catching is a little too aggressive right now. I was developing from a module and the way the error is currently thrown obfuscates the stack trace in a way that was not ideal.
[x] StopError wasn't producing the correct UI behavior? Same area of code that needs refactor / checking

Misc todos

[x] Clean up type checking in execution—currently, there are a bunch of random imports. I think this can be done more elegantly.
[ ] investigate required_refs from function args
[ ] (maybe) expand required ref precedent to ignore np.ndarray[number], and make it easily extensible for more types.

A different PR/ needs discussion:
- [ ] prevent unintentional clone of state, files, and UI objects. These objects aren't meant to be cloned and might lead to unexpected behavior if they are cloned- so by either throwing an error in __copy__ or __reduce__ (or just removing the relevant magic functions), we can stop inadvertent improper usage.

I think unit tests are OK- and I'm going to hold off for external documentation until/ if this is exposed.

akshayka commented 3 weeks ago

All tests passing! 👀 🎉

dmadisetti commented 3 weeks ago

More misc:

[ ] revisit exceptions

Maybe over the weekend, just noticed this while working

dmadisetti commented 2 weeks ago

Exceptions now look good

dmadisetti commented 2 weeks ago

Sorry for the massive PR. Even though most of it is comments and tests, I think I could have broken it up.

Will try to provide quicker turn around for additional comments since I know this is now blocking

mscolnick commented 2 weeks ago

(its not blocking, I can still work off my branch) thanks for the effort on this one! definitely a tricky problem to solve

github-actions[bot] commented 2 weeks ago

🚀 Development release published. You may be able to view the changes at https://marimo.app?v=0.6.20-dev7

marimo-team / marimo

Strict Execution #1580

📝 Summary

🔍 Description of Changes

📜 Reviewers