fonsp / Pluto.jl

🎈 Simple reactive notebooks for Julia
https://plutojl.org/
MIT License
4.94k stars 285 forks source link

include file containing structs #115

Open karlwessel opened 4 years ago

karlwessel commented 4 years ago

It currently is not possible to reevaluate a cell that includes a file which defines a struct.

For example when I have a file mystruct.jl that contains

struct A end

and include it in a notebook with

include("mystruct.jl")

it works when I execute the cell the first time, but the second time I get the following error message:

LoadError: cannot assign a value to variable workspace10.A from module workspace11
top-level scope at mystruct.jl:1
include(::Module, ::String) at ./Base.jl:377
include(::String) at .julia/packages/Pluto/3bf5m/src/react/WorkspaceManager.jl:65
in expression starting at mystruct.jl:1
fonsp commented 4 years ago

This can be fixed! And!

The workspace cleaning could be made more general, and also clean up after code from another file. For example, when your included script adds an extension to Base.sqrt, this could be deleted by Pluto when you remove the include. This would set the stage for notebooks-importing-notebooks reactivity.

fonsp commented 4 years ago

Related, but not quite the same, is that right now you have to restart your workspace (i.e. restart Pluto) when you change a package and want to import it again. 👉 #78

karlwessel commented 4 years ago

If this is more complicated to solve and is only a long term goal should it maybe display a more informative error message mentioning that this is currently not supported and what a workaround may be?

fonsp commented 4 years ago

Hm yes perhaps that's a good idea. Maybe it can be done with the existing error rewriting mechanism - give it a try if you like!

karlwessel commented 4 years ago

I tried to implement this using the rewrite mechanism but can't get the regular expression to work (I have no experience with javascript):

{
    to: "Reloading cells that include files defining structures does not work yet. <Placeholder, replace with something better>",
    from: RegExp("/LoadError: cannot assign a value to variable workspace\\d+\\..+ from module workspace\\d+/")
}
fonsp commented 4 years ago

Hm I see, maybe you can replace "LoadError: cannot assign a value to variable workspace" with "<nice error message>\n\nLoadError: cannot assign a value to variable workspace"

This would still show the original error, but that way you don't have to use a regular expression :)

karlwessel commented 4 years ago

Ok, that is a solution that would work.

Does the replace method not support regexes? According to the documentation I thought it would.

fonsp commented 4 years ago

It should! Try it out on MDN.

karlwessel commented 4 years ago

Ah, that helps! Thanks!

fonsp commented 4 years ago

I spent some time on this and it is a bit more complicated than I thought (edge cases!). Here's an idea:

The problem is that include(filename) "takes that code and runs it here", so it puts things in your scope. This makes things very difficult, because Pluto's code analysis doesn't know what it will place in the scope, and the Pluto runtime is built on the assumption that we know exactly what every cell does to the scope.

The same problem exists with using, but the solution there is to repeat every using X call from anywhere in the notebook at the start of every cell. With using there is practically no cost to this (Julia has a cache), but that's not the case with include. (Although we still have #24)

So my idea is to make include undefined (or give a deprecation warning), and instead, you have ingredients which is just like include, except it wraps the code in a module!

So instead of

inlcude("./wow.jl")
a_thing + 1

you have to write:

other_nb = ingredients("./wow.jl")
other_nb.a_thing + 1

So essentially it's import instead of using, but for scripts. There's more code, but it is explicit, which I love, and it fixes this issue (because the included code lives in a separate scope), which is nice.

It also means that the imported script can't access things from the notebook's scope. But this limitation is part of the solution! Pluto wouldn't know which cells need to be run before an include.

fonsp commented 4 years ago

There is another reason to create our own include: importing with replacements! This is something very cool from the observablehq universe. Say that you have a notebook some_analysis.jl that loads in a dataset called data and then creates a beautiful plot. You could then create a second notebook and re-use that plotting code, but with your own data:

my_data = rand(10)
other = ingredients("./some_analysis.jl", data=my_data)
other.pretty_graph

And the reactivity "goes through the notebooks" - this would work straight away (because ingredients runs again when you change my_data), but it would be even nicer if we could only run the affected cells inside some_analysis.jl, instead of the whole script. I wrote a bit about how that could work here.

Our version would be 1% cooler than observablehq because ingredients does not need to be at top-level. It's also 90% less cool because we use filenames instead of notebook links.

fonsp commented 4 years ago

Paste this into your notebook to try it out:

function ingredients(path::String)
    # this is from the Julia source code (evalfile in base/loading.jl)
    # but with the modification that it returns the module instead of the last object
    name = Symbol(basename(path))
    m = Module(name)
    Core.eval(m,
        Expr(:toplevel,
             :(eval(x) = $(Expr(:core, :eval))($name, x)),
             :(include(x) = $(Expr(:top, :include))($name, x)),
             :(include(mapexpr::Function, x) = $(Expr(:top, :include))(mapexpr, $name, x)),
             :(include($path))))
    m
end
fonsp commented 4 years ago

One thing that would be missing is the equivalent of:

import M: a, b, c

which you would have to write as:

N = ingredients("./somewhere.jl")
a, b, c = N.a, N.b, N.c

or you can use import! But in two steps:

N = ingredients("./somewhere.jl")
import .N: a, b, c

We can also just rewrite code to create whatever syntax we like! I'm thinking of import "./somewhere.jl", but now it is unclear how to access that module from code.

fonsp commented 4 years ago

@karlwessel Have you had a chance to try it out yet? Your eye for edge cases would be very useful!

karlwessel commented 4 years ago

I have tried it out and it works when the notebook is run using Pluto. However the script does not work as a standalone version anymore. It would be really good if that could be fixed since one of the nice things of Pluto notebooks is that they are executable as a standalone script without Pluto.

I'm not sure about deprecating include. Wouldn't it be enough to recommend users to use ingredients instead in the error message that is currently displayed when include is called twice?

I haven't used it yet in a real case scenario but if it works it will make experimenting with own types much easier!

And sorry for taking so long, I'm currently not much motivated to do real stuff :).

fonsp commented 4 years ago

Thanks karl, hope you're doing well! I looked at it again today, and oops it turns out that Pluto does not save your notebook when you use "Submit all changes" or the keyboard shortcut Ctrl+S. This is fixed now: https://github.com/fonsp/Pluto.jl/commit/07d6b80c47cd0639b03fc27d2e2e3a59359555cb

Maybe this is why running the notebook outside of Pluto did not work? It works for my basic test case.

fonsp commented 4 years ago

My test case:

https://gist.github.com/fonsp/6fa976ff427079587182c876a02a8d73

Download both into the same directory.

fonsp commented 4 years ago

One modification that I would like to make before releasing is that that I want it to support URLs, and this should be promoted. Maybe local_ingredients(path) and ingredients(url).

Although the metaphor breaks down here because ingredients should sound better.

karlwessel commented 4 years ago

Ok, i found my Problem. For testing I used the MWE from my first post in this issue and just replaced include with ingredients but did not assign the return value of ingredients to a variable. That only seemed to work because the notebook was executed once before with include and therefore the structure A was still in scope.

when I use M = ingredients("mystruct.jl") and t = M.A() everything works fine in the notebook and when executed from a script. :+1:

karlwessel commented 4 years ago

About using this to include notebooks with replacement:

I am not sure how useful that would be for me, maybe I hadn't really had a use case for that yet. Comparing two different datasets might be useful, yes.

However: what I think is similar to notebooks with replacements is having collapsible function calls. If i want to know what a certain function call does I could click on it and it would show that function call as a subnotebook where I can see which variable has which state.

It would be enough to support this for functions defined in a notebook and not for arbitrary functions.

For that to work some automatic julia to notebook conversion like in issue #132 might be necessary.

fonsp commented 4 years ago

A cool idea! @dralletje mentioned something similar, and I wrote about it here: https://www.notion.so/Notebooks-inside-notebooks-9e99d381c3014e40b310d4846c764a15 Let me know what you think!

About including with replacement: I have been told that this is a common-ish use case for notebooks: https://www.notion.so/Running-notebooks-as-scripts-48cc50347336412796ca7a7527514f73 but you're right that it's mostly for fun

karlwessel commented 4 years ago

Already read those, that's why I thought "That could be cool for debugging functions by looking into their local variables for specific function calls." :)

I would comment directly at those notion posts but am still somewhat reluctant to create yet another account at some site xD.

pbouffard commented 3 years ago

Moving comment here at request of @fonsp :

Could this be extended to include watching external files on which the notebook is dependent? A lot of workflows involve generating a file with some external tools that is the primary input for a notebook, and/or use files as intermediate steps. Not being able to watch these "breaks" reactivity in a sense.

It looks like a basic version of this could be implemented using Clock from PlutoUI.jl, but AFAICT that would trigger downstream cell updates whether or not the file contents have changed. Actually more generally can Pluto prevent this kind of needless downstream updates and/or is there a way to implement such a thing? Even if it wasn't automatic--it would be nice to be able in particular to prevent longer running cells from recomputing, or at least to be able to mark certain cells for manual recomputation. Wow, this comment has gone really OT 🙃.

fonsp commented 3 years ago

Here's the new version of ingredients that supports URLs:

https://binder.plutojl.org/v0.14.7/open?url=https%253A%252F%252Fraw.githubusercontent.com%252Ffonsp%252Fdisorganised-mess%252Fmain%252Fingredients.jl

dralletje commented 3 years ago

I love ingredients for sure, but something similar can be done with a module in your cell:

# ╔═╡ fa79030e-0bf0-11eb-14ff-33e567c100d9
module FileWithStruct include("./file_with_struct.jl") end

# ╔═╡ 1e09dcf8-0bf1-11eb-01e5-9ff7e55f0b18
sprint(dump, FileWithStruct.X) |> Text

This will not work for urls, this won't work with replacements if they ever get added to ingredients, this just does include(...) but better :) However, I still think a bare include should work, just like have using "working", not sure how much we can nudge people into using the module or ingredients syntax.

samuela commented 3 years ago

The ingredient workaround doesn't work quite the same as include as far as I've found. In particular it can change the repr of structs. I have a file foo.jl with some tests in it:

# ...
let b1 = Box([-1.0, -1.0], [1.0, 1.0]), b2 = Box([0.0, 0.0], [2.0, 2.0])
  # I don't want to bother defining == on Boxes.
  @test repr(intersect_boxes(b1, b2)) == "Box([0.0, 0.0], [1.0, 1.0])"
end

It works just fine to include it in scripts/the julia REPL, but with ingredients in Pluto.jl I get an error:

Test Failed at /Users/skainswo/dev/research/julia/boxes/src/boxes.jl:19
  Expression: repr(intersect_boxes(b1, b2)) == "Box([0.0, 0.0], [1.0, 1.0])"
   Evaluated: "Main.boxes.jl.Box([0.0, 0.0], [1.0, 1.0])" == "Box([0.0, 0.0], [1.0, 1.0])"
ERROR: LoadError: LoadError: There was an error during testing
in expression starting at /Users/skainswo/dev/research/julia/boxes/src/boxes.jl:16
in expression starting at /Users/skainswo/dev/research/julia/boxes/src/maze_pluto.jl:35
samuela commented 3 years ago

Here's the new version of ingredients that supports URLs:

https://mybinder.org/v2/gh/fonsp/pluto-on-binder/master?urlpath=pluto/open?url=https%253A%252F%252Fgithub.com%252Ffonsp%252Fdisorganised-mess%252Fblob%252Fmaster%252Fingredients.jl%253Fraw%253Dtrue

There seems to be an error showing the notebook:

The notebook from https://github.com/fonsp/disorganised-mess/blob/master/ingredients.jl?raw=true could not be loaded. Please report this error!
jamblejoe commented 1 year ago

I was pointed here from a Pluto notebook when rerunning a code cell with an include statement. Any progress on the issue here?