fonsp / Pluto.jl

🎈 Simple reactive notebooks for Julia
https://plutojl.org/
MIT License
4.91k stars 284 forks

Assign a variable more than once #276

Closed fonsp closed 3 years ago

fonsp commented 3 years ago

From @ChetanVardhan:

The fact that you can't assign a variable in Pluto more than once, kinda makes all the Jupyter notebooks non working when converted. Any chance that assigning more than once would be allowed from top to bottom, like normal Julia code?

Originally posted by @ChetanVardhan in https://github.com/fonsp/Pluto.jl/issues/182#issuecomment-671046497

fonsp commented 3 years ago

This restriction is exactly what makes reactivity possible!

It is common for programming environments to impose/encourage a new restriction to make it easier for computers to help the programmer, and for other humans to reason about the code. For example, Dijkstra talked about how removing the GOTO statement would make software much easier for mathematicians to write proofs about. Functional programming shows how immutability can make software clearer. In some cases, computers can use "laziness" to even make your functional programs faster!

In the case of a reactive notebook, allowing multiple definitions would mean that there is no longer an obvious way for the computer to figure out in which order to run your cells. For example, if you write:

md"My name is $name"
name = "Alice"
name = "Bob"

then what should the output be?
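To make the ambiguity concrete, here is a minimal sketch of the bookkeeping a reactive notebook needs. It is not Pluto's actual implementation: the `Cell` struct and the `definers` and `ambiguous_names` helpers are hypothetical names invented for this illustration. The idea is that run order is derived from which cell defines each global, so a name with two defining cells has no unique cell to run first.

```julia
# Hypothetical mini-model of a reactive notebook (not Pluto's internals):
# each cell records the global names it defines and the names it references.
struct Cell
    code::String
    defines::Vector{Symbol}
    references::Vector{Symbol}
end

# Map each variable name to the indices of the cells that define it.
function definers(cells::Vector{Cell})
    table = Dict{Symbol,Vector{Int}}()
    for (i, cell) in enumerate(cells), name in cell.defines
        push!(get!(table, name, Int[]), i)
    end
    return table
end

# Names defined by more than one cell: there is no unique cell to run
# before the cells that reference them.
function ambiguous_names(cells::Vector{Cell})
    return sort([name for (name, ids) in definers(cells) if length(ids) > 1])
end

cells = [
    Cell("md\"My name is \$name\"", Symbol[], [:name]),
    Cell("name = \"Alice\"", [:name], Symbol[]),
    Cell("name = \"Bob\"", [:name], Symbol[]),
]

ambiguous_names(cells)  # returns [:name]
```

With a single definition per name, the first cell would depend on exactly one other cell and the order would follow from the code alone.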

fonsp commented 3 years ago

But perhaps you are trying to do one of the following things:

Run the same code multiple times with different inputs

In this case, you should write functions, and call the function multiple times, with different inputs.
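As a small illustration of this advice (the `greeting` function and the two result names are made up for this example), the name example from earlier in the thread could be written with one function and one uniquely named result per input:

```julia
# `greeting` is a hypothetical helper for illustration, not part of Pluto.
greeting(name) = "My name is $name"

# In a notebook, each assignment below would live in its own cell,
# and each global name is defined exactly once.
greeting_alice = greeting("Alice")
greeting_bob = greeting("Bob")
```

Both results can then be shown side by side, and neither depends on which cell was run last.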

Modify a 'state variable' across multiple cells

Because Jupyter makes this possible, it is common for Jupyter notebooks to use this technique. In Pluto, however, it is not recommended to do so. It makes reactivity confusing, and it makes your notebook harder to understand.

You can probably formulate your problem in a different way, such that it does not require a mutable state. Have a look here: https://github.com/fonsp/Pluto.jl/wiki/%E2%9A%A1-Writing-and-running-code
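One possible reformulation, sketched with made-up variable names: instead of reassigning the same name step by step, give each intermediate result its own name, so that every cell defines a new global and nothing is mutated.

```julia
# Jupyter style (would trigger a multiple-definitions error across Pluto cells):
#   data = [3, -1, 4, -1, 5]
#   data = filter(>(0), data)
#   data = data ./ sum(data)

# Reactive style: one name per step, one step per cell.
raw_data = [3, -1, 4, -1, 5]
positive_data = filter(>(0), raw_data)
normalized_data = positive_data ./ sum(positive_data)
```

Changing `raw_data` would then re-run the two dependent cells, and every intermediate value stays available for inspection.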

VarLad commented 3 years ago

Hmmm... since one has the ability to "execute"

Since one can execute a code cell in Pluto, the user decides which cell gets executed. In the above example, if one presses Shift + Enter on the name = "Alice" cell, the output is My name is Alice. If the user executes the name = "Bob" cell, the output is My name is Bob. Is something like that possible? 😅

When checking reactivity, just skip the lines where a reassignment has been made! For example, if name = "Alice" gets executed, just ignore name = "Bob", and vice versa. Is that possible?

fonsp commented 3 years ago

But now the history of which cells were executed when becomes part of the notebook state! This is exactly what reactivity wants to avoid.

When you send someone a notebook, and they open your notebook, they should see exactly the same as you.

VarLad commented 3 years ago

You're right on this one

VarLad commented 3 years ago

But perhaps you are trying to do one of the following things:

Run the same code multiple times with different inputs

In this case, you should write functions, and call the function multiple times, with different inputs.

Modify a 'state variable' across multiple cells

Because Jupyter makes this possible, it is common for Jupyter notebooks to use this technique. In Pluto, however, it is not recommended to do so. It makes reactivity confusing, and it makes your notebook harder to understand.

You can probably formulate your problem in a different way, such that it does not require a mutable state. Have a look here: https://www.notion.so/Writing-and-running-code-7ddc40b7f1a24b809690954c373b20c8

That's awesome advice! I'll try applying it in the future! Thanks 😄

VPetukhov commented 3 years ago

It is common for programming environments to impose/encourage a new restriction to make it easier for computers to help the programmer, and for other humans to reason about the code

Your reactive idea is absolutely awesome! But the conflicts you mentioned could be resolved if cells were executed in the order they're written. And, to my understanding, the possibility to write code cells in an arbitrary order is exactly an example of the behaviour that should be discouraged. And if there is a restriction on the order, then multiple assignments of the same variable are not a problem: only the last assignment above the given cell matters for it. If I remember right, this idea is implemented in the Datalore notebooks. Are you sure that having unordered cells is more important than multiple assignments? :)

But perhaps you are trying to do one of the following things:

One more reason for re-assigning variables is usage of temporary values with short names. Let's say, I want to show and save multiple plots. It can be done with the following code:

using Plots
x_vals = rand(100)

begin
    plt = plot(x_vals)
    savefig(plt, "line.png")
    plt
end

begin
    plt = scatter(x_vals)
    savefig(plt, "scatter.png")
    plt
end

Of course, there are ways to write it without the plt variable, but there are thousands of such small examples where it's simply convenient to store the variable for some time. Moreover, wrapping all such code in functions often brings too much overhead, at the very least because it's not always clear which variables should be returned from the function for later use...

fonsp commented 3 years ago

For local variables there are two solutions:

let
    plt = scatter(x_vals)
    savefig(plt, "scatter.png")
    plt
end

and

begin
    local plt = scatter(x_vals)
    savefig(plt, "scatter.png")
    plt
end

Both do not create a global variable called plt. To me, this is good style.

About your suggested alternative: it is likely a matter of taste, but to me, this sounds like a step backwards. What I like about the paradigm is that it is declarative instead of imperative (at the notebook level), and that it encourages concepts from functional programming like immutability. These are limitations, but limitations that are known to make computer code easier to reason about. In the case of a notebook, it means that you can understand single cells without knowing about the structure (i.e. state) of the notebook.

fonsp commented 3 years ago

Specific examples like the one you posted @VPetukhov are really helpful for us to learn what the common difficulties with reactivity are! So feel free to send some more :)

VPetukhov commented 3 years ago

Thanks so much for the answer, @fonsp! The local variables solution should actually work! And your comment about the declarative paradigm helped a lot in understanding the idea. Maybe it's worth adding to the readme :) And I agree that this idea is quite extraordinary and we shouldn't step backwards. Though I still think that having code all around makes reading much harder. How about having an "Order cells" button, which simply orders the cells according to what you have in the .jl file? Or, even better, a corresponding checkbox. When it's set to "Ordered", the UI would ignore the "╔═╡ Cell order:" part. That would make the ordering painlessly reversible. Alternatively/additionally, having go-to-definition functionality would also help with navigation.

fonsp commented 3 years ago

Jump to definition is definitely a must! I've added an issue https://github.com/fonsp/Pluto.jl/issues/304

Oh interesting idea. The way I thought about it - the author gets to decide what order to put their cells in - if you want linear order, then you just write your notebook linearly. But what I found interesting is that you can move "helper functions" to the end of your notebook, if you (the author) think that their implementation is not relevant to understanding the code.

Do you think the "linear switch" would be useful for your own notebooks or when viewing notebooks by others?

fonsp commented 3 years ago

Another example

Before

begin
    x = 123
    apple = sqrt(x)
end

begin
    x = 123
    orange = x ^ 2
end

This is not legal, because both cells define a global x. Instead, you should use the let block:

After

apple = let
    x = 123
    sqrt(x)
end

orange = let
    x = 123
    x ^ 2
end

let is exactly like begin, except it has a scope. Variables defined inside a let block (like x) are only defined within the block.

In Julia, most control blocks (begin, let, try, if) have an implicit return: the value of the last subexpression is the value of the block itself.
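The difference in scope, and the implicit return value, can be checked with a short sketch. The names `leaked`, `hidden`, `a` and `b` are made up for this example; `@isdefined` is a standard Julia macro.

```julia
# `begin` has no scope of its own: `leaked` stays visible after the block.
a = begin
    leaked = 2
    leaked ^ 2      # last expression is the value of the block
end

# `let` introduces a new scope: `hidden` exists only inside the block.
b = let
    hidden = 3
    hidden ^ 2      # last expression is the value of the block
end

(a, b, @isdefined(leaked), @isdefined(hidden))  # (4, 9, true, false)
```

This is why a second cell can reuse a name such as `hidden` inside its own let block without triggering a multiple-definitions error.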


So why does Pluto suggest a begin block and not a let block? Well... I'm not sure... I thought that this would be an easier transition for some. Wrapping in let is also a bit tricky, because you'd want the assignment to move outside of the block.

yoninazarathy commented 1 year ago

This was a good discussion and very helpful. I would recommend that Pluto tell users about using let blocks in place of begin blocks in the error message. That is, the Multiple definitions for __varname__ error message could say: Multiple definitions for __varname__, use a let...end block to enforce local scope if needed. Or something of that sort.