Multiple outputs and inputs per cell

fonsp commented 4 years ago

A cell input like:

🍕 = 1
🥞 = 2

should not be legal, because it is not a single expression. This might be unexpected (and part of the learning curve) for new users, and we could accept it without breaking the paradigm by:

Splitting up the expression into two cells.
Giving the cell two outputs, which are stacked on top of each other with a clear division between them.

A nice hybrid solution is to do the first option and to design the editor layout so that all cells flow into each other when code is folded.

Note: The cell

🍕 = 1; 🥞 = 2

which is syntactic sugar for

begin
    🍕 = 1
    🥞 = 2
end

is a single expression, which is not an assignment, but an evaluation that performs two global assignments. Parsing this cell should be legal (as a single expression), but running it might give an error (depending on what we decide to do with globals).
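A minimal sketch of the distinction, using only Julia's built-in parser: `Meta.parse(::String)` expects exactly one expression and throws a `Meta.ParseError` when extra tokens follow it. The helper name `is_single_expression` is hypothetical and is not part of Pluto; note that it also returns `false` for code that is simply invalid.

```julia
two_lines = "🍕 = 1\n🥞 = 2"
block = "begin\n    🍕 = 1\n    🥞 = 2\nend"

# Hypothetical helper: true if `code` parses as exactly one expression.
# A parse error (extra tokens, or invalid syntax) is reported as false.
function is_single_expression(code::AbstractString)
    try
        Meta.parse(code)
        return true
    catch err
        err isa Meta.ParseError || rethrow()
        return false
    end
end

two_lines_ok = is_single_expression(two_lines)  # two top-level expressions
block_ok = is_single_expression(block)          # one begin ... end expression
block_head = Meta.parse(block).head             # the expression head is :block
```

The two-line input is rejected by the parser as a whole, while the `begin ... end` version comes back as one `Expr` with head `:block`.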

malyvsen commented 4 years ago

I don't see a reason why this shouldn't be legal from a paradigmatic point of view. I think this should be equivalent to two cells, one doing x=1, and the other y=2 (no emoji keyboard yet, sorry). It's not only more intuitive for new users, but also convenient, letting you, e.g., bulk show/hide some related statements by showing/hiding a code cell.

malyvsen commented 4 years ago

From a "how do we do this" point of view: I'd have variables rather than cells be the nodes of the dependency graph. Each cell can then execute imperative code; once it's done, the global variables it set are checked for changes, and changes are propagated down the graph.

fonsp commented 4 years ago

Internally splitting it up into two cells is a cool solution! About having the core track variables instead of cells: not all cells define variables - some are just expressions with output, package imports, etc. so you would have to track all of them too.

But let's leave this for later?

malyvsen commented 4 years ago

More general then - we should track expressions. But yes, makes sense to leave this for later, and for now have 1 expression per cell.

fonsp commented 4 years ago

The issue of multiple expressions per cell is solved: it is not legal. (To be clear, you can wrap multiple top-level expressions into a single begin ... end or let ... end block.) The question is now how to show a nice error to the user (#39).

We might re-open this one day, by making multiple expressions return multiple outputs.

jebej commented 4 years ago

I'm sure this must have been discussed already, but why not allow multiple lines, and then just do the equivalent of implicitly wrapping in a begin ... end block, instead of having the user do it manually?

This would not require having multiple outputs, and seems to make sense.
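As a rough illustration of what implicit wrapping could look like, the sketch below parses each top-level expression in a cell's source and combines them into one block. The helper `implicit_block` is hypothetical and is not how Pluto works; it only relies on `Meta.parse(str, start)`, which returns the next expression together with the index where parsing should resume.

```julia
# Hypothetical helper (not part of Pluto): collect every top-level expression
# in a cell's source and, if there is more than one, wrap them in one block.
function implicit_block(code::AbstractString)
    exprs = Any[]
    pos = 1
    while pos <= ncodeunits(code)
        # Parse one expression starting at `pos`; `pos` moves past it.
        ex, pos = Meta.parse(code, pos)
        # `nothing` is returned when only whitespace or comments remain.
        ex === nothing || push!(exprs, ex)
    end
    return length(exprs) == 1 ? only(exprs) : Expr(:block, exprs...)
end

wrapped = implicit_block("a = 1\nb = 2")

# Evaluate in a fresh module; a block evaluates to its last expression.
value = Core.eval(Module(), wrapped)
```

Because a block evaluates to its last expression, the cell would still produce a single output, which is the point made in the comment above.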

fonsp commented 4 years ago

Hey!

This is a good question, it comes up often and I have spent some time thinking about it.

Your suggested solution might be implemented some day, but I'd like to keep the current behaviour for now. The main reason is:

Reactive notebooks work best if you split your code into multiple cells.

For example, if you have a single cell defining 20 parameters for your model, then changing one parameter will cause the re-evaluation of every cell that (indirectly) depends on any parameter. This might be slow, making it less fun to play with your model (disaster!), and it's confusing - you feel like reactivity is not doing what it should do.

Solution

So you should make as many cells as you can! Problem solved? Well, not quite: we get the feedback that cells can take up too much vertical space and are more cumbersome to work with. I have implemented a lot of features to make this experience better, and I have more plans, notably this one. Already implemented are:

decreased vertical spacing between cells.
include the top-level assignee x in the case of x = 123 in the output - Julia only outputs the value. This way you can hide the code and still see what variable is defined.
keyboard shortcuts to jump between cells (PageUp and PageDown).
selecting multiple cells to move/fold/delete them all at once.

Two more arguments

Like I mentioned before, it might be confusing that changing

a = 1
b = 2

to

a = 1
b = 345

will also re-run all cells that depend on a. I feel like explicitly writing those into a single block makes this more clear.
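To make the re-run behaviour concrete, here is a toy model of cell-level reactivity. It is only a sketch under simplifying assumptions: `ToyCell` and `cells_to_rerun` are hypothetical names, the definitions and references of each cell are written out by hand, and none of this reflects Pluto's actual internals.

```julia
# Hypothetical toy model: each cell lists the globals it defines and reads.
struct ToyCell
    name::String
    defines::Set{Symbol}
    reads::Set{Symbol}
end

# Names of all cells that must run when `changed` is edited:
# the cell itself plus every cell that (indirectly) reads what it defines.
function cells_to_rerun(cells::Vector{ToyCell}, changed::ToyCell)
    stale = ToyCell[changed]
    queue = ToyCell[changed]
    while !isempty(queue)
        current = popfirst!(queue)
        for cell in cells
            if cell ∉ stale && !isdisjoint(cell.reads, current.defines)
                push!(stale, cell)
                push!(queue, cell)
            end
        end
    end
    return [c.name for c in stale]
end

uses_a = ToyCell("uses_a", Set([:x]), Set([:a]))
uses_b = ToyCell("uses_b", Set([:y]), Set([:b]))

# a and b defined together in one cell: editing b also re-runs uses_a.
params = ToyCell("params", Set([:a, :b]), Set{Symbol}())
combined_result = cells_to_rerun([params, uses_a, uses_b], params)

# a and b defined in separate cells: editing b leaves uses_a alone.
cell_a = ToyCell("cell_a", Set([:a]), Set{Symbol}())
cell_b = ToyCell("cell_b", Set([:b]), Set{Symbol}())
split_result = cells_to_rerun([cell_a, cell_b, uses_a, uses_b], cell_b)
```

In this model the granularity of reactivity is the cell, so grouping `a` and `b` makes every dependent of either variable stale whenever the cell is edited.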

Also, I have a more geeky argument: Julia is a Lisp - each code block has a value. I think this is really charming, and for this reason, a cell in Pluto is a unit: one cell has one expression, returning one object (hence displaying one thing).

But the main thing is: I want UI to motivate you to split your code into cells - it's still WIP but I think it will be nice.

StefanKarpinski commented 4 years ago

What if you took a page out of the REPL and instead of using different keys for evaluation versus newline, just used enter for both and as soon as the expression is complete, you evaluate? That would make it technically difficult to even enter two different expressions in a single block. And if someone does manage to do this (say by cut-and-paste), then you could just split the cell automatically into two.

shashi commented 4 years ago

I love @StefanKarpinski's idea, what a great way to turn this situation into an opportunity!

I find ctrl+enter shift+enter really frustrating, and a straight up REPL like experience would be amazing!!

StefanKarpinski commented 4 years ago

Note that this works in Julia but not in Python because in Python you can't tell when an expression ends. Fortunately, Pluto isn't trying to be a frontend for every language, so it doesn't have to worry about this 😁

fonsp commented 4 years ago

Very nice! I was stuck on a similar idea, but this is realistic and cool.

How about it doesn't run, but just creates and focuses a new cell? (Return to run makes me a bit scared in the REPL - Return is no longer a no-consequence button.) Instead, you have the shortcut Ctrl+S to submit all changes, into a single reactive run (there's also a button for it, but it's still a bit hidden) - so you would ENTER a couple of times to create some cells, and then submit+run them all at once (explicitly).

Also nice would be to select multiple cells and then wrap them together into a let or begin block.

And doing arrow-up/arrow-down should jump between cell borders, like pageup+pagedown right now.

Exciting!

StefanKarpinski commented 4 years ago

What's the danger of evaluating code? I never have this worry in a normal REPL? Is it that one evaluation might trigger a lot of computation because of the reactive part?

fonsp commented 4 years ago

I'd like to try it of course, but I mentioned it because unpredictable (reactive) evaluation is one of the common concerns for Pluto users (#13 #116 #229 #298).

ppalmes commented 4 years ago

this is really ridiculous to not be able to run multiline codes in a cell without placing them in begin...end. at least have a config that this is possible. it is quite obvious we group codes in the same cell.

ppalmes commented 4 years ago

even the generated code with begin...end blocks are not really the default way you write code.

ppalmes commented 4 years ago

and can the generated code be ordered correctly based on the graph deps? having implicit ordering without the physical ordering is very misleading or confusing.

jebej commented 4 years ago

Maybe this issue should be renamed to be about multiple inputs per cell, as it seems that the discussion has shifted more in this direction.

marius311 commented 4 years ago

The suggestions here are interesting and may be worth prototyping/exploring, but to me it seems like it basically boils down to how much credit you want to give the user, and right now I see things leaning towards not giving the user much credit (e.g. "hey user, you're not splitting things into enough cells because you don't understand that hurts reactivity, so we're going to do it for you", or "hey user, you're not going to realize changing a=1; b=1 to a=1; b=345 is going to re-evaluate cells that depend only on a, so we're going to split them for you").

I would really encourage the opposite, give the user power, just make sure the rules are clear. The solution which corresponds to that to me is clear: implicitly wrap every cell in a begin ... end. I've yet to see a technical reason why this is not possible (apologies if I've missed one), it's all based on trying to be opinionated. But if the user wants to write,

a = 1
b = 2

in a single cell, why should Pluto make it impossible or at least annoying for them to do so (manually wrapping in begin ... end is annoying, as is clicking the button to do it)? Sometimes lines of code make sense grouped together, and a cell is a perfect visual indication of a grouping. And it's straightforward to understand dependent computation is per-cell, so if the user wants to put them in the same cell, I'd say let them.

ppalmes commented 4 years ago

exactly. why even to have this long thread to explain the importance of grouping logical blocks in a cell.....

fonsp commented 4 years ago

@ppalmes Please give us space to talk about this subject in a friendly way - your comments here and on slack are inappropriate.

ppalmes commented 4 years ago

sorry. i’m just frustrated because it looks a very good package if the discussion is more open-minded instead of dismissing our point of view. thanks for taking into consideration the main point of discussion.

jebej commented 4 years ago

I would like to reiterate the desire to allow for multi-line input without begin and end. I keep seeing notebooks (eg. many in the computational thinking class) that have cells with a begin block.

Clearly it is convenient, useful and arguably plain old sensical to let people write more than one line of code in a single cell.

The argument that notebooks might get less fun if they become slow is trumped (IMO) by the fact that it is even less fun to have to write begin and end for every other cell.

Regarding the arguments above:

For example, if you have a single cell defining 20 parameters for your model, then changing one parameter will cause the re-evaluation of every cell that (indirectly) depends on any parameter.

This might seem like an issue in theory, except that in practice I would expect parameters defined in a single cell to be logically related such that if another cell depends on one of the parameters, then it probably depends on at least some of the other parameters too. Conversely, I would not expect someone to define a bunch of parameters in a single cell that individually apply to different cells.

will also re-run all cells that depend on a. I feel like explicitly writing those into a single block makes this more clear.

In this example, I would expect the cell that depends on b to also depend on a. If that's not the case, then yes, you probably wanted to define a and b separately (since they are not logically related).

jebej commented 4 years ago

We shouldn't let a hypothetical-worst-case performance issue dictate how we structure code given how important structure is to our reasoning process.

infogulch commented 3 years ago

I sympathize with the very real UI problem of displaying multiple outputs. And I like the idea of encouraging users to use the tool correctly, while giving them choice.

A statement like 🍕 = 1; 🥞 = 2 serves both of these concerns neatly; the user is suppressing the output of the first statement making the UI problem go away, and since they opted-out explicitly it's ok to let them have that choice.

From that perspective I think this should also be allowed:

🍕 = 1;
🥞 = 2

This also caters to both concerns: The UI problem is obviated because there is still only one statement with output, and allowing this form gives the user a choice of how they interact with the tool.

StefanKarpinski commented 3 years ago

What was wrong with the idea of just evaluating the cell if the user presses enter when the input expression is complete? That seems like it solves this problem while also simplifying how input is done since one doesn't need to learn two different kinds of "enter" (plain enter versus "shift enter"). Then this issue simply doesn't come up since two separate expressions in one input cell become impossible to input in the first place.

ederag commented 3 years ago

This is a good idea, indeed.

But it does not solve everything, as there are plenty of legitimate cases where several complete expressions must be entered in a single cell. For instance:

group struct definition and constructors (#732),
array creation followed by modification of a few elements,
plots creation followed by tweaking.

The last ones require a cell to work as a unit (multiple outputs would not make sense).

My preferred stance would be

Enter evaluates the cell if the input is complete, as you suggested above.
Shift+Enter or Ctrl+Enter to prevent evaluation and just enter a newline
multiple lines are legit and do not require an explicit begin ... end.

But I understand why decisions about this are postponed.

StefanKarpinski commented 3 years ago

That does seem like a sensible approach: most users never need to think about anything besides enter which will just work 99% of the time and create separate cells as is preferred. For users who do need multiple inputs in a single cell, they can either do the begin/end thing or learn about shift+enter.

ppalmes commented 3 years ago

it is great that this topic is still active. logical blocks just make sense because it is like writing multi-sentences vs multi-paragraph. the building block should be a paragraph and a sentence is the trivial case, not the rule. even adding ‘begin … end’ seems redundant if you have space separator or cell separator to signal different logical blocks.

cossio commented 3 years ago

About having the core track variables instead of cells: not all cells define variables - some are just expressions with output, package imports, etc. so you would have to track all of them too.

What about viewing expression cells that don't explicitly define variables as defining an Output[cell] variable which contains all outputs. Mathematica (for example) does this.

lungben commented 3 years ago

I think it would be better to use the logging functionality or capturing of stdout (for println, etc.) for this purpose, see #437

infogulch commented 3 years ago

My suggestion is to split this feature request into two separate requests: 1. Allow multiple inputs, 2. Allow multiple outputs. Then note that allowing multiple inputs incurs zero cost related to redesigning the architecture and UI, and is a pure UX benefit. The feature request to add multiple outputs can be considered separately.

Allowing multiple inputs can be implemented conceptually very cleanly. First, note that allowing multiple inputs on the same line is already valid today, as stated in the opening post:

🥗 = 0; 🍕 = 1; 🥞 = 2

Conceptually, all but the last expression have their output suppressed with ;, and therefore the block as a whole has only one output. I propose extending this conceptualization to multiple lines. That is, multiple input lines are allowed as long as all expressions except the last have their output suppressed, meaning that the block as a whole has only one output:

🥗 = 0;
🍕 = 1;
🥞 = 2

The biggest benefit of this design imo is that the user is very explicitly suppressing the output of prior expressions, and will not be surprised that multiple outputs are not displayed. Combined with ederag's comment that describes valid use-cases for multiple input lines (all of which I encountered the first time I used Pluto) this seems like a slam-dunk.

ParadaCarleton commented 1 year ago

Any further work on this? I think this is one of the main things holding me back from using Pluto instead of Quarto.

schlichtanders commented 11 months ago

I'm sure this must have been discussed already, but why not allow multiple lines, and then just do the equivalent of implicitly wrapping in a begin ... end block, instead of having the user do it manually?

This would not require having multiple outputs, and seems to make sense.

Motivation

I am currently in the process of extending Pluto to other languages (Python and R) where I also decided to allow for multiple lines. I return the last output, like what Jupyter would be doing, and what I think people expect.

I would like to push this forward so that Julia does not look clunkier than Python/R by always throwing errors when someone just uses multiple lines.

My reasoning for auto-wrapping into begin-end block

Reason one: Following the user's intention of writing a single cell. The error gives two options: splitting it up into multiple cells or wrapping it into a begin-end block. As a user, I was evidently writing a single cell, so my fallback intention is that I want a single cell.

Reason two: Following the user's expectation of Jupyter notebook output. As said above, I am very happy to follow Jupyter output logic for Python and R. For Julia it would also mean returning the last output value only.

fonsp / Pluto.jl