elm / compiler

Compiler for Elm, a functional language for reliable webapps.
https://elm-lang.org/
BSD 3-Clause "New" or "Revised" License

Proposal: special syntax for tasks #908

Closed evancz closed 8 years ago

evancz commented 9 years ago

So far we have a few proposals of how to write tasks in Elm. I will update this list to reflect the latest discussion:

The next question is, how do we run tasks and see the result? The two options seem to be:

I think some assessment metrics for all this are:

evancz commented 9 years ago

@Apanatshka made the very interesting point that we could allow main to have types like this:

main : Task x (Signal Html)
main : Task x (Signal Element)

My main concerns are (1) it forces tasks upon people sooner and (2) may make certain things impossible.

Lots and lots of programs are going to need a Mailbox and this would mean that the setup looks like this:

model : Model

view : Address Action -> Model -> Html

update : Action -> Model -> Model

main =
  async
    let
      mb = await mailbox
    in
      Signal.map (view mb.address) (Stream.fold update model mb.stream)

This means writing some very basic programs introduces people to the type Task x (Signal Html) which I think is pretty intense. It also is not clear how we'd send information from our Model out a port. If that is not possible, I think it is a show stopper for this idea.

On the plus side, it forces your view function to take Address as an argument, which is best practice.

rtfeldman commented 9 years ago

It's true that this forces Tasks upon people sooner, but I think they're going to want them pretty quickly.

JS tutorials (and programming language tutorials in general) tend to start using effects almost immediately. It's a peculiarity of FP languages that they tend to avoid explaining how effect handling works for a while... and I'm not sure it's a good thing.

Another thing worth noting is that a perfectly sound teaching technique is to say "don't worry about : Task x (Signal Html) yet; we'll come back to that later." Or even omit the type signature altogether, and start by explaining "here's how you build a main function that does these things" and only come back to the type details later.

evancz commented 9 years ago

I can imagine all that, but I think we have to figure out the interaction with ports. To be concrete, you have streams hooked up to ports for printing and talking to dropbox. How can we pipe those streams to ports in a world with main : Task x (Signal Html) as the only way of running tasks?

main =
  async
    let mb = await mailbox
    in
      Signal.map (view mb.address) (Stream.fold update model mb.stream)

port print : Stream ()
port print =
    ???

Maybe outbound ports are not a thing anymore? And if you want to use something, you have to wrap it up in some FFI as a task?

print : Task x () -- some kind of FFI to define this

main =
  async
    let
      mb = await mailbox
      model = Stream.fold update initialModel mb.stream
      _ = await Stream.spawnListener (always print) (Stream.filterMap needsPrint (Signal.toStream model))
    in
      Signal.map (view mb.address) model

-- Stream.spawnListener : (a -> Task x b) -> Stream a -> Task y (Stream (Result x b))
-- needsPrint : Model -> Bool
-- Signal.toStream : Signal a -> Stream a

rtfeldman commented 9 years ago

A really simple Task-based FFI that kept the current JS side of things the same as it currently is could just be Task.emit, which "emits an event" to JS:

emit : String -> a -> Task x ()

Then on the JS side maybe you do app.on("foo", handler) instead of app.ports.foo.subscribe(handler), like so:

(the Elm side)

doc = ... -- assume this is a document to be persisted; its type could be anything

downloadDocTask : Task x ()
downloadDocTask = Task.emit "download" doc -- assume this Task ends up getting run

(the JS side)

app.on("download", function(doc) {
  // persist the doc to Dropbox...
});

This is certainly a metaphor used all over the place in JS, so I'd expect it to be very easy to pick up.

TheSeamau5 commented 9 years ago

I think that while the idea of having main be a task, as shown by @Apanatshka, is interesting, the idea of having to introduce the concept so soon to a beginner is worth thinking through.

Let's analyze this concern though by looking at different languages that take this route:

Haskell:

main is of type IO a. This forces early introduction of monads.

main :: IO ()
main = print "Hello world"

Java:

Java forces everything into a class and main must be a static method. This forces early introduction of classes and methods.

public class HelloWorld{
  public static void main(String[] args){
    System.out.println("Hello world");
  }
}

Note that from an architectural point of view, there's nothing inherently wrong with either way of doing things (modulo your favorite arguments in OOP vs FP).

In Java, it is a good idea to put things into separate classes. Having beginners understand how classes work is very important and the earlier one learns about classes, the better the beginner will be at programming Java.

In Haskell, it is a good idea to work with monads. Monads are the main means of abstraction in Haskell and most interesting types in Haskell are either themselves monads or deal frequently with monads. Again, the earlier a beginner can understand monads, the better.

On the other extreme we have languages that ditch main entirely.

Python:

print "Hello world"

Javascript:

console.log("Hello world");

By not even having main both of these languages can more easily focus on the very basic concepts in programming (like, variables, functions, types, etc...) before ever dealing with the harder stuff like classes or monads.

Yet, they both have the problem that as soon as you want to write a program that's slightly more interesting than Hello world, you end up wishing for a main. It is often a good practice to have just one function/global called main somewhere and then have it be the only point of entry for the program.

Interestingly, Elm has placed itself early on in a sweet spot in the middle where main is of type Element. This makes it easy to introduce the simple concepts and then when convenient say that "oh, main can also take Signal Element". At the same time, this approach does not compromise on the whole architecture thing. On the contrary, it's one of those little things that push you towards separating your concerns.

So, if we suppose that we have a super-beginner friendliness scale where Java and Haskell are on one end and Javascript and Python are on the other, where is Elm now and where would Elm be if main is of type Task? (by "super-beginner friendliness", I mean, first impressions...especially first impressions from non-experts)

And then, the complementary question is where does Elm stand in an architecture scale? Like, how well does it encourage good coding practices early on? And where would it stand after making the change of main to Task? (You know, how natural would something like the Elm architecture be or will people have a hard time thinking of something like this and would be too tempted to write sloppy code)

evancz commented 9 years ago

@rtfeldman, the idea is cool, but it has a few weaknesses when compared to ports:

It looks very simple, so it'd be cool if we can address these two points. Spiros suggested making the FFI like this:

foreign <id> <str> : τ

Which I think is pretty typical for an FFI, where you just need to make sure the string name matches up with the type τ or things mess up.

rtfeldman commented 9 years ago

Personally I think the Haskell snippet exemplifies a missed opportunity. Consider:

print "Hello world"

console.log("Hello world");

main = print "Hello world"

public class HelloWorld{
  public static void main(String[] args){
    System.out.println("Hello world");
  }
}

These are the minimal ways to implement Hello World in those languages.

I think the assumption that beginners benefit from type signatures starting at minute zero is almost certainly incorrect, given that languages that do not even allow for type signatures are widely considered more beginner-friendly.

In a world where you introduce beginners to Elm by showing main sans type signature, how much does it matter that its type involves a Task? My guess is that it becomes essentially insignificant in that world - at least up to the point at which you actually want to teach the learner how to use Tasks, of course!

evancz commented 9 years ago

@TheSeamau5, that's a very cool way of framing this! One point between the two extremes of "anything can happen anywhere" and "everything happens in main" is "everything happens in the main module". With the async/await syntax, that might mean writing code like the following:

mb : Mailbox Action
mb = await mailbox

model = Stream.fold update initialModel mb.stream

main : Signal Html
main =
  Signal.map (view mb.address) model

port print : Stream ()
port print =
  Stream.filterMap needsPrint (Stream.fromSignal model)

I think it is sort of weird to have await just floating around, but the code looks great and the learning curve is nice.

Maybe it makes sense to syntactically mark modules that can have port and await in them. They are theoretically different, so maybe it makes sense to annotate that. Like, maybe anything with a module declaration at the top cannot have effects? The main module is the only one with no name?

rtfeldman commented 9 years ago

@evancz I hadn't considered those two points, but that foreign example seems good!

evancz commented 9 years ago

@rtfeldman, I think you are being too kind to Haskell. Very quickly you need to ask "What is this IO thing? How do I do 2 of them?" and you are dragged into some of the most complex parts of the language. I think @TheSeamau5 is justified putting it in the same realm as Java in that regard.

rtfeldman commented 9 years ago

That's fair, but the point I really want to make is that if you want to introduce beginners to Signal and rendering, and let them get pretty deep into that before you introduce them to Task, just omit the type signature from main and come back to it once you think they're ready.

Given dynamic languages' pervasive lack of type signatures, yet reputation for beginner-friendliness, there's a case to be made for not introducing type signatures until later on in a newcomer's learning process regardless. :smiley:

rtfeldman commented 9 years ago

Here's a case for the "no syntax" and "all tasks through main" pairing:

How easy is it to learn?

Almost maximally easy as far as Tasks themselves go. Once you know what a Task is and how andThen/catch work, all you need to do is funnel them into main. There's no new syntax to learn (unless Ports change, in which case only people familiar with the current Port syntax would need to learn new syntax; newbies would not).
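To make the "no new syntax" claim concrete, here is a minimal Python sketch of tasks chained only through ordinary functions. The `Task` class and its `and_then`, `catch`, and `perform` methods are hypothetical stand-ins for Elm's Task, andThen, and catch, not the real library API; a task here is just a wrapped function that returns a tagged result when performed.

```python
class Task:
    """A description of work; nothing happens until perform() is called."""

    def __init__(self, run):
        # run: () -> ("ok", value) or ("err", error)
        self._run = run

    @staticmethod
    def succeed(value):
        return Task(lambda: ("ok", value))

    @staticmethod
    def fail(error):
        return Task(lambda: ("err", error))

    def and_then(self, callback):
        """On success, feed the value to callback, which returns the next Task."""
        def run():
            tag, payload = self._run()
            if tag == "ok":
                return callback(payload)._run()
            return (tag, payload)  # errors skip the callback
        return Task(run)

    def catch(self, handler):
        """On failure, feed the error to handler, which returns a recovery Task."""
        def run():
            tag, payload = self._run()
            if tag == "err":
                return handler(payload)._run()
            return (tag, payload)  # successes skip the handler
        return Task(run)

    def perform(self):
        return self._run()


def parse_int(text):
    try:
        return Task.succeed(int(text))
    except ValueError:
        return Task.fail("not a number: " + text)


def doubled(text):
    # Chaining uses plain method calls and lambdas, no special syntax.
    return (
        Task.succeed(text)
        .and_then(parse_int)
        .and_then(lambda n: Task.succeed(n * 2))
        .catch(lambda error: Task.succeed(0))
    )
```

Under these assumptions, `doubled("21").perform()` yields `("ok", 42)` and `doubled("oops").perform()` yields `("ok", 0)`, because the failure from `parse_int` skips the doubling step and is recovered by `catch`.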

It makes the type signature for main more involved, but this is only a significant impediment to newcomer learning if they are taught main's type signatures before they are taught Tasks--and it's not clear that this is the optimal order in which to introduce those concepts.

Is it inviting to people who don't care about functional programming?

Yes, although realistically less so than other proposals.

What's inviting about it is the baseline appeal of Elm's Tasks to a JavaScript programmer: that you don't have to juggle synchronous APIs, callback-based asynchronous APIs, and Promise-based asynchronous APIs; instead, everything just uses Tasks and you never have to waste energy converting between the different styles.

Other proposals make Tasks even more inviting through the use of sugar, but this proposal does not preclude adding sugar in a later release.

Can it grow to support things like Haxl?

Yes, because it introduces no new syntax; additions that would make something like Haxl nicer remain on the table and unaffected.

TheSeamau5 commented 9 years ago

@rtfeldman I agree that dynamic languages' tendency to lack type signatures is beginner-friendly, but I don't know that it's true that having type signatures around suddenly makes you less beginner-friendly. My concern is really that with Task, there's suddenly some additional plumbing involved. I should've used the do notation in the Haskell example for added clarity.

@evancz It's interesting to allow for top-level operations in a main module. It would allow us to think of an Elm application as just a box with multiple in/out ports. You know, like in electronics and actual Signal processing. My question, which may sound counter-intuitive, is: wouldn't this be too restrictive? Allowing for top-level stuff only at the main module could potentially be slightly confusing... Like, you can use variables (I mean constants) anywhere. You can use functions anywhere. Same for types, records, imports, and everything else in the language anywhere... But somehow, this one file is special? Maybe it's just me, but I find a single function where everything starts to be an easier pill to swallow than a single file where everything starts... I'm probably wrong on this, but I thought I'd share...

evancz commented 9 years ago

I added a new proposal for "how do we do async/await for Haxl and such?" but I don't think it's for 0.15.

That inspired this idea of how to make a module do some stuff. I think this will be easier to learn for people than "everything goes through main".

TheSeamau5 commented 9 years ago

I like the idea of async modules, but it begs the question:

Typically, in Javascript, you bring in a script asynchronously because you want it to do something later on... not right now. As such, you'd like that module to actually perform some effects. But this would be contrary to the idea of "all effects are in the main module".

Why not instead opt for a distinction between main modules and library modules?

From main modules you can import both library and main modules. From library modules you may only import library modules. Communication between main modules can only happen through ports.
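The import rule above is small enough to check mechanically. The following Python sketch is only an illustration: the `check_imports` function, the `"main"`/`"library"` labels, and the module names are all made up for the example and are not part of any Elm tooling.

```python
def check_imports(kinds, imports):
    """Return (importer, imported) pairs that break the proposed rule.

    kinds:   module name -> "main" or "library"
    imports: module name -> list of modules it imports
    Rule: main modules may import anything; library modules may only
    import other library modules.
    """
    violations = []
    for importer, imported_modules in imports.items():
        for imported in imported_modules:
            if kinds[importer] == "library" and kinds[imported] == "main":
                violations.append((importer, imported))
    return violations


# Hypothetical project layout.
kinds = {"App": "main", "Dashboard": "main", "Util": "library", "Parser": "library"}
allowed = {"App": ["Dashboard", "Util"], "Util": ["Parser"]}
forbidden = {"Parser": ["App"]}
```

With this layout, `check_imports(kinds, allowed)` reports nothing, while `check_imports(kinds, forbidden)` flags the library module `Parser` importing the main module `App`.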

I think this perspective could inform how we deal with Task because we would consider main as like a primary output port and everything else is like secondary output ports.

Apanatshka commented 9 years ago

Introducing tasks too soon

For those who are worried about needing to introduce tasks too soon when you only allow tasks through main: you don't! Simply allow main to still have types like Element and Signal Element etc. Then Task only comes up later when you need a Mailbox.

@task and do

I like the flexibility of async/await, though the keywords in the middle of expressions look kinda freaky. I like the "@task and do" proposal best because it's more general. It doesn't look weird when using @maybe. But that's for a later time.

The problem with top-level execution

Given the new proposals with top-level task execution, maybe I need to clear up why I'm so much against this. The problem I have with it is normal modules have only pure definitions so those can go in any order. Once you add executing tasks, suddenly you need an ordering. But you're mixing in pure definitions too, which could be based on the results of certain tasks! Madness ensues, or you need to keep to a strict linear ordering for both executing tasks and definitions. The former is of course awful, but the latter is also no good. I do recognise that top-level execution gives you nicer looking code. I have an idea for that!

New idea: executable section

What if instead of having a section with the comment -- WIRING, we have an actual syntactic section for it? And you're allowed to execute tasks there. There is a linear ordering where a name can't be used before its definition, so evaluation order isn't crazy. We advise people to keep this section very small. You can have port declarations in there. Sorry for the rambling. Here's a demonstration of this executable section idea on the flickr example (see the bottom of the file). @evancz can you add this to the list in the top post?
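The "no use before definition" rule for such a section could be enforced with a single pass. This Python sketch assumes a section is represented as an ordered list of `(name, uses)` pairs and that ordinary pure definitions are passed in as `predefined`; `check_linear_order` and that representation are hypothetical, chosen only for illustration.

```python
def check_linear_order(section, predefined=frozenset()):
    """Walk an executable section top to bottom.

    section:    list of (name, names_used_in_body) in source order
    predefined: pure top-level names, usable in any order
    Returns the names in evaluation order, or raises NameError on the
    first definition that uses a name defined later in the section.
    """
    defined = set(predefined)
    order = []
    for name, uses in section:
        missing = sorted(set(uses) - defined)
        if missing:
            raise NameError(f"{name} uses {', '.join(missing)} before definition")
        defined.add(name)
        order.append(name)
    return order


# Names borrowed from the examples in this thread.
pure = {"view", "update", "initialModel"}
wiring = [
    ("mb", []),
    ("model", ["update", "initialModel", "mb"]),
    ("main", ["view", "mb", "model"]),
]
```

Here `check_linear_order(wiring, pure)` accepts the section as written; moving `main` above `mb` would raise a `NameError`, which is the kind of early, simple error the linear rule is meant to give.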

rtfeldman commented 9 years ago

A similar idea would be to just add a second "special named function" like main - for example:

-- Describes your running program.
main : Signal Html

-- Describes a Task that gets run on startup.
initialize : Task x a

Then you could compose Tasks together as normal (including creating mailboxes etc) inside initialize, and they would get run simply by virtue of the fact that you called that function initialize and put it in the main module.

main itself could then reference initialize and incorporate it into the Signal Html however it pleases.

Apanatshka commented 9 years ago

@rtfeldman You'd have to have initialize : Result x a to be able to use it in main without that becoming a Task too, but sure, that's possible.

evancz commented 9 years ago

I'm kind of into @task/await as a pairing. That way we can have @task module and later @haxl. I don't know what that'd mean for @maybe or @result down the line, but I think await has better learning characteristics in the short term and is worth it based on that.

On the topic of ordering, let's first talk about how it works in a normal let. We build up a graph where each node is a let-bound variable x and it has edges to all variables y that appear in its body. We then run an algorithm to find the strongly-connected components. This is sort of like topological sort, but it lets you have globs of mutually dependent nodes. This sorts out the ordering.
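As a rough illustration of that ordering step, here is a Python sketch using Tarjan's strongly-connected-components algorithm. It is not the compiler's implementation; the `order_definitions` function and the dictionary representation of a let block are assumptions made for the example.

```python
from typing import Dict, List, Set


def order_definitions(deps: Dict[str, Set[str]]) -> List[List[str]]:
    """Group let-bound names into components, dependencies first.

    deps maps each let-bound name to the names appearing in its body.
    Mutually dependent names end up in the same component.
    """
    index: Dict[str, int] = {}
    low: Dict[str, int] = {}
    stack: List[str] = []
    on_stack: Set[str] = set()
    components: List[List[str]] = []

    def visit(name: str) -> None:
        index[name] = low[name] = len(index)
        stack.append(name)
        on_stack.add(name)
        for dep in sorted(deps[name]):
            if dep not in deps:
                continue  # not bound in this let; defined elsewhere
            if dep not in index:
                visit(dep)
                low[name] = min(low[name], low[dep])
            elif dep in on_stack:
                low[name] = min(low[name], index[dep])
        if low[name] == index[name]:
            # name is the root of a component: pop it off the stack.
            component = []
            while True:
                member = stack.pop()
                on_stack.discard(member)
                component.append(member)
                if member == name:
                    break
            components.append(sorted(component))

    for name in deps:
        if name not in index:
            visit(name)
    return components


bindings = {
    "view": {"model"},
    "model": {"update"},
    "update": set(),
    "isEven": {"isOdd"},
    "isOdd": {"isEven"},
}
```

Tarjan's algorithm emits a component only after everything it depends on, so the result can be evaluated left to right. For `bindings` this gives `update`, then `model`, then `view`, with the mutually recursive `isEven` and `isOdd` kept together as one glob.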

Obviously it'd be bad to reorder await because that'd change the meaning of a program!

So let's consider how this might work in the let! proposal:

  1. You cannot reorder := definitions. You can reorder = definitions as long as they do not cross a := definition.
  2. You cannot reorder := definitions. You can reorder = definitions however you want.

I would go with 2 if I did not think too hard about this, but I think both are plausible. Now let's consider how this would work with the async/await proposal.

  1. Recognize all definitions with an await in them. These cannot be reordered. Plain definitions can be reordered, but cannot cross an await definition. (same as 1 for let!)
  2. You cannot reorder await expressions. When building the dependency graph, we also add edges from all definitions with an await in them, to all subsequent definitions with an await. This will maintain the ordering of the await and reveal any globules of mutually dependent tasks which would not work and should be ruled out. I believe this is practically the same as 2 for let!
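Option 2 for await can be sketched by adding ordering edges before sorting. The Python below uses the standard library's `graphlib` (Python 3.9+); the `order_with_awaits` function and the `(name, uses, has_await)` tuples are assumptions for illustration, and the edge direction is one reading of the rule: each awaiting definition is made to depend on every earlier awaiting definition.

```python
from graphlib import CycleError, TopologicalSorter


def order_with_awaits(definitions):
    """definitions: list of (name, uses, has_await) in source order.

    Plain definitions may move freely; definitions containing an await
    keep their relative source order. Raises CycleError when an await
    depends on the result of a later await.
    """
    # graph maps each name to the names that must be evaluated before it.
    graph = {name: set(uses) for name, uses, _ in definitions}
    awaiting = [name for name, _, has_await in definitions if has_await]
    for position, later in enumerate(awaiting):
        graph[later].update(awaiting[:position])
    return list(TopologicalSorter(graph).static_order())


definitions = [
    ("view", {"model"}, False),      # plain, written before its dependency
    ("mb", set(), True),             # mb = await mailbox
    ("model", {"mb"}, False),
    ("listener", {"model"}, True),   # a second await, must stay after mb
]
order = order_with_awaits(definitions)
```

In `order`, `mb` precedes `model`, which precedes both `view` and `listener`, even though `view` was written first. An input such as `[("a", {"b"}, True), ("b", set(), True)]` raises `CycleError`, which corresponds to the "globules of mutually dependent tasks" that should be ruled out.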

Rough Comparison

If choice 1 is chosen, you would have to put your mailboxes at the top of the main module. (I mean, above any uses. Assuming things are properly parameterized and modularized, there is probably only wiring in the main module and this is probably not a huge deal.)

If choice 2 is chosen, people can write code exactly as @Apanatshka described, but without any special syntax.

I think we need to decide on this in the non-top-level case, so let's figure that out, then see if it's fine for top-level as well. Maybe we can create some examples where things go odd with 1 or 2? Maybe it's always fine? I guess do in Haskell makes ordering rules much more strict?

rtfeldman commented 9 years ago

I'm slowly starting to get the feeling that async would do more harm than good. I think it's a bit more newbie-friendly than andThen, but not a ton more (even if people start getting used to it in other languages), and now I'm starting to wonder if newbies will be hurt more by recurring oops-ordering-actually-mattered-here bugs than by having to acclimate to a less familiar syntax.

One of the things I really appreciate about Elm is that its rules may be different than what I'm used to from other front-end languages, but once I understand those rules, they are very simple. It becomes hard to mess things up once you get into the swing of things.

By changing one of the most widely-used invariants in Elm (that ordering doesn't matter inside let bindings), await carries the implicit downside of making it easier to mess things up when writing Elm programs than (for example) just using andThen, where everything just follows normal expression rules.

I'm starting to think the juice isn't worth the squeeze.

evancz commented 9 years ago

Can you think of a concrete example? Would you make the same argument about do notation? Are the answers to these the same for options 1 and 2?

evancz commented 9 years ago

My very smart, but non-programmer friend just made an excellent argument against async/await and any variation on it. Consider the following two snippets:

async
  let
    hi = print "hello"
    _ = (await hi, await hi)
  in
    ...
async
  let
    hi = await print "hello"
    _ = (hi, hi)
  in
    ...

The first program prints hello twice, the second one once. The very simple rule of "you can always substitute an expression in" breaks and my friend sensed it was weird and did not like it. She was pretty into tasklet partly because the word itself is endearing, but also because it seemed shorter to say := and didn't have this substitution issue.
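The difference between the two snippets can be reproduced in Python by modelling a task as a function that performs its effect only when called. The names `print_task`, `await_`, and `log` are hypothetical helpers for this sketch; `log` records output instead of printing so the effect can be counted.

```python
log = []


def print_task(message):
    """Build a task: a description of the effect. Nothing runs yet."""
    def run():
        log.append(message)
    return run


def await_(task):
    """Stand-in for await: run the task now and return its result."""
    return task()


# First snippet: hi names the task, and it is awaited twice.
hi = print_task("hello")
_ = (await_(hi), await_(hi))
first_run = list(log)

log.clear()

# Second snippet: hi names the result of awaiting once.
hi = await_(print_task("hello"))
_ = (hi, hi)
second_run = list(log)
```

After running this, `first_run` holds two entries and `second_run` holds one, so moving `await` from the use site to the definition changes how many times the effect happens.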

rtfeldman commented 9 years ago

Yikes - I didn't even think of that case. Yeah, that's another nail.

Of the six alternatives mentioned in the OP, three are variations on async/await. Another is let! - and I'm definitely on board with that being too big a can of worms to scope creep into 0.15.

That means by process of elimination I'm down to favoring the tasklet and "no syntax" options. I'm leaning towards "no syntax" primarily because the philosophy of "solve problems using existing tools whenever possible" has served Elm well so far as a language.

evancz commented 9 years ago

I agree about the plausible options. I also think "no syntax" has two serious issues:

  1. It does not work with ports, we need to redesign that feature to some sort of FFI. This means the idea of ports is dead and the idea of "go handle it on the outside" is much less explicit. Maybe that's good in the long run, but it seems like a lot for right now. Also, I was not planning to revisit the "how do you make JS bindings in the package repo?" question now, but this would effectively make that a requirement for 0.15.
  2. It feels and looks crappy to me, and I really don't think it's going to be great for people learning. Anyone making a website is going to need a mailbox early on, and they are going to have some task stuff happening. Suddenly they see backtick infix functions, andThen, anonymous functions, fancy types (and if no one is talking about types, they'll see crazy type errors which seems worse).

In a world with tasklet I think we can frame it as "like JS async/await but less wordy" so it's an obvious syntax feature that improves on JS. People can say "hey, look how simple my async code is" and show something somewhat familiar looking. I think that's a big deal.

I think we all sort of know each other's perspectives at this point, so my goal now is going to be showing people not in this discussion some examples in different styles to choose from and see what their feedback is.

Apanatshka commented 9 years ago

  1. There is no reason for "no syntax" not to work with ports! Given the latest Elm 0.15 draft, Ports are either a stream coming in or an address for something going out. So to take the example from your message:

    andThen_ a b = a `andThen` \_ -> b

    main =
      mailbox `andThen` \mb ->
        Stream.subscribe (Port.send print.address) mb.stream `andThen_`
        Signal.map (view mb.address) (Stream.fold update model mb.stream)

    port print : OutboundPort String

  2. You don't necessarily need backtick infix functions, but I think it's less magical than (>>=), don't you? Since when are anonymous functions super confusing? We're targeting web devs right? JavaScript has lots of anonymous functions, especially for callbacks. The fancy types come from tasks, and they will need tasks for a mailbox, so that's not going to change.

    Anyway, if you're worried this looks too ugly, then why not an executable section? It's minimal syntax changes, no confusion about ordering, and top-level task execution.

Another problem with tasklet syntax is that the rules around types get complicated. Once you've really won people over and they start using types to reason about their code (one of the advantages of pure functional programming), this syntax is in the way to understanding what's going on.

If you're going to show examples, please show the best code style you can think of within the "no syntax" option. I know you wouldn't consciously misrepresent it, but confirmation bias is always lurking.

evancz commented 9 years ago

Ah, good points on number 1. My follow up question would be, how do we send static values across? I know I need that. Maybe a port that is not an Outbound or Inbound type?

Do you envision a world in which there is never any special syntax for chaining for anything?

evancz commented 9 years ago

Here is the flickr example in the two styles. I'd like to get one that interacts with ports as well.

TheSeamau5 commented 9 years ago

So, I was looking into the @task syntax and I'm kinda starting to really prefer it to tasklet or nosyntax and certainly more than async-await.

I put together a few simple examples to show how Elm code might look under this regime (obviously, if we're in macro-world then you can't really use := outside the macro, which rules out some of the top-level ideas)

-- Blur an image
main = @task 
  let
    image := getImage "image.jpeg"
    blurredImage = blur image
  in
    display blurredImage

-- Play a song
main = @task 
  let
    song := getSong "song.mp3"
  in
    play song

-- Play a fullscreen video
main = @task 
  let
    video := getVideo "video.mp4"
    _ := requestFullScreen
  in
    play video

-- Race images from web
main = @task 
  let
    catPic := race
      [ getCatPicFromFlickr
      , getCatPicFromGoogle
      , getCatPicFromTumblr
      , getCatPicFromReddit
      ]
  in
    display catPic

I didn't include the types, but you can imagine what the types are more or less for each function.

Apanatshka commented 9 years ago

> Ah, good points on number 1. My follow up question would be, how do we send static values across? I know I need that. Maybe a port that is not an Outbound or Inbound type?

Yes, using port staticValue : Int should work well enough. Seems like a simple and straight-forward solution :)

> Do you envision a world in which there is never any special syntax for chaining for anything?

At least for now andThen seems to work well enough. It also allows you to use types to reason about things. I can imagine adding some kind of chaining syntax at some point, because reading a name bind like a := taskB is slightly nicer than taskB `andThen` \a ->. At the same time I think something like idiom brackets would be more generally useful, that's why I like this @task macro thing, which seems to unify the two. Of course that can be easily abused to make code hard to read, so I'm not sure if it will ever be a good idea to introduce.

> Here is the flickr example in the two styles. I'd like to get one that interacts with ports as well.

These look great as a side-by-side comparison. I love this part of the no-syntax getImage, because it reads great and tells me on a high level what getImage does:

 getPhotoList
  `andThen` choosePhoto
  `andThen` getSizeList
  `andThen` chooseSize 

You could even lift some of these tasks to the top-level.

The wiring may be too big a blob, but otherwise it's fine. I'd split it up like you did with getImage:

getResults : Mailbox String -> Task x (Mailbox String, Stream (Result Http.Error String))
getResults queryMailbox =
  Task.runStream (Stream.sample getImage Window.dimensions queryMailbox.stream)
  `andThen` \results -> Task.succeed (queryMailbox, results)

renderView : (Mailbox String, Stream (Result Http.Error String)) -> Task x (Signal Html)
renderView (queryMailbox, results) =
  Task.succeed <|
    Signal.map3 (view queryMailbox.address)
      Window.height
      (Stream.toSignal "" queryMailbox.stream)
      (Stream.toSignal "waiting.gif" <| Stream.filterMap Result.toMaybe results)

main : Task x (Signal Html)
main =
  Stream.mailbox
    `andThen` getResults
    `andThen` renderView

I think the wiring in the tasklet version is definitely more readable, simply for being less text. But I'd feel a lot more comfortable with top-level task execution if it's in a special executable section. And it's not like the no-syntax version is abysmal, or that's my opinion at least :)

getImage in the tasklet version is not very different from the one in the no-syntax version. Here it's shorter but actually less readable because you need to read the full let in sequential order to even understand what's going on. The no-syntax version is nicer here, because it forces you to be more modular. And in this case it doesn't bloat the code, which is a minor issue with the wiring part.

evancz commented 8 years ago

Thanks for working through this everyone! I have ideas for a nice design. I don't think it makes sense to just keep this issue open for an arbitrarily long time, so I will close. This is on my personal priority queue though.