racket / rhombus

Rhombus programming language
Other
355 stars 63 forks source link

Make an RFC for multiple values #78

Closed sorawee closed 4 months ago

sorawee commented 5 years ago

Do we need multiple values? If so, what's the best way to make it integrate well with the language? Right now, it's not.

As an example, if I have a function f that returns two values, and I want only the first returned value to get passed to function g, the best I can do is:

(let-values ([(x _) (f foo bar)])
  (g x))

Is there anything better that we can do?

AlexKnauth commented 5 years ago

Can we have tuples instead?

sorawee commented 5 years ago

Yeah, that's the idea. Multiple values, as I understand, is only needed for performance (https://www.cs.indiana.edu/~dyb/pubs/mrvs.pdf). But if we are going to have some kind of static analysis (see #57), it might be possible to make returning a tuple as efficient as multiple values.

sorawee commented 5 years ago

How does Haskell deal with returning tuples?

jeapostrophe commented 5 years ago

My personal taste is that multiple values are theoretically beautiful and elegantly expose the ability of an advanced compiler to use a custom calling convention, but are extremely annoying to actually use in almost all circumstances in Scheme & Racket, for the reasons you mention.

rocketnia commented 5 years ago

I think Racket 2 would be simpler to use if it didn't require anyone to know about multiple-value return. It could still exist if it's needed for performance, but it can be relegated to a specialty thing.

Thinking about it in more of a positive brainstorming way, as in "Is there anything better that we can do?" I have several ideas.

To start with, procedures could have the same output convention as their input convention:

To make this easier to work with, "multiple values (with possible keyword arguments)" (which we might dub a "values-and-keywords collection") could be the primary first-class value type of the language. Variables would abstract over that concept the same way expressions do:

; Bind `div-results` to `(values 5 4)`.
(define-values div-results (quotient/remainder 124 24))

; Bind `q` to `5` and `r` to `4`.
(define-values (q r) div-results)

With this infrastructure in place, some quality-of-life conveniences would suggest themselves:

I have no idea what effect any of this would have on performance, nor how severe everyone's headaches would be as they tried to maintain each other's clever nested values-and-keywords data structures. :-p So I don't necessarily support it, but it's a fun thought experiment.

jeapostrophe commented 5 years ago

Check out this code from over nine years (don't actually know when I wrote it because I had this code outside of git back then): https://github.com/jeapostrophe/exp/blob/master/values.ss (plus the racket update: https://github.com/jeapostrophe/exp/blob/master/values.rkt ) --- It implements a bunch of these ideas, although not as consistently as you suggest.

I don't really support this idea either, but it is useful to write down and think about.

rocketnia commented 5 years ago

Whoa, nice! :)

jackfirth commented 5 years ago

I think there's another cost to supporting multiple values: it encourages people to group values together in a semantics-free way, instead of creating a struct type with named fields that describe what the values are actually for. Semantics-free representations are great for generic code! They're terrible for everything else, since non-generic code tends to have, y'know, actual semantics associated with it.

dedbox commented 5 years ago

Do we need multiple values? If so, what's the best way to make it integrate well with the language? Right now, it's not.

Whoa, let's slow down a little.

Do we need multiple return values? Not to the extent that Racket wouldn't be Turing-complete if we removed them, but they are so convenient I consider them indispensable.

I'll outline just one way multiple return values make my life easier, but it's a big one.

Racket conceptualizes function invocation as the application of a procedure to an argument list. Since lists can have any number of elements, this perspective leads naturally to multivariate functions—such as thunks or functions with rest-args—and multivariate functions compose neatly with multiple return values.

There are useful symmetries between values and list; and between compose, curry, and apply. In combination, they are a potent DSL for multivariate function composition. It's no coincidence Algebraic Racket's prelude shortens most of these names to one or two characters.

Take the totem from my RacketCon talk slides, which looks something like this in Racket:

(compose (curry apply values) … list)

This pattern crops up everywhere when I'm using a compositional style. Algebraic Racket is designed around these functions, to support function composition as a deliberate programming style. My code is so much cleaner now, I'd need a good reason to go back.

Here's a real example, in the event monad:

(define (async-values-evt . evts)
  (if (null? evts)
      (return)
      (do let identified (φ e (fmap (>> :: e) e))
          ((e . x)) <- ($ choice-evt (map identified evts))
          xs <- ($ async-values-evt (remq e evts))
          ($ return (:: x xs)))))

A direct translation to Racket would look like this:

(define (async-values-evt . evts)
  (if (null? evts)
      (handle-evt always-evt (λ _ (values)))
      (let ([identified (λ (e) (handle-evt e (curry cons e)))])
        (replace-evt
         (apply choice-evt (map identified evts))
         (match-lambda
           [(cons e x)
            (replace-evt
             (apply async-values-evt (remq e evts))
             (λ xs (apply (λ ys (handle-evt always-evt (λ _ (apply values ys))))
                          (cons x xs))))])))))

Every line examines, generates, or brings together multiple arguments and multiple return values. Several lines do more than one of these things. Every (λ xs ...), for example, captures an indeterminate number of return values and then applys a function to them as its argument list. Each instance can be eliminated by composeing a curryed version of its body with list, and any list can be converted into multiple values by applying the values function to it.

More precisely, (curry apply values) and (curryr call-with-values list) are duals with respect to multivariate function composition, provided the multi-valued components of both can be wrapped in thunks.

As an example, if I have a function f that returns two values, and I want only the first returned value to get passed to function g, the best I can do is:

(let-values ([(x _) (f foo bar)])
  (g x))

Is there anything better that we can do?

For an expression this simple, let-values might be the best choice. For more complex expressions, here are a few other ways I might do it in Racket without going bananas:

(call-with-values (λ () (f foo bar)) (compose g car list))

(g ((compose car list (λ () (f foo bar)))))

Depending on the context, either of these might be "better" for some definition of "good."

Algebraic Racket makes this sort of multi-valued wizardry mundane, so I'd usually try a few and pick the simplest one. Directly translating the last two examples, I get:

(values-> (.. g car list) (f foo bar))

(g ((.. car list (>> f foo bar))))

And if we include the values monad, I have a few more options:

(fmap (.. g car list) (λ () (f foo bar)))

(>>= (λ () (f foo bar)) (.. g car list))

(lazy-do (x . _) <- (f foo bar)
         (thunk<- (g x)))
rocketnia commented 5 years ago

I'm glad you chimed in, @dedbox! It's good to hear from someone seriously using multiple-value return for point-free composition.

Based on your refactorings of this example, it looks like it could even be written ((compose g car list f) foo bar).

That said, in a language without multiple-value return, the design of f would be different, this example might simplify to (g (car (f foo bar))) (or even to use something with a more meaningful name than car), and I imagine you would still find it possible to use an arbitrary Racket procedure f in a point-free style by wrapping it up as (>> $ f)/(curry apply f). Are you thinking this wouldn't be as convenient as the techniques you're using now?

sorawee commented 5 years ago

@dedbox

I concur with @rocketnia. In particular, every time you use list, you effectively switch from multiple values world into single value world, precisely because it's difficult to manipulate multiple values, and at that point, there's really no benefit in using values for performance.

Here's one way to translate code with values feature to one without it. Redefine values with list. Split call-with-values into two versions:

  1. (call-with-values-1 gen recv): operate on a function that returns "bare" value. Equivalent to (recv (gen)).

  2. (call-with-values* gen recv): operate on a function that returns multiple values (that is, list). Equivalent to (apply recv (gen)).

(also, if we wish, gen doesn't need to be thunk-ed anymore)

To lift from "bare" value to multiple values, simply apply values. To lift a function that returns a bare value to a function that returns multiple values, apply lift: (define (lift f) (lambda xs (values (apply f xs)))).

Assuming that compose works on functions that return multiple values. Here's one possible implementation: (define (compose f g) (lambda xs (apply f (apply g xs)))).

compose1 stays the same: (define (compose1 f g) (lambda (x) (f (g x)))).

The only thing that we will lose is the "implicit one value" ability. E.g., we won't be able to write (define (f x) (if x 1 (values 2 3))) anymore. Instead, we need to write (define (f x) (if x (values 1) (values 2 3))).

dedbox commented 5 years ago

@rocketnia,

Based on your refactorings of this example, it looks like it could even be written ((compose g car list f) foo bar).

Nice. It works!

That said, in a language without multiple-value return, the design of f would be different, this example might simplify to (g (car (f foo bar))) (or even to use something with a more meaningful name than car)

I don't follow. How does Racket know (f foo bar) is equivalent to (apply f (list foo bar)), and (g (car (f foo bar))) is equivalent to (apply g (list (car (f foo bar)))), but (car (f foo bar)) is equivalent to (apply car (f foo bar)) and not (apply car (list (f foo bar)))? What makes car special?

and I imagine you would still find it possible to use an arbitrary Racket procedure f in a point-free style by wrapping it up as (>> $ f)/(curry apply f).

Not a die-hard fan of point-free style, but I'm happy to use it when it gives good results.

Are you thinking this wouldn't be as convenient as the techniques you're using now?

I am pathologically pragmatic, so my techniques will adapt to whatever reality we find ourselves in. There are always trade-offs, so I'll delay speculating until I have a more concrete understanding of what you're proposing.

@sorawee,

every time you use list, you effectively switch from multiple values world into single value world

No, not really. Not in any meaningful sense.

precisely because it's difficult to manipulate multiple values,

Definitely not. It's usually because I want to map, filter, fold, or apply something across an argument list, as opposed to just one of its elements. It is an artifact of multiple arguments, not multiple return values. Would you feel different if we replaced list with + or void?

And besides, capturing multiple return values is as easy as (λ args ...), wherein I have the full force of Racket's list processing capabilities at my disposal. Generalizing args to a full-blown pattern makes the experience even better.

Here's one way to translate code with values feature to one without it. Redefine values with list. Split call-with-values into two versions:

1. `(call-with-values-1 gen recv)`: operate on a function that returns "bare" value. Equivalent to `(recv (gen))`.

2. `(call-with-values* gen recv)`: operate on a function that returns multiple values (that is, list). Equivalent to `(apply recv (gen))`.

I don't understand where you're going with this. The most prominent feature of multiple return values is that you can largely ignore them when you don't need them. If you're proposing every function have exactly one argument, you can already do that without any hardcore macro-fu. Go make a #lang univariate/racket/base for this perspective and tell me how it goes.

(also, if we wish, gen doesn't need to be thunk-ed anymore)

Not without some implicit notion of laziness, or am I missing something? If gen isn't a thunk, then what is it?

rocketnia commented 5 years ago

@dedbox,

That said, in a language without multiple-value return, the design of f would be different, this example might simplify to (g (car (f foo bar))) (or even to use something with a more meaningful name than car)

I don't follow. [...] What makes car special?

What I'm saying is that in a language without multiple-value return, whoever authored f couldn't have it return multiple values in the first place, so one likely possibility is that they'd have it return a list instead. That way we could just pass the list to car.

If it returned something more meaningful than a list (such as a struct, as @jackfirth is talking about), we could pass it to a more meaningful function than car.

dedbox commented 5 years ago

What I'm saying is that in a language without multiple-value return, whoever authored f couldn't have it return multiple values in the first place, so one likely possibility is that they 'd have it return a list instead. That way we could just pass the list to car.

But this doesn't eliminate the burden of discerning multiple arguments from a lone argument that happens to be a list. It merely shifts the burden away from the platform and toward the programmer in the form of ad hoc calling conventions and extra ``glue'' syntax.

One way to build a reference interpreter is to start with a single-argument lambda calculus with pairs, then use it to bootstrap into multiple arguments and more structured data. The tutorial series included with Algebraic Racket documents this technique. It even covers a syntax extension that exploits the ambiguity between list arguments and argument lists to provide first class pattern-based macros with the bare minimum parentheses.

All that is to say I have designed and implemented, on top of Racket, many languages with the sort of syntax you're proposing, and it isn't all roses. The devil is always in the details.

If it returned something more meaningful than a list (such as a struct, as @jackfirth is talking about), we could pass it to a more meaningful function than car.

While I'm generally sympathetic to Jack's desire to name all the things, I'm not buying the argument that structs with named fields are inherently more meaningful than lists. I mean, you can already do this today—with calling conventions and extra ``glue'' syntax. There's nothing stopping you from implementing a whole parallel universe where every function accepts and returns, say, a single hash. I studied a spectrum of variations on this theme throughout my CS Masters program. In practice, I think you'll find not everything needs a name, and too many names can be just as disorienting as too few.

sorawee commented 5 years ago

Here's an optimization in Haskell:

If a function returns a tuple of values, and the caller immediately takes the tuple apart again, GHC will attempt to eliminate the tuple completely at the machine code level. Actually, this works for all types having exactly 1 constructor.

-- https://wiki.haskell.org/GHC_optimisations

tgbugs commented 5 years ago

General principals question.

If the underlying runtime supports multiple values, or any language feature for that matter, should a language built on top of it have a construct that allows one to make use of that functionality?

If the answer to this question is no, then don't we still have to require that all compliant runtimes support that feature anyway due to of the requirement for interoperatilbity with existing Racket?

jackfirth commented 5 years ago

If the underlying runtime supports multiple values, or any language feature for that matter, should a language built on top of it have a construct that allows one to make use of that functionality?

I don't think so, no. A language may do so, but I don't see any reason a language should or must.

If the answer to this question is no, then don't we still have to require that all compliant runtimes support that feature anyway due to of the requirement for interoperatilbity with existing Racket?

Yes. But for the most part, Racket2 RFCs aren't about the runtime. They're about the surface APIs. At least, that's been my assumption so far.

mflatt commented 4 months ago

527