carp-lang / Carp

A statically typed lisp, without a GC, for real-time applications.
Apache License 2.0
5.53k stars 178 forks source link

RFC: Allow variadic functions #1072

Open hellerve opened 3 years ago

hellerve commented 3 years ago

Lasciate ogne speranza, voi ch'intrate. Or: this RFC is pretty big, sorry about that.

What?

in which I try to explain what I’d like.

Many programming languages have some version of variadic functions, or functions with multiple arities. Often these do not play well with typing, but that is not necessarily the case. I would like a simple version of variadic functions: Multiple arities, each with their own body. It could looke like this, if we’d adopt it for defn proper:

(defn add
  [a] (inc a)
  [a b] (+ a b)
  [a b c] (+ (+ a b) c))

If we’d like to add a special syntactic construct, I would propose defnmulti, because it walks and quaks kind of like a multimethod:

(defnmulti add
  [a] (inc a)
  [a b] (+ a b)
  [a b c] (+ (+ a b) c))

I would advise against a special symbol, however, since it would be entirely backwards compatible with defn—having a single argument count and body would just be a special case: a multimethod with one version. This would avoid duplicate implementations etc.

For simplicity reasons, I would not enforce any constraints for overlapping types of multifunctions (i.e. the a argument having to have the same type in every version above), but I think it would be best practice to have all of them have the same types if possible.

This PR explicitly excludes things like rest arguments, keyword arguments, or optional arguments. These are related, but an entirely different beast, especially implementation-wise.

Henceforth I shall call this construct multifunctions, since they’re functional multimethods, kindasorta.

Why?

in which I explain why I’d like it.

Multifunctions are useful for functions that make sense with multiple numbers of arguments—see add above. They are also useful in contexts where there are default values for some arguments:

(defn get-or
  [m] (get-or m (zero))
  [m default] (get m default))

We see patterns like these crop up in the Map module, for instance, which I am alluding to above—though the internals of Map are slightly different.

How?

in which I explain how to accomplish what I’d like.

Implementing multifunctions is a matter of refactoring the existing Defn object type to allow multiple argument counts and bodies. Since the argument count is always unique, both unifying and emitting against the correct binder should just be a matter of looking up how many arguments are passed at the call site.

While the implementation seems more or less straightforward, it will have to touch a lot of parts of the compiler, and it will be a fairly big programming effort. As of now, I do not foresee any roadblocks ahead, though.


Any questions, comments, and any sort of feedback is welcome! Thanks for staying with me this entire time!

jacereda commented 3 years ago

Is that https://en.wikipedia.org/wiki/Ad_hoc_polymorphism ?

jacereda commented 3 years ago

What about something like:

(defnmulti add
  [a] (inc a))
(defnmulti add
  [a b] (+ a b))
(defnmulti add
  [a b c] (+ (+ a b) c))
hellerve commented 3 years ago

Is that https://en.wikipedia.org/wiki/Ad_hoc_polymorphism ?

Not quite. Ad-hoc polymorphism is closer to what we do with interfaces, since multifunctions aren’t really overloaded in the sense of polymorphism, if that makes sense.

hellerve commented 3 years ago

What about something like:

(defnmulti add
  [a] (inc a))
(defnmulti add
  [a b] (+ a b))
(defnmulti add
  [a b c] (+ (+ a b) c))

That would work as well, but this ties back into what I said above:

I would advise against a special symbol, however, since it would be entirely backwards compatible with defn—having a single argument count and body would just be a special case: a multimethod with one version. This would avoid duplicate implementations etc.

TimDeve commented 3 years ago

Would the multiple declarations make it easier to implements defaults by referring to itself?

(defnmulti get [url params]
  (curl url "GET" params))
(defnmulti get [url]
  (get url @{}))
eriksvedang commented 3 years ago

I think I like the idea...

It does discard any possibility of auto-currying our functions, right? (not sure we'd want that, but still...)

In any case I'd prefer if we don't implement this until a bunch of other things are more solid, since we already have quite a lot of bugs in regards to function calling / name resolution.

eriksvedang commented 3 years ago

(hopefully that wouldn't be too far in the future..!)

hellerve commented 3 years ago

Would the multiple declarations make it easier to implements defaults by referring to itself?

(defnmulti get [url params]
  (curl url "GET" params))
(defnmulti get [url]
  (get url @{}))

Yes, this is in fact very similar to the example I set in Why with get-or.

eriksvedang commented 3 years ago

Would the multiple declarations make it easier to implements defaults by referring to itself?

(defnmulti get [url params]
  (curl url "GET" params))
(defnmulti get [url]
  (get url @{}))

Yes, this is in fact very similar to the example I set in Why with get-or.

It's neat. Would it make sense to have default values too, or is that just another way to do the same thing?

jacereda commented 3 years ago

But wouldn't it be better to have multiple declarations, so you can augment behaviour in a separate file?

hellerve commented 3 years ago

It does discard any possibility of auto-currying our functions, right? (not sure we'd want that, but still...)

I think it would not necessarily, since auto-currying would require us to keep track of how many arguments have been applied to the function anyway, and we’d have to statically ensure that they are called correctly. I think the only thing that would change there would be from "make sure that this function receives 4 arguments when all is said and done" to "make sure that this function receives between 2 and 5 arguments when all is said and done".

In any case I'd prefer if we don't implement this until a bunch of other things are more solid, since we already have quite a lot of bugs in regards to function calling / name resolution.

Yes, I agree. This is also strictly an enhancement, so the timeline is very flexible.

hellerve commented 3 years ago

But wouldn't it be better to have multiple declarations, so you can augment behaviour in a separate file?

That’s an argument for a separate form indeed. Which of the two is more desirable is a matter of preference, I suppose, but in my mind making the system extensible wouldn’t necessarily make it more useful. It would, however, make the system behave more like interfaces, which might be desirable in its own right.

In the end, I just want a congruent system with as few primitives as possible.

hellerve commented 3 years ago

Alternatively, we could also use metainformation on symbols to take care of how we construct binders, which would take care of all of this. Then all of defn and defnmulti could be reduced to meta-set! calls.

We could use a dispatch property to save all of the possibilities as pairs of args and body, for instance:

; this is simplified pseudocode!
(defmacro defn [name args body]
  (meta-set! name "dispatch" (list args body)))

(defmacro defnmulti [name args body]
  (meta-set! name "dispatch" (cons (list args body) (meta name "dispatch"))))

This would also allow us to extend the system as we go without having to modify the compiler.

jacereda commented 3 years ago

This would also allow us to extend the system as we go without having to modify the compiler.

That's a direction worth exploring, applying a "provide mechanisms, not policy" approach in the haskell side.

scolsen commented 3 years ago

I’m trying to think of potential problems with this but I don’t think there are any! From a type systems perspective this shouldn’t change anything (as we’re still defining unique types, they can just be referenced by the same name)

I think the only real downside to this is that it will complicate lookups—we’ll now need to check the arglist length + dispatch meta key when resolving names.

We’ll also need to rework interface implementation code to use whichever variant best matches the implemented interface’s type.

sdilts commented 3 years ago

What about having default arguments instead? How about something like this?

(defn get-or [m (default (zero)]
   (get m default))

IMO, this would make things clearer, as you wouldn't have multiple definitions of the same functions running around.

I think this would be pretty easy to implement with the meta-set method of defining functions and a macro.

hellerve commented 3 years ago

I think that’s also a good idea; it seems like it would be strictly less powerful and a subset of functionality (since you can implement default arguments with variadic functions, but not vice versa). Is that correct?

sdilts commented 3 years ago

Are we talking about having functions that can have optional arguments (like &optional), or something equivalent to &rest args (and C's variadic functions)? Or is it something closer to function overloading? That sounds correct, but I'm confused on the terminology.

TimDeve commented 3 years ago

This PR is closer to overloading to me.

hellerve commented 3 years ago

@sdilts the closest terminology match I know is that of multi-arity functions in Clojure. It is less powerful than overloading, since it strictly dispatches on the number of arguments. The other half of the puzzle then are interfaces, which gives us multiple dispatch.

Efimero commented 3 years ago

From a user perspective, since Carp relies so heavily on enforcing type safeties, I feel like it would be reasonable to have a syntax that ensures you can trace back to the correct place. I think something like this would make the most sense to me.

(defn myfnA1 [x] (inc x))

(defn myfnA2 [x y] (+ x y))

(defn myfnA3 [x y z] (+ (+ x y) z))

(defmulti myfnM
  [myfnA1
   myfnA2
   myfnA3])

And have it error when the argument types don't match. Then it could also act like defmodule and allow for (defmulti myfnM [myfnA4]) later, just adding another dispatch. I think this would be the most ergonomic way, but borrowing the Clojure syntax (defn name ([x] ...) ([x y] ...) ([x y z] ...)) could work too. That's all. 🌹

hellerve commented 3 years ago

Hey @Efimero I generally like this idea; it precludes what @jacereda proposes, however, since it’s not extensible (whoever calls defmultireigns supreme). Is that correct?

Efimero commented 3 years ago

I am not sure how it would work, but it should be extensible as long as it doesn't add a conflict, the same way defmodule works.

hellerve commented 3 years ago

Mhm, so would it be possible in this case to add a branch after defmulti has been called? If so, how would that look?