disco-lang / disco

Functional teaching language for use in a discrete mathematics course
Other
163 stars 23 forks source link

User-defined infix operators #136

Open byorgey opened 6 years ago

byorgey commented 6 years ago

It would be really cool to allow users to define infix operators (especially in terms of moving stuff into the standard library). While we're at it we might allow postfix operators too. There are a number of tricky things to decide here but in the end it should be doable. Here's my current thinking.

First, we need some syntax to denote when a name is supposed to be an infix (postfix) operator, and so it can be used independently (i.e. so it can be partially applied, or so one can ask for its type using :type, etc.). The best idea I have currently is to do what Agda does and use underscores for this purpose. That is, disallow underscores in names, and use something like _foo_ to denote the name of an infix operator foo. The idea is that the underscores stand in place of the arguments.

Next, there needs to be a way to declare the associativity and precedence level of an infix operator. Haskell allows such "fixity" declarations to occur anywhere in a file, but this makes parsing tricky. To make this work Haskell requires infix operators and prefix functions to be syntactically distinguished (operators use only operator symbols, prefix functions use alphanumerics). That way, a module can be parsed using a "default" fixity for any unrecognized operators, and then once all the fixity declarations have been collected, a second pass goes back through and fixes up any expressions to reflect the true fixities.

However, we have already gone down the path of allowing alphanumeric names as infix operators (e.g. mod, choose) and I think that fits best with the design philosophy of the language. So allowing fixity declarations throughout a module is not really an option (unless we disallowed infix operators from being used at all in the same module where they are defined, but this seems terrible). I think a good compromise is to require all fixity declarations for a module to come first (after any imports but before any other declarations). Then after parsing them we can dynamically construct an expression parser and use it to parse the rest of the module.

What should a fixity declaration look like? Haskell has infix, infixl, or infixr to specify no, left, or right associativity, followed by a number from 0-9 to indicate precedence level. However, we already have a lot more than 9 precedence levels, and they aren't numbered in any particular way (the Syntax.Operators module assigns them numbers arbitrarily based on their order in the table, just to be able to compare levels easily). I propose a way to specify associativity, just like Haskell, and then a way to specify another operator for comparison and whether the new operator's precedence is lower, higher, or equal to the other operator. Then it will get inserted in the operator table appropriately.

For example, suppose we want to define an operator *! which performs matrix multiplication, and we want it to be left associative and have the same precedence level as normal multiplication. We might write something like

infix _*!_ same precedence as _*_, left associative

or maybe we want a more concise syntax, but that's the basic idea.

We probably want to refactor things to get rid of TBin, and simply have infix operator applications represented with TApp nodes. Built-in operators will be represented by prims (#119) and user-defined operators just as normal names. Pretty-printing will get a bit more complicated since we have to look a little deeper to see if we have an infix name applied to two arguments, but fundamentally it's not that hard. The benefit of getting rid of TBin is to make things more uniform: now infix operators can be treated as first-class functions in their own right, e.g. you could ask for the type of an operator, or store it in a list of functions, or pass it directly to a higher-order function (think of zipWith _+_ xs ys instead of zipWith (\x y. x + y) xs ys). A similar story goes for TUn.

byorgey commented 3 years ago

Latest thoughts on syntax: instead of having different keywords infix, prefix, etc., let's just have one keyword operator, with fixity specified by ~ characters. And instead of same precedence as etc. , we can just use <, =, or >.

operator ~!*~ = ~*~, left

The left or right part is optional.

byorgey commented 2 years ago

We also need to be able to parse operator definitions. Or perhaps we just require that an operator has to be declared to be a synonym for some named function. That would be much simpler. So maybe something like this:

operator ~++~ = concat, precedence of ~+~, right associative

This defines ~++~ to be a synonym for concat, and defines its precedence and associativity.

byorgey commented 2 years ago

See the user-ops branch: https://github.com/disco-lang/disco/tree/user-ops . Next steps:

byorgey commented 1 month ago

Note, it's been quite a while since I hacked on this, so I took a look at what would be required to merge main into the user-ops branch --- and it looks like it would be relatively straightforward.