JuliaLang / julia

The Julia Programming Language
https://julialang.org/
MIT License

Custom infix operators #16985

Open jamesonquinn opened 8 years ago

jamesonquinn commented 8 years ago

There is a discussion at https://groups.google.com/forum/#!topic/julia-dev/FmvQ3Fj0hHs about creating a syntax for custom infix operators.

...

Edited to add note: @johnmyleswhite has pointed out that the comment thread below is an invitation to bikeshedding. Please refrain from new comments unless you have something truly new to add. There are several proposals below, marked by "hooray" emoticons (exploding cone). You can use those icons to skip discussion and just read the proposals, or to find the different proposals so you can vote "thumbs up" or "thumbs down".

Up/downvotes on this bug as a whole are about whether you think that Julia should have any custom infix idiom. Up/downvotes for the specific idea below should go on @Glen-O's first comment. (The bug had 3 downvotes and 1 upvote before that was clarified.)

...

Initial proposal (historical interest only):

The proposal that seems to have won out is:

    a |>op<| b #evaluates (in the short term) and parses (in the long term) to `op(a,b)`

In order to have this work, there are only minor changes necessary:

Optional:

Long term:

(relates to https://github.com/JuliaLang/julia/issues/6946)

diegozea commented 8 years ago

...or maybe an Infix.jl package with macros and nonstandard string literals.

StefanKarpinski commented 8 years ago

We have definitely reached the "working code or GTFO" point in this conversation.

jamesonquinn commented 8 years ago

OK, here's working code then: https://github.com/jamesonquinn/JuliaParser.jl

ETA: Should I reference a specific commit, or is the above link to the latest master OK?

...

(That does not have any of the convenience macros I'd expect you'd want, such as the equivalents for |>, <|, ~, and the @defineinfix from my example above. Nor does it remove or deprecate the now-useless special-case logic for ~ or the |> operator. It's just the parser changes to get it working. I've tested basic functionality but not all corner cases.)

...

I think that the current ugly hack with ~ shows that there's a clear use case for this kind of thing. Using this patch, you'd say @~ when you needed macro behavior; much cleaner, with no special case. Or does anyone seriously believe that ~ is utterly unique and nobody will ever want to do that again?

Note that the patch (it's not a PR yet because it targets the native bootstrapped parser, but for now the scheme one should come first in terms of PRs) is more generally useful than the issue name here. The issue name is "custom infix operators"; the patch gives infix macros, with infix operators only coming as a side effect of that.

The patch as it stands is not a breaking change, but I expect that if this became the plan the next step would be to deprecate the currently-existing ~ and |>, which would eventually lead to breaking changes.

...

Some simple tests added.

tkelman commented 8 years ago

#11608 was closed with a pretty clear consensus that many of us do not want infix macros and the one current case of ~ parsing was a mistake (made early on for R compatibility and no other especially good reason). We intend to deprecate and eventually get rid of it, just haven't done it (along with the work of modifying the API for the formula interface in JuliaStats packages) yet.

Macros are now technically generic, but their input arguments are always Expr, Symbol, or literals. So they aren't really extensible to new types defined in packages the way functions (infix or otherwise) are. Possible use cases for infix macros are better served by prefix-annotated macro DSL's or string literals.
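To make the point about macro arguments concrete, here is a minimal sketch that is not from the thread. The macro name `@argtypes` and the `Meters` type are hypothetical, chosen only for illustration. It shows that a macro receives syntax (a `Symbol`, an `Expr`, or a literal) and never sees the runtime type of a value, so it cannot dispatch on a package-defined type the way a function can.

```julia
# Hypothetical helper macro: report the type of each argument
# exactly as the macro itself receives it.
macro argtypes(args...)
    # This body runs at macro-expansion time, so `args` holds syntax, not values.
    names = [string(typeof(a)) for a in args]
    return Expr(:tuple, names...)   # expands to a tuple of string literals
end

# A stand-in for a type defined in some package (illustration only).
struct Meters
    value::Float64
end

d = Meters(3.0)

# `d` arrives as a Symbol and `d.value + 1` as an Expr; only literals keep their type.
types = @argtypes(d, d.value + 1, 2.5, "text")
println(types)   # ("Symbol", "Expr", "Float64", "String")
```

Nothing in the output mentions `Meters`: the macro only ever saw the name `d`.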

jamesonquinn commented 8 years ago

(Sorry I posted prematurely; fixed now.)

In #11608, I see several negative arguments:

===

What would the following transform into? ... y = 0.0 @in@ x == 1.0 ? 1 @in@ 2 : 3 @in@ 4

This was dealt with in the thread:

Cases like that are why I always use parenthesis...

and

same precedent ... apply without being macros: 0.0 in 1 == 1.0 ? 2 in 2 : 3 in 4

===

more functionality to Julia that people have to implement, maintain, test, learn to use, etc.

which is (partially) answered (and seconded) here by:

"headaches for parser developers" is the lowest possible concern.

===

is there no way for 2 packages to simultaneously have definitions for the same macro-operator that could be used together unambiguously in a single user code base?

This is an interesting point. Obviously, if the macro just calls a function, then we have all the dispatch power of the function. But if it is a true macro, as with ~, then it's more complicated. Yes, you could imagine hackish workarounds, like attempting to call it as a function, and catching any errors to use it as a macro... but that kind of ugliness should not be encouraged.

Still, this is just as much of an issue for any macro. If two packages both export a macro, you simply can't have both with "using".

Is this likely to be more of a problem with infix macros? Well, it depends what people end up using them for:

macro ~(a,b) :(~(:$a, quote($b))) end

Then, the function ~ could dispatch based on the type of the LHS, but the RHS would always be an Expr. This kind of thing would allow the principal uses it has in R (regression and graphing) to coexist, that is, to dispatch correctly despite coming from different packages.

(note: the above has been edited. Initially, I thought that an R expression like a ~ b + c used the binding of b and c through R's lazy evaluation. But it doesn't; b and c are the names of columns in a data frame passed explicitly, not names of variables in local scope that are thus passed in implicitly.)
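A rough sketch of that idea, written as an ordinary prefix macro because the infix form does not exist in the language. Every name here (`@tilde`, `tilde`, `RegressionFrame`, `PlotFrame`) is hypothetical and stands in for what two separate packages might define. The macro evaluates the left operand normally and passes the right operand as unevaluated syntax, so the function it calls can still dispatch on the type of the left-hand side.

```julia
# Hypothetical stand-ins for types owned by two different packages.
struct RegressionFrame end
struct PlotFrame end

# One generic function; each package adds a method for its own left-hand type.
tilde(lhs::RegressionFrame, rhs) = "fit model with terms: $rhs"
tilde(lhs::PlotFrame, rhs) = "draw plot of: $rhs"

# Prefix stand-in for the proposed infix macro: evaluate the left operand
# normally, but hand the right operand over as quoted syntax.
macro tilde(lhs, rhs)
    return :(tilde($(esc(lhs)), $(QuoteNode(rhs))))
end

fit_msg = @tilde(RegressionFrame(), b + c)
plot_msg = @tilde(PlotFrame(), b + c)

println(fit_msg)    # fit model with terms: b + c
println(plot_msg)   # draw plot of: b + c
```

Note that `b` and `c` are never defined; they reach `tilde` only as part of the expression `b + c`, which matches the column-name usage described in the note above.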

===

The only way forward here would be to develop an actual implementation.

Which I have done.

===

Macros are now technically generic, but their input arguments are always Expr, Symbol, or literals. So they aren't really extensible to new types defined in packages the way functions (infix or otherwise) are.

This relates to the point above. Insofar as an infix macro calls a specific function, that function is still extensible through dispatch in the normal way. Insofar as it doesn't call a specific function, it is doing something structural/syntactic (such as what |> does now) that should not be extended or redefined. Note that even if it calls a function, the fact that it is a macro can still be useful; for instance, it can quote some of its arguments, or process them into callbacks, or even interact simultaneously with the name and the binding of a variable, in a way that a direct function call cannot.

===

Possible use cases for infix macros are better served by prefix-annotated macro DSL's or string literals.

As was pointed out in the referenced thread:

[Infix is] easier to parse (for English and most western speakers), because our language works that way. (The same thing generally holds for operators.)

For example, which is more readable (and writeable):

select((:emp_id, :last_name) @from employee_tbl @where city == 'NYC' @orderby :emp_id)

or

send(orderby((@where selectfrom((:emp_id, :last_name), employee_tbl) city == 'NYC'), :emp_id))

?

===

Finally:

#11608 was closed with a pretty clear consensus

Looks pretty evenly split to me, with "who's gonna do the work" casting the deciding vote. Which is now at least partly moot; I've done the work in JuliaParser and I'd be willing to do it in Scheme if people like this idea.

jamesonquinn commented 8 years ago

This is my last post in this thread, unless there's positive reaction to my hacked juliaparser. It is not my intention to impose my will; just to present my point of view.

I'm arguing in favor of infix macros (a @m b=>@m a b). That doesn't mean I'm not aware of the arguments against. Here's how I'd summarize the best argument against:

Language features start at -100. What do infix macros offer that could possibly overcome that? By their very nature, there is nothing you could accomplish with infix macros that couldn't be accomplished with prefix macros.

My response is: Julia is first of all a language for STEM programmers. Mathematicians, engineers, statisticians, physicists, biologists, machine learning people, chemists, econometricians... And one thing that I think most of those people realize is the usefulness of a good notation. To take an example I'm familiar with in statistics: adding independent random variables is equivalent to convolving PDFs, or even to convolving derivatives of CDFs, but often expressing something using the former can be an order of magnitude more concise and understandable than the latter.

Infix versus prefix versus postfix is, to some degree, a matter of taste. But there are also objective reasons to prefer infix in many cases. Whereas prefix and postfix lead to indigestible precipitates of back-to-back operators like the ones that make Forth programmers sound like German politicians, or the ones that make Lisp programmers sound like a Chomskian caricature, infix puts the operators in what's often the cognitively most natural place, as near to all their operands as possible. There's a reason nobody writes math papers in Forth, and why even German mathematicians use infix operators when writing equations.

Yes, infix macros could be used to write obfuscated code. But existing prefix macros are just as prone to abuse. If not abused, infix macros can lead to much clearer code.

I realize that these are just toy examples but I think the principle is valid.

Could the above be done with nonstandard string literals? Well, the second and third examples would work as NSLs. But the problem with NSLs is that they give you too much freedom: unless you're familiar with the particular grammar, there's no way to be sure even what the tokens of an NSL are, let alone its order of operations. With infix macros, you have enough freedom to do all of the above examples, but not so much that it isn't clear on reading the "good" code what the tokens are and where the implied parentheses go.
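For comparison, a nonstandard string literal that imitates an infix call might look like the sketch below. The macro name `infix_str` and its whitespace-splitting rule are assumptions made for this example, not something proposed in the thread. The tokenization rule lives entirely inside the macro, which illustrates the concern above: a reader cannot tell from the call site how the string will be split.

```julia
# Hypothetical string macro: infix"a op b" expands to the call op(a, b).
macro infix_str(s)
    tokens = split(s)                  # tokenization rule chosen by this macro alone
    length(tokens) == 3 || error("expected the form: a op b")
    a, op, b = Meta.parse.(tokens)     # turn each token into Julia syntax
    return esc(Expr(:call, op, a, b))  # resolve names in the caller's scope
end

add(a, b) = a + b
x = 10

result = infix"x add 5"
println(result)   # 15
```

Anything more elaborate, such as `infix"x add (y add 5)"`, would fail the three-token check, and nothing at the call site reveals that.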

StirlingNewberry commented 8 years ago

Then it needs certain things to be moved from unknown unknowns to known unknowns. And unfortunately, there is not a mechanism to do this. Your arguments need a structure which does not exist.

stevengj commented 6 years ago

Now that <| is right-associative (#24153), does the initial a |>op<| b proposal work?

Ismael-VC commented 6 years ago

I have made a package for the hack mentioned by Steven in https://github.com/JuliaLang/julia/pull/24404#issuecomment-341570934:

cscherrer commented 6 years ago

I'm not sure how many potential infix operators this affects, but I'd really like to use <~. The parser won't cooperate -- even if I space things carefully, it wants a <~ b to mean a < (~b).

<- has a similar problem.

Sorry if this is already covered by this or another issue, but I couldn't find it.

JeffBezanson commented 6 years ago

We could potentially require spaces in a < ~b; we've added rules like that before. Then we could add <- and <~ as infix operators.
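As a small aid for checking how a given spelling is grouped, `Meta.parse` can be used to inspect the result. The sketch below only covers the explicitly spaced form `a < ~b`, which is the reading that would stay valid under the rule suggested above; it makes no claim about how `a <~ b` is handled in any particular Julia version.

```julia
# Parse the explicitly spaced form and compare it with the fully
# parenthesized version of the same expression.
spaced = Meta.parse("a < ~b")
explicit = :(a < (~b))

same_tree = spaced == explicit
println(same_tree)        # true
println(spaced.args[1])   # <   (the outer call is the comparison)

# With integers, `~` is bitwise NOT, so ~2 == -3 and the comparison is false.
a, b = 1, 2
value = a < ~b
println(value)            # false
```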

cscherrer commented 6 years ago

Thanks @JeffBezanson, that would be great! Would this be a special case, or a more general rule? I'm sure there are some details in what the rule should be to allow more infix operators, give clear and predictable code, and break as little existing code as possible. Anyway, I appreciate the help and the quick response. Happy new year!

Liso77 commented 6 years ago

If a <~ b is going to be parsed differently from a < ~b, I would like a =+ 1 to be an error (or at least a warning).

Glen-O commented 4 years ago

I know this is quite an old discussion, and the question asked was asked quite some time ago, but I thought it was worth answering:

Now that <| is right-associative (#24153), does the initial a |>op<| b proposal work?

No, unfortunately, |> still gets the precedence. The update done makes it so that, if you define <|(a,b)=a(b), then you can successfully do a<|b<|c to obtain a(b(c))... but this is a different concept.
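A short sketch of both halves of that answer, assuming Julia 1.x, where `<|` parses as an operator and the definition below is the one quoted above. The curried function `add` is hypothetical and is included only to show what the `|>`-first grouping actually computes.

```julia
# Define `<|` as function application, as in the comment above.
<|(f, x) = f(x)

# Right associativity: this groups as sqrt <| (abs <| x), i.e. sqrt(abs(x)).
x = -4.0
chained = sqrt <| abs <| x
println(chained)   # 2.0

# `|>` binds tighter than `<|`, so the proposal groups as (a |> op) <| b.
ex = Meta.parse("a |> op <| b")
grouped_left = ex == :((a |> op) <| b)
println(grouped_left)   # true

# That grouping computes op(a)(b), not op(a, b), so it only gives the
# intended result when `op` is curried (hypothetical example).
add = a -> (b -> a + b)
curried = 1 |> add <| 2
println(curried)   # 3
```

With an ordinary two-argument `op`, the same expression would first call `op(a)` and fail with a `MethodError`, which is why the original spelling does not yield `op(a, b)`.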

o314 commented 4 years ago

Frozen for 2 years, then a comment and a commit 2 and 5 days ago!

See Document customizable binary operators f45b6be