Custom infix operators - Githubissues

jamesonquinn commented 8 years ago

There is a discussion at https://groups.google.com/forum/#!topic/julia-dev/FmvQ3Fj0hHs about creating a syntax for custom infix operators.

...

Edited to add note: @johnmyleswhite has pointed out that the comment thread below is an invitation to bikeshedding. Please refrain from new comments unless you have something truly new to add. There are several proposals below, marked by "hooray" emoticons (exploding cone). You can use those icons to skip discussion and just read the proposals, or to find the different proposals so you can vote "thumbs up" or "thumbs down".

Up/downvotes on this bug as a whole are about whether you think that Julia should have any custom infix idiom. Up/downvotes for the specific idea below should go on @Glen-O's first comment. (The bug had 3 downvotes and 1 upvote before that was clarified.)

...

Initial proposal (historical interest only):

The proposal that seems to have won out is:

    a |>op<| b #evaluates (in the short term) and parses (in the long term) to `op(a,b)`

In order to have this work, there are only minor changes necessary:

Put the precedence of <| above that of |>, instead of being the same.
Make <| group left-to-right.
Make the function <|(a,b...)=(i...)->a(i...,b...). (as pointed out in the discussion thread, this would have standalone uses, as well as its use in the above idiom)
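A minimal sketch of the idiom using only functions, assuming a hypothetical helper named rcurry (not part of Base) in place of the proposed <| definition, so that no precedence changes are needed. Parentheses-free chaining as in a |>op<| b would still depend on the precedence rules listed above.

```julia
# Hypothetical helper (an assumption, not in Base): fixes the trailing
# arguments of f, mirroring the proposed <|(a,b...) = (i...)->a(i...,b...).
rcurry(f, b...) = (i...) -> f(i..., b...)

# Standalone use: a function that subtracts 1 from its argument.
sub1 = rcurry(-, 1)

# Combined with Base's existing |>, which applies the function on its right,
# this evaluates op(a, b) with a = 3, op = -, b = 1.
result = 3 |> rcurry(-, 1)

println(sub1(10))
println(result)
```

The extra closure is the run-time overhead that the "long term" parser change is meant to remove.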

Optional:

create new functions >|(a...,b)=(i...)->b(a...,i...) and |<(a,b...)=a(b...) with appropriate precedences and grouping.
- Pipe first means evaluation, and pipe last maintains it as a function, while the > and < indicate which one is the function.
create new functions >>|(a...,b)=(i...)->b(i...,a...) and <<|(a,b...)=(i...)->a(b...,i...) with appropriate precedence and grouping.
create synonyms », ⁍, and(/or) pipe for |>; «, ⁌, and(/or) rcurry for <|; and(/or) lcurry for <<|; with the single-character synonyms working as infix operators.
create an @infix macro in base which does the first parser fix below.

Long term:

teach the parser to change a |>op<| b to op(a,b), so there's no extra overhead involved when running the code, and so that operators can actually be defined in infix position. (This is similar to how the parser currently treats the binary a:b and the ternary a:b:c differently. For maximum customizability, it should do this for matched synonyms, but not for unmatched synonyms, so that e.g. a |> b « c would still be treated as two binary operators.)
teach the parser to understand commas and/or spaces so that the ellipses in the above definitions work as expected without extra parentheses.

(relates to https://github.com/JuliaLang/julia/issues/6946)

johnmyleswhite commented 8 years ago

Echoing the julia-dev thread, I think it would be useful to quote Stefan's main comment on this proposal:

Just to set expectations here, I don't think there's going to be much in the way of "syntactic innovation" before Julia 1.0. (The only exception I can think of is the new f.(v) vectorized calling syntax.) While having some way of making arbitrary functions behave as infix operators might be nice, it's just not a pressing issue in the language.

As someone who's participated in a good proportion of the history of Julia development, I think it would be better to focus energy on semantic changes rather than syntactic ones. There are lots of extremely important semantic problems left to solve before Julia reaches 1.0.

Note in particular that implementing this feature isn't simply a one-off diff that only the author needs to think about: everyone will have to think about how their work interacts with this feature going forward, so the change actually increases the long-term workload of every person who works on the parser.

jamesonquinn commented 8 years ago

I think that johnmyleswhite's comments are very apropos regarding the "long term" parser changes suggested. But the "minor changes" and "optional" groups are, as far as I can see, pretty self-contained and low-impact.

That is: the parser changes needed to enable the minimal version of this proposal involve only precedence and grouping for normal binary operators, the kind of changes that are more-or-less routine in other cases. A parser developer working on something unrelated would no more need to keep track of this than they need to keep track of the meaning of all of the numerous already-existing operators.

JeffBezanson commented 8 years ago

Personally I find this syntax quite ugly and difficult to type. But I do agree it would be good to have more general infix syntax.

I think the right way to think about this is as a syntax-only issue: what you want is to use op with infix syntax, so defining other functions and operators to get that is roundabout. In other words it should all be done in the parser.

I would actually consider reclaiming | for this, and using a |op| b. Arguably general infix syntax is more important than bitwise or. (We've talked about reclaiming bitwise operators before; they do seem like a bit of a waste of syntax as it is.)

StefanKarpinski commented 8 years ago

a f b is available outside of array concatenation and macro call syntaxes.

jamesonquinn commented 8 years ago

a f b might work, but it seems pretty fragile. Imagine trying to explain to somebody why a^2 f b^2 f c^2 is legal but a f b c and a+2 f b+2 f c+2 aren't. (I know, that last one assumes that the precedence is prec-times, but no matter what the precedence is, this general kind of thing is a concern).

As to a |op| b: initially I favored a similar proposal, a %op% b, as you can see in the google groups thread. But the nice thing about the proposed |> and <| is that they are each individually useful as binary operators, and they naturally combine to work as desired (given the right precedence and grouping, that is.) This means that you can implement this in the short term using existing parser mechanisms, and thus avoid creating headaches for parser developers in the future, as I said in my response to johnmyleswhite above.

So while I like a |op| b and certainly wouldn't oppose it, I think we should look for a way to have two different operators to simplify the required parser changes. If we're going for maximum typeability and not opposed to having | mean "pipe" rather than "bitwise or", then what about a |op\\ b or a |op& b?

StefanKarpinski commented 8 years ago

"headaches for parser developers" is the lowest possible concern.

JeffBezanson commented 8 years ago

"headaches for parser developers" is the lowest possible concern.

As a parser developer, I unequivocally agree with this.

|> and <| are both perfectly good infix operators, but there is zero benefit to implementing general operator syntax using two other operators. And much more needs to be said on just how verbose and unappealing that syntax is.

jamesonquinn commented 8 years ago

there is zero benefit to implementing general operator syntax using two other operators.

To be clear, the long term vision here is that there would be binary f <| y, binary x |> f, and ternary x |> f <| z, where the first one is just a function but the second two are implemented as transformations in the parser.

The idea that this could be implemented using two ordinary functions |> and <| is just a temporary bridge to that vision.

And much more needs to be said on just how verbose and unappealing that syntax is.

That's a fair point. How about replacing |> and <| with | and &? They make sense both as a pair and individually, although they might be a bit jarring to a bit hockey player.

JeffBezanson commented 8 years ago

Stealing both | and & for this would not be a good allocation of ASCII, and I suspect many would prefer the delimiters to be symmetric.

If people want a x |> f <| y ternary operator for other reasons, that's fine, but I think it should be considered separately. I'm not sure the parser should transform |> to a flipped <|. Other similar operators like < don't work that way. But that's also a separate issue.

jamesonquinn commented 8 years ago

Stealing both | and & for this would not be a good allocation of ASCII, and I suspect many would prefer the delimiters to be symmetric.

OK.

I understand that > and < are hard to type. In terms of symmetry and typability on a standard keyboard, I guess the easiest might be something like &% and %&, but that's seriously ugly, R parallel or no. /| and |/ might be worth considering too.

...

I'm not sure the parser should transform |> to a flipped <|

I think you've misunderstood. a |> b should parse to b(a). (The version without special parsing would be ((x,y)->y(x))(a,b), which evaluates to the same thing, but with more overhead.)

JeffBezanson commented 8 years ago

a |> b should parse to b(a)

Ah, ok, got it.

jamesonquinn commented 8 years ago

I think that we could bikeshed about which characters to use for years. I'd trust @StefanKarpinski (as the most senior person in this conversation so far) to make a ruling, and I'd be fine with that. Even if it's something I've argued against (such as a f b.)

Here's some options to see what appeals:

a |>op<| b (leaving current |> unchanged)
a |{ op }| b (nearby and same shift state on many common keyboards, not too ugly. A bit strange as standalones.)
a \| op |\ b or a /| op |/ b or combinations thereof
a $% op %$ b (relatively typable, R-inspired. But kinda ugly.)
a |% op %| b
a |- op -| b
a |: op :| b
a | op \\ b
a | op ||| b
a op b

JeffBezanson commented 8 years ago

Stefan is not more senior than me.

jamesonquinn commented 8 years ago

Looks as if you just nominated yourself, then, for BDFL powers on this issue! ;)

rfourquet commented 8 years ago

a @op@ b ?

jamesonquinn commented 8 years ago

I guess my vote is to use all 4 of \|, |\, /|, and |/. Down for evaluation, up for currying; bar towards the function. So:

a \| f (or f |/ a) -> f(a)
a /| f (or f |\\ a) -> (b...)->f(a,b...)
f |\ b (or b //| f) -> (a...)->f(a...,b)

and thus:

a \| f |\ b (or a /| f |/ b) -> f(a,b)
a \| f |\ b |\ c (or a /| b /| f |/ c) -> f(a,b,c)

Each of the 4 main operators, except perhaps |/, is useful on its own. The redundancy would certainly be un-Pythonic, but I think that the logical neatness is Julian. And as a practical matter, you can use whichever version of the infix idiom you find easier to type; they are both equally readable, in that once you've learned one you naturally understand both.

Obviously, it would make equal sense if you swapped all slashes, so that up arrows were for evaluation and down for currying.

I'm still waiting for word from On High (and I apologize for my newbie clumsiness in guessing what that meant). But if anybody taller than this bikeshed makes a ruling, for this or any other version with at least two new symbols, I'd be happy to write a short term patch (using functions) and/or a proper one (using transformations).

JeffBezanson commented 8 years ago

We try to avoid having a BDFL to the extent possible :)

Glen-O commented 8 years ago

I just thought I'd note a few quick things.

First, the other benefit (the "standalone uses") of the notation that is being proposed is that <| can be used in other contexts, in a way that improves readability. For example, if you have an array of strings, A, and want to pad all of them on the left to 10, right now, you have to write map(i->lpad(i,10),A). This is relatively difficult to read. With this notation, it becomes map(lpad<|10,A), which I think you'll agree is significantly cleaner.
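For comparison, here is a sketch of the lambda form next to a partial-application form that needs no new syntax. It assumes a Julia version that provides Base.Fix2 (1.0 or later), which fixes the second argument of a two-argument function; that is a stand-in for lpad<|10, not the proposed operator itself.

```julia
A = ["foo", "bar"]

# The form quoted above: an explicit lambda.
with_lambda = map(i -> lpad(i, 10), A)

# Base.Fix2(lpad, 10) behaves like i -> lpad(i, 10).
with_fix2 = map(Base.Fix2(lpad, 10), A)

println(with_lambda)
println(with_lambda == with_fix2)
```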

Second, the idea behind this is to keep the notation consistent. There's already a |> operator, which exists to change the "fix" of a function call from prefix to postfix. This just extends the notation.

Third, the possibility of using direct infix as a f b has a bigger problem. a + b and a * b would end up having to have the same precedence, since + and * are function names, and it would be infeasible for the system to have variable precedence. That, or it would have to treat existing infix operators differently, which could cause confusion.

StefanKarpinski commented 8 years ago

For example, if you have an array of strings, A, and want to pad all of them on the left to 10, right now, you have to write map(i->lpad(i,10),A). This is relatively difficult to read. With this notation, it becomes map(lpad<|10,A), which I think you'll agree is significantly cleaner.

I emphatically do not agree. The proposed syntax is – forgive me – ASCII salad, verging on some of the worst offenses of Perl and APL, without precedent in other languages to give the casual reader a clue of what's happening. The current syntax, while a few characters longer (five?), is pretty clear to anyone who knows that i->expr is a lambda syntax – which it is in a large and growing set of languages.

JeffBezanson commented 8 years ago

a + b and a * b would end up having to have the same precedence, since + and * are function names, and it would be infeasible for the system to have variable precedence. That, or it would have to treat existing infix operators differently, which could cause confusion.

I don't think this is a real problem; we can just say what the precedence of a f b infix is, and keep all existing precedence levels as well. This works because precedence is determined by the name of the function; any function called "+" will have "+" precedence.
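The point that precedence follows the operator's name, not its definition, can be checked by inspecting parsed expressions. This sketch assumes a Julia version with Meta.parse (0.7 or later); the operators ⊗ and ⊕ are used only because the parser accepts them as infix without any definition existing.

```julia
# Parsing needs no definitions: precedence comes from the operator symbol.
ex1 = Meta.parse("a + b * c")
ex2 = Meta.parse("a ⊗ b ⊕ c")

# In both cases the outer call is the plus-like operator, and the
# times-like operator forms the nested call.
println(ex1.args[1], " outer, nested: ", ex1.args[3])
println(ex2.args[1], " outer, nested: ", ex2.args[2])
```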

StefanKarpinski commented 8 years ago

Yes, we already do this for the 1+2 in 1+2 syntax, and it hasn't been a problem.

Glen-O commented 8 years ago

I don't think this is a real problem; we can just say what the precedence of a f b infix is, and keep all existing precedence levels as well. This works because precedence is determined by the name of the function; any function called "+" will have "+" precedence.

I didn't mean it's difficult to write the parser to make it work. I meant it leads to consistency issues, hence me saying "or it would have to treat existing infix operators differently, which could cause confusion". Among other things, consider that ¦ and ∥ don't look all that different in concept, yet one is a predefined infix operator, while the other is not.

I emphatically do not agree. The proposed syntax is – forgive me – ASCII salad, verging on some of the worst offenses of Perl and APL, without precedent in other languages to give the casual reader a clue of what's happening. The current syntax, while a few characters longer (five?), is pretty clear to anyone who knows that i->expr is a lambda syntax – which it is in a large and growing set of languages.

Perhaps I should be clearer on what I'm saying. I'm saying that being able to describe the operation as "lpad by 10" is a lot clearer than i->lpad(i,10) makes it. And in my view, lpad<|10 is the nearest you can get to that, in a non-context-specific form.

Maybe it would help if I describe where I'm coming from. I'm a mathematician and mathematical physicist, first and foremost, and "lambda syntax", while sensible from a programming standpoint, isn't the clearest for those who are less experienced in programming. Julia is, as I understand it, primarily aimed at being a scientific computing language, hence the strong resemblance to MATLAB.

I must ask - how is lpad<|10 any more "ASCII salad" than, say, x|>sin|>exp? Yet the |> notation was added. Compare with, say, bash scripting, where | is used to pass the argument on the left to the command on the right - if you know it's called "pipe", it makes a little more sense, but if you're not skilled in programming, it's not going to make sense. In that regard, |> actually makes more sense, as it looks vaguely like an arrow. And then <| is a natural extension to the notation.

Compare with some of the other suggestions, such as %func%, which does have a precedent in another language, but which is completely opaque for people who don't have extensive knowledge of programming in the language.

Mind you, I looked back a bit at one of the older discussions, and I see that there HAS been a notation used in another language that would be quite nice, in theory. Haskell apparently uses a |> b c d to represent b(a,c,d). If spaces following a function name allowed you to specify "parameters" in this way, it would work nicely - map(lpad 10,A). The only problem arises with the unary operators - map(+ 10,A) would produce an error, for instance, as it would interpret it as "+10" instead of i->+(i,10).

jamesonquinn commented 8 years ago

On a f b: the precedence issues may not be as bad as Glen-O suggested, but unless user-defined infix functions have the very lowest precedence, they do exist. Say, for the sake of argument, we give them prec-times. In that case:

a^2 f b^2 => f(a^2,b^2)
a+2 f b+2 => a+f(2,b)+2
a^2 f^2 b^2 => (f^2)(a^2,b^2)
a f+2 b => syntax error?

This is all a natural consequence of how you'd write the parser, so it's not particularly a headache in that sense. But it's not particularly intuitive for the casual user of the idiom.

On the usefulness of a curry idiom: I agree with Glen-O that (i)->lpad(i,10) is simply worse than lpad<|10 (or, if we so choose, lpad |\ 10, or whatever). The i is an entirely extraneous cognitive burden and potential source of errors; in fact, I swear that when I was typing that just now, I unintentionally typed (i)->lpad(x,10) initially. So, having an infix curry operation seems to me like a good idea. However, if that's the intention, then whatever infix idiom we settle on, we can create our own curry operation. If it's a f b, then something like lpad rcurry 10 would be fine. The point is readability, not keystrokes. So I think this is only a weak argument for <|.

On a |> b c d: I like this proposal a lot. I think that we could make it so that |> accepted spaces on either side, so a b |> f c d => f(a,b,c,d).

(Note: If both my suggestion of a b |> f c d and Glen-O's of map(lpad 10,A) are adopted, this does create a corner case: (a b) |> f c d => f((x)->a(x,b),c,d). But I think that's tolerable.)

This still has similar issues in terms of operator precedence as a f b. But somehow I think they're more tolerable if you can at least talk about them in terms of the precedence of the operator |>, rather than being the precedence of the ternary operator of `with `.

tkelman commented 8 years ago

Try lpad.(["foo", "bar"], 10) on 0.5. The existing |> isn't exactly loved by all.

jamesonquinn commented 8 years ago

@tkelman: I see the issue, but what's your point? You think we should fix the existing |> before we add extra uses for it? If so, how?

tkelman commented 8 years ago

I personally think we should get rid of the existing |>.

Glen-O commented 8 years ago

Try lpad.(["foo", "bar"], 10) on 0.5. The existing |> isn't exactly loved by all.

I think you've missed the point. Yes, the func.() notation is nice, and bypasses the issue in some situations. But I use the map function as a simple demonstration. Any function that takes a function as argument would be benefited by this setup. As an example, purely to demonstrate my point, you might want to sort some numbers based on their least common multiple with some reference number. Which looks neater and easier to read: sort(A,by=i->lcm(i,10)) or sort(A,by=lcm 10)?

jamesonquinn commented 8 years ago

I'd like to note once again that any way to define infix operators will allow creating an operator that does what Glen-O wants <| to do, so that at worst he'll be able to write something like sort(A,by=lcm |> currywith 10). The point of this page is to discuss how to make some a...f...b => f(a,b). I understand that whether the existing |> or the proposed <| are worthwhile operators has some relationship to that point, but let's try not to get too sidetracked.

Personally, I think the a |> b c proposal is the best one so far. It follows an existing convention from Haskell; it is logically related to the existing |> operator; it is both reasonably readable and reasonably easy-to-type. The fact that I feel that it naturally extends to other uses is secondary. If you disagree, please at least mention your feelings on the core idiom, not just the proposed secondary uses.

JeffBezanson commented 8 years ago

I meant it leads to consistency issues, hence me saying "or it would have to treat existing infix operators differently, which could cause confusion".

I agree it's difficult to decide on the precedence for a f b. For example, `in` clearly benefits from comparison precedence, but it's quite likely many functions used as infix would not want comparison precedence. However, I don't see any consistency issue. Different operators have different precedence. Adding a f b doesn't force our hand to give + and * the same precedence.

jamesonquinn commented 8 years ago

Note that |> already has precedence adjacent to comparison. For any other precedence, frankly, I think parentheses are fine.

If you don't agree with me, and if we were using a |> f b, then there could be similar operators |+>, |*>, and |^>, which worked the same as |>, but had the precedence of their central operator. I think that's overkill but it's a possibility.

toivoh commented 8 years ago

Another possibility for solving the precedence issue is to use a syntax for custom infix operators that includes parentheses of some kind, eg (a f b).

Sacha0 commented 8 years ago

I must ask - how is lpad<|10 any more "ASCII salad" than, say, x|>sin|>exp? Yet the |> notation was added.

I imagine that @tkelman argues

we should get rid of the existing |>.

in part because both lpad<|10 and x|>sin|>exp venture into ASCII-salad territory :).

jamesonquinn commented 8 years ago

I think @toivoh's (a f b), with mandatory parens, is the best proposal so far.

jamesonquinn commented 8 years ago

Related to https://github.com/JuliaLang/julia/issues/11608 (and thus also https://github.com/JuliaLang/julia/issues/4882 and https://github.com/JuliaLang/julia/pull/14653): If (a f b) => f(a,b), then it would make sense if (a @m b) => (@m a b). This would allow replacing the existing special case macro logic for y ~ a*x+b with normal (and thus much more transparent) (y @~ a*x+b).

Also, the "parens required" could be the preferred idiom for concise infix definitions. Instead of saying (to use a stupid example) a + b = string(a) * string(b), you'd be encouraged (by lint tools, or by compiler warnings) to say (a + b) = string(a) * string(b). I realize that this is not actually a direct consequence of choosing the "parens required" option for infix, but it is a convenient idiom that would allow us to warn the people using infix on the LHS mistakenly but lay off of the people doing it on purpose.

oxinabox commented 8 years ago

My feel is currently that if you are applying a function infix (rather than prefix), then it is an operator, and should look and act like an operator.

And we have support for infix operators defined using Unicode, since https://github.com/JuliaLang/julia/issues/552

I guess it might be nice to have that extended so you can add the keywords as in the original suggestion. So we could have, for example, 1 ⊕₂ 1 == 0. Being able to have arbitrary names for your infix seems a bit excessive.
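A sketch of what such a definition could look like, assuming a Julia version (0.7 or later) in which the parser accepts subscript suffixes on operators and gives ⊕₂ the same precedence as ⊕. Reading ⊕₂ as addition modulo 2 is an assumption made for illustration.

```julia
# Hypothetical operator: addition modulo 2, written with a subscript suffix.
⊕₂(a, b) = mod(a + b, 2)

# Plus-like precedence binds tighter than ==, so this reads as (1 ⊕₂ 1) == 0.
println(1 ⊕₂ 1 == 0)
println(1 ⊕₂ 0)
```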

jamesonquinn commented 8 years ago

should look and act like an operator.

I agree that there should be strong naming conventions for infix operators. For instance: one character of unicode, or ends in a preposition. But those should be conventions that develop organically, not requirements enforced by the compiler. Certainly, I don't think that #552 is the end of the story; if there are dozens of hard-coded operators, there should be a way to add more programmatically, if only for prototyping new features.

...

For me, the (a f b) (and (a @m b)) proposal is head and shoulders above the rest of the proposals in this bug. I'm almost tempted to make a patch.

(a f b)=>f(a,b)
(a f b c d)=>f(a,b,c,d)
(a f)=>syntax error
(a+2 f+2 b+2)=>(f+2)(a+2,b+2)
(t1=a t2=f t3=b)=>(t1=f)((t2=a),(t3=b))
(space has lowest possible precedence, as in macros)
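The first two rewrite rules can be approximated today with an ordinary prefix macro, since macro arguments are already separated by spaces. This is only a sketch: the macro name @infix is hypothetical, and it stands in for the proposed parenthesized syntax, which would need parser support.

```julia
# Hypothetical macro: rewrites `@infix a f b c...` into the call f(a, b, c...).
macro infix(a, f, rest...)
    # Build the call expression and escape it so that all names
    # resolve in the caller's scope.
    esc(Expr(:call, f, a, rest...))
end

q = @infix 7 div 2               # expands to div(7, 2)
s = @infix "a" string "b" "c"    # expands to string("a", "b", "c")

println(q)
println(s)
```

Because the rewrite happens at macro-expansion time, there is no extra function call when the code runs.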

...

Would it be inappropriate for me to submit a patch?

diegozea commented 8 years ago

I didn't understand the last two cases:

(a+2 f+2 b+2)=>(f+2)(a+2,b+2)
(t1=a t2=f t3=b)=>(t1=f)((t2=a),(t3=b))

omus commented 8 years ago

I find the (a f b c d) syntax very strange. Since 1 + 2 + 3 can be written as +(1,2,3) then shouldn't f(a,b,c) be written as (a f b f c)?

Overall I'm personally not convinced Julia should support custom infix operators beyond what is currently allowed.

Glen-O commented 8 years ago

I can see two problems with (a f b c d).

First, it will be difficult to read when you've got a more complicated expression - one of the reasons why brackets can be frustrating is that it can often be hard to tell, at a glance, which brackets pair with which other brackets. That's why infix and postfixing (|>) operators are desirable in the first place. Postfixing in particular is liked because it allows a nice, neat left-to-right reading without having to deal with brackets every time.

Second, it leaves no way to nicely do things like make it elementwise. My understanding is that f.(a,b) is going to be a notation in 0.5 to make f operate elementwise on its arguments with broadcasting. There will be no neat way to do the same thing with the infix notation, if it's (a f b). At best, it would have to be (a .f b), which in my view loses the niceness of symmetry that .( affords with .+ and .*.
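For reference, a short sketch of the f.(a,b) notation mentioned here, as it behaves in Julia 0.5 and later: the call is applied elementwise, and scalar arguments are broadcast against the array arguments.

```julia
# Elementwise call: the scalar 10 is broadcast across the array.
lcms = lcm.([4, 6], 10)

# The same notation works for any function, e.g. lpad on each string.
padded = lpad.(["foo", "bar"], 10)

println(lcms)
println(padded)
```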

Compare, for example, the case of wanting to use the example from Haskell. shashi on #6946 made the point that has an equivalent here. In Haskell, you would write circle 10 |> move 0 0 |> animate "scale" "ease". Using this notation, this becomes ((circle(10) move 0 0) animate "scale" "ease"), which isn't any clearer than animate(move(circle(10),0,0),"scale","ease"). And if you wanted to copy your circle to multiple places, using |> notation, you might have circle 10 .|> copy [1 15 50] [3 14 25]. In my view, that is the neatest way to implement the idea - and then, brackets do their normal role of dealing with order of operation issues.

And as I've pointed out, a|>f b c has the benefit of also having a natural extension allowing the same notation to have more use - f b c would parse as "function f with parameters b and c set", and thus would be equivalent to i->f(i,b,c). This allows it to work not just for infixing, but for other situations where you might want to pass a function (especially an inbuilt function) with parameters (noting that the standard is to have the object of the function first).

The |> also makes it clear which one is the function. If you had, say, (tissue wash fire dirty metal), it would be quite hard to, at a glance, recognise wash as the function. On the other hand, tissue|>wash fire dirty metal has a big indicator saying "wash is the function".

jamesonquinn commented 8 years ago

Some of the latest objections sound to me like saying "but you could abuse this feature!" My response is: of course you could. You could already write utterly unreadable code using macros if you wanted. The parser's job is to enable legit uses; to stop abuses, we have conventions/idioms and in some cases delinters. Specifically:

I didn't understand the last two cases:

These are not meant in any way to be an example to follow; they are just showing the natural consequences of the precedence rules. I think both of the last two examples would qualify as abusing the syntax, though (a^2 ಠ_ಠ b^2) => ಠ_ಠ(a^2,b^2) is clear enough.

shouldn't f(a,b,c) be written as (a f b f c)

My proposal of (a f b c d) was, frankly, an afterthought. I think it makes sense, and I could come up with examples where it's useful, but I do not want to hang up this proposal on this issue if it's controversial.

[For instance:

f is an "object method" of an object a, probably complicated, using b, c, and d, probably simpler.
f is a "naturally broadcast" method like push!]

While (a f b f c) would make sense if f were like +, I think that most operators are not actually like +, so it should not be our model.

it will be difficult to read when you've got a more complicated expression

Again, my answer would be, "so don't abuse it".

Say we want some way to write a complicated expression like a / (b + f(c,d^e)) with f infix. In @toivoh's proposal, that would be a / (b + (c f d^e)). In Haskell-like usage, it would be a / (b + (c |> f d^e)) or at "best", if |> precedence was changed to fix this one particular example, a / (b + c |> f d^e). I think that @toivoh's is easily as good here.

(tissue wash fire dirty metal)

I think the solution to this is strong naming conventions for infix operators. For instance, if there were a convention that infix operators should be one character of unicode, or end in a preposition or underscore, then the above would be something like (tissue wash_ fire dirty metal) which is as clear as that expression could ever hope to be.

...

elementwise

This is a valid concern. (a .f b) is a bad idea, because it could be read as ((a.f) b). My first suggestion is (a ..f b) but it doesn't make me very happy.

circle 10 |> move 0 0 |> animate "scale" "ease"

I've used jquery, so I definitely see the advantage of function chaining like that. But I think that it's not the same issue as infix operators. Using the (a f b) proposal, you could write the above as:

circle 10 |> (move <| 0 0) |> (animate <| "scale" "ease")

... which is not quite as terse as the Haskell version, but still pretty readable.

diegozea commented 8 years ago

Maybe it can be limited to only three things inside the ():

(a f (b,c))
.(a f (b,c)) using the operator .(

jamesonquinn commented 8 years ago

Finally, a response to the general point:

Overall I'm personally not convinced Julia should support custom infix operators beyond what is currently allowed.

Obviously we're all entitled to our opinions. (I'm not clear whether the thumbs-up referred to that part of the comment, but if so, that goes triple.)

But my counterarguments are:

Julia already has dozens of infix operators, many of them extremely niche. It is inevitable that more will be proposed. When somebody says "how can you have ⅋ but not §?", I'd much rather respond "do it yourself" and not "wait until the next version is widely adopted".
Something like (a § b) is eminently readable, and the syntax is lightweight enough to learn from one or two examples.
I'm not the first person to raise this issue, and I won't be the last. I understand that language designers should be very very skeptical of creeping (mis)features, because once you add an ugly feature it's basically impossible to fix later. But as I said above, I think (a f b) is clean enough that you won't regret it.

oxinabox commented 8 years ago

I'm really not sure on the clarity of (a f b)

Here is a possible use-case: select((((:emp_id, :last_name) from employee_tbl) where (:city, == ,'indianapolis')) orderby :emp_id);

This is certainly viable use of infix functions. The select function is either the identity function, or sends the built query to the database.

Is this clear code? I just don't know.

jamesonquinn commented 8 years ago

.(a f b)

Yes, that makes sense. But it's not very readable.

Is (a @. f b) more readable? Because the @. macro to enable that would be a simple one-liner.

[[[Come to think of it, if we allowed infix macros without requiring parens, @Glen-O could use them to do what he wants: circle(10) @> move 0 0 @> animate "scale" "ease"=>@> (@> circle(10) move 0 0) animate "scale" "ease" =macro> animate(move(circle(10),0,0),"scale","ease"). I think that solution is uglier than (a f b), but at least it would resolve this overall bug in my eyes.]]]

...

select((((:emp_id, :last_name) from employee_tbl) where (:city, = ,'indianapolis')) orderby :emp_id);

I would definitely rather use a macro for "where" so that the conditional expression didn't have to be strangely quoted. So:

select((((:emp_id, :last_name) from employee_tbl) @where city == 'indianapolis') orderby :emp_id);

The parens are mildly annoying, but on the other hand I see no reasonable way for the parser to deal with this kind of expression without them.

jamesonquinn commented 8 years ago

select((((:emp_id, :last_name) from employee_tbl) @where city == 'indianapolis') orderby :emp_id);

The parens are mildly annoying, but on the other hand I see no reasonable way for the parser to deal with this kind of expression without them.

On second thought, the precedence in that expression is just right to left. So, using infix macros, it could just as well be:

select((:emp_id, :last_name) @from employee_tbl @where city == 'NYC' @orderby :emp_id)

or even:

@select (:emp_id, :last_name) @from employee_tbl @where city == 'NYC' @orderby :emp_id

So while I still like (a f b), I'm beginning to see that infix macros are a good answer too.

Here's the full proposal through examples, followed by the advantages and disadvantages:

main uses:

a @m b => @m a b
a @m b c => @m a b c
a @m b @m2 c => @m2 (@m a b) c
@defineinfix f; a @f b => macro f(a,b...) :(f($a,$b...)) end; @f a b => f(a,b)

Corner cases: (not intended to be good code, just to show how the parser would work)

t1=a @m t2=b t3=c => @m t1=a t2=b t3=c (though this is not good programming style)
t1 + a @m t2 + b => @m t1+a t2+b (though this is not good programming style)
a b @m c => syntax error (??)
a @m b [c,d] => please don't, but @m a b[c,d] (ETA: Nope, with the patch this comes out as @m a b ([c,d]) which is probably better.)
a @m b ([c,d]) => @m a b ([c,d])
[a @m b] => bad style, please use parentheses to clarify, but [a (@m b)] (??)
a @> f b => @> a f b => f(a,b)
@outermacro a b @m c d => @outermacro a (@m b c d)
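
For anyone who wants to try the right-hand sides of these examples today: the prefix forms can be written with ordinary macros, with no parser change. This is only a rough sketch. The name `@infixcall` is a hypothetical stand-in for the proposed `@>` (so `@infixcall a f b` plays the role of `@> a f b`), and `divide`/`@divide` are made-up names illustrating what something like `@defineinfix divide` might generate. The infix spelling `a @divide b` itself would still need the parser patch.

```julia
# Hypothetical stand-in for the proposed `@>`:
# `@infixcall a f b...` expands to the ordinary call `f(a, b...)`.
macro infixcall(a, f, rest...)
    # esc() makes the names resolve in the caller's scope, not the macro's module
    return esc(Expr(:call, f, a, rest...))
end

# Hand-written version of what `@defineinfix divide` might produce (assumption):
divide(a, b) = a / b
macro divide(a, rest...)
    return esc(Expr(:call, :divide, a, rest...))
end

x = 10
r1 = @infixcall x div 3   # expands to div(x, 3)
r2 = @divide x 4          # expands to divide(x, 4)
```

The rewriting happens once, at macro-expansion time, so the generated code is a plain function call with no extra call layer at run time.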

Advantages:

define infix macros, get infix functions for free (with one-time overhead of macro evaluation. That's not quite as low-overhead as parser magic, but much better than having extra function calls every evaluation)
can lead to powerful DSLs, as seen in the SQL-like example above
Removes the need for a separate |> operator, since that's a one-liner macro. Similarly for <| and the rest of @Glen-O's proposals.
explicit, so very low risk of being used by accident, unlike (a f b)
As shown, the @defineinfix macro could allow shorthand use for functions not macros.

(Minor) Disadvantages:

precedence and grouping seem to work well in most cases with RtoL, but there would be exceptions which would require parens.
I think that a @> f b or even a @f b isn't quite as readable as (a f b) (though they're not too horrible either.)

johnmyleswhite commented 8 years ago

Given how active this thread has become, I'm going to remind people of my original concern with this topic: issues about syntax often generate a huge amount of activity, but that amount of activity is generally out of proportion to the long-run value of the change being debated. In large part, that's because threads about syntax end up being close to pure arguments about tastes.

jamesonquinn commented 8 years ago

that amount of activity is generally out of proportion

I'm sorry. I'm probably guiltiest of getting into back-and-forth.

On the other hand, I think this thread has clearly made "useable" progress. Either of the latest suggestions (a f b) or [a @> f b, with a @f b definable as a shortcut] is clearly superior in my view to the earlier suggestions like a %f% b or a |> f <| b.

Still, I think that further back-and-forth comments are probably not going to make any further progress, and I'd encourage people to use thumbs-up or thumbs-down from now on unless they have something truly new to suggest (that is, not just an orthographic change to an existing proposal). I've added "hooray" emoticons (exploding cone) to the "votable proposals". If you believe that we should not have a specialized syntax for arbitrary functions in infix position, then downvote the bug as a whole.

...

ETA: I think that this discussion is now mature enough to get a decision tag.

oxinabox commented 8 years ago

For reference (and I expected someone else to point it out): if you want to embed SQL-like syntax, the right tool for the job is Nonstandard String Literals, I think. Like all macros they have access to all variables in scope when called, and they allow you to specify your own DSL, with your own choice of priority, and they run at compile time.

select((((:emp_id, :last_name) from employee_tbl) where (:city, == ,"indianapolis")) orderby :emp_id);

Is better written

sql"SELECT emp_id, last_name FROM employee_tbl WHERE city == 'indianapolis' ORDER BY emp_id"

Nonstandard string literals are a seriously powerful bit of syntax. I can't find any good examples of them being used for embedding a DSL. But they can do it.

And in this case I think the result is a lot cleaner than any infix operation that can be defined. Though it does have the overhead of having to write your own microparser/tokenizer.
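
As a rough sketch of the mechanism (not code from this thread): defining a macro named `sql_str` is what makes the `sql"..."` literal available. The `SQLQuery` type and the deliberately trivial "is it a SELECT?" check below are assumptions standing in for a real microparser/tokenizer.

```julia
# Hypothetical container for a checked query; not part of any real package.
struct SQLQuery
    text::String
end

# A macro named `sql_str` is invoked by the literal syntax sql"...".
macro sql_str(s)
    # This check runs once, at macro-expansion time, on the raw string.
    startswith(uppercase(lstrip(s)), "SELECT") ||
        error("this sketch only accepts SELECT statements")
    # The generated code just wraps the string; a real DSL would parse it here.
    return :(SQLQuery($s))
end

q = sql"SELECT emp_id, last_name FROM employee_tbl WHERE city == 'indianapolis' ORDER BY emp_id"
```

A literal that does not start with SELECT is rejected when the macro is expanded, before the surrounding code runs, which is the main attraction over assembling the query from ordinary strings.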

I really don't see the need for a decision tag. This has no implementation as a PR, nor any usable prototype that lets people test it out. Contrast with https://github.com/JuliaLang/julia/issues/5571#issuecomment-205754539 with its 8 usable prototypes.

My feelings toward this go up and down every time I read the thread. I don't think I'll really know until I try it. And right now I don't even know what I would use it for. (Unlike some of the definitions for |> and <| which I have used in F#)

jamesonquinn commented 8 years ago

SQL-like syntax, the right tool for the job is Nonstandard String Literals

Whether or not SQL is best done with NSLs, I think there is a level of DSL that is complex enough that inline macros would be very helpful, but not so complex that it's worth writing your own microparser/tokenizer.

right now I don't even know what I would use it for. (Unlike some of the definitions for |> and <| which I have used in F#)

The inline macro proposal would enable people to, among other things, roll their own |>-like or <|-like macros, so you could use it for whatever you've done in F#.

(I don't want to get into back-and-forth bikeshedding arguments, but I was responding anyway because of the below, and I do think that the inline-macro proposal kills multiple birds with one relatively-smooth stone.)

I really don't see the need for a decision tag.

I asked earlier if it was appropriate for me to create a parser patch, and nobody answered. The only word on that so far is:

I don't think there's going to be much in the way of "syntactic innovation" before Julia 1.0.

Which would seem to argue against making a patch now, as it might just sit around and bit-rot. However, now you're saying that it's not worth making a decision on this (including the decision not to decide right now?) unless we have an "implementation as a PR [or] usable prototype".

What does that mean? (What is a PR?) Would a macro that used the character '@' instead of the token @ do the job, so that @testinline a '@'f b => @f(a, b)? Or should I submit a patch to julia-parser.scm? (I've actually started looking into writing such a patch, and it looks as if it should be simple, but my Scheme is very rusty.) Do I need to create test cases?

Right now, there are 13 participants in this bug. There are a total of 5 people who have voted on one or more of the proposals and/or downvoted the bug itself, and only one of those (me) did so after the inline macro proposal was on the table. That doesn't make me confident that it's time for prototyping yet. When the number of people who have voted since the last serious proposal is more like half the number of participants, I hope some kind of rough consensus will be becoming clear, and then it will be time for prototyping and testing and deciding (or, as the case may be, giving up on the idea).

oxinabox commented 8 years ago

By "implementation as a PR [or] usable prototype", I mean something that can be played with, so it can be seen how it feels in practice.

A PR is a pull request; "patch" is the term you've been using for it.

If you made a PR it could be downloaded and tested. More simply, though, if you implemented it with macros or Nonstandard String Literals, it could be tested without having to build Julia.

Like it ain't my call, but I doubt I'll be able to make up my own opinion without something I can play with.

Also +1 to not going back and forth bikeshedding.

JuliaLang / julia

Custom infix operators #16985