JuliaLang / julia

The Julia Programming Language
https://julialang.org/

headless anonymous function (->) syntax #38713

Open rapus95 opened 3 years ago

rapus95 commented 3 years ago

https://github.com/JuliaLang/julia/pull/24990 has stalled on the question of what the right amount of tight capturing is, so here is an alternative.

Idea

I want to propose a headless -> variant which has the same scoping mechanics as (args...)-> but automatically collects all not-yet-captured underscores into an argument list. EDIT: Nesting will follow the same rules as variable shadowing, that is, the underscore binds to the tightest headless -> it can find.

Before                            After
lfold((x,y)->x+2y, A)             lfold(->_+2_,A)
lfold((x,y)->sin(x)-cos(y), A)    lfold(->sin(_)-cos(_), A)
map(x->5x+2, A)                   map(->5_+2,A)
map(x->f(x.a), A)                 map(->f(_.a),A)

Advantage(s)

In small anonymous functions, underscores as variables can increase readability, since they stand out much more than ordinary letters. For multi-argument cases, such as anonymous functions passed to reduce/lfold, the syntax can even save a decent number of characters. Overall it reads very intuitively as "start here, and whatever arguments you get, just drop them into the slots from left to right".

      -> ---| -----|
            V      V
lfold(->sin(_)-cos(_), A)

Sure, some more complex cases such as reordering ((x,y)->(y,x)), ellipsing ((x...)->x), and probably a few others won't be possible, but if everything were possible in the headless variant, we wouldn't have introduced the head in the first place.
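
As a rough illustration of the left-to-right slot filling described above, here is a minimal sketch written as an ordinary macro rather than as parser syntax. The name @headless is hypothetical and not part of the proposal. The sketch ignores nesting (it does not stop at an inner ->), and it spells out 2 * _ rather than relying on how 2_ would parse.

```julia
# Hypothetical macro emulating the proposed headless `->`.
# Assumption: every `_` in the body becomes a new positional argument,
# collected in left-to-right order. Nesting rules are NOT implemented.
macro headless(body)
    args = Symbol[]
    function replace_slots(ex)
        if ex === :_
            name = gensym("slot")      # fresh argument name for this slot
            push!(args, name)
            return name
        elseif ex isa Expr
            # Rebuild the expression, visiting children from left to right.
            return Expr(ex.head, Any[replace_slots(a) for a in ex.args]...)
        else
            return ex                  # literals, other symbols, etc.
        end
    end
    new_body = replace_slots(body)
    return esc(Expr(:->, Expr(:tuple, args...), new_body))
end

f = @headless sin(_) - cos(_)          # behaves like (x, y) -> sin(x) - cos(y)
g = @headless 5 * _ + 2                # behaves like x -> 5x + 2
total = foldl(@headless(_ + 2 * _), [1, 2, 3])
```

Reordering and ellipsing are not expressible here either, which matches the limitation noted above.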

Feasibility

1) Both a leading -> and an _ on the right-hand side (value side) are errors on 1.5, so this shouldn't be breaking.
2) Since it uses the well-defined scoping of ordinary anonymous functions, it should be easy to
   2a) switch between both variants mentally
   2b) reuse most of the current parser code and just extend it to collect/replace underscores

Compatibility with #24990

It shouldn't clash with the result of #24990 because that focuses more on ~tight single argument~ very tight argument cases. And even if you are in a situation where the headless -> consumes an underscore from #24990 unintentionally, it's enough to just put 2 more characters (->) in the right place to make that underscore once again standalone.

Further Explorations

This proposal can optionally be combined with https://github.com/JuliaLang/julia/pull/53946. Additionally, the following links to comments further down explore different ideas to stretch into, all adding their own value to different parts of the ecosystem. Alternative explorations: https://github.com/JuliaLang/julia/issues/38713#issuecomment-1436118670 https://github.com/JuliaLang/julia/issues/38713#issuecomment-1188977419

aplavin commented 1 year ago

This basically proposes $ instead of _ as the single argument placeholder, right?

@aplavin no, if you look at the code you quoted they are suggesting that -> $a means x -> x.a, not that $ is used as an alternative for _.

Whew, I missed that there's no dot between $ and a indeed! Why?.. Well, $a is even worse IMO: it's totally ad-hoc special syntax, and doesn't work with function-based data accessors like -> max(real(_), 0) would.

rapus95 commented 1 year ago

I personally don't like that the idea is to dedicate a syntax to replace x with an underscore. Because that's all that your proposal could do. IMO there should be more benefit in this. And I'd like to have something that's compatible with DataFrames.jl and the higher order functions (especially able to create binary functions).

pablosanjose commented 1 year ago

I'm not sure I understand how the proposed -> delimiter disambiguates the boundaries of the lambda. Unless I'm very confused (which could well be), I should parse f(-> 2_, 2) as f(x->2x, 2), f(->_,_) as f((x,y) -> (x,y)) (or f(x-> (x,x)) depending on who you ask), g |> f(->_, _) as g |> z-> f(x->x, z) and g |> f(->_, 2) as g |> f(x->x, 2)? Or do commas also act as rightmost boundaries?

Although I really would like to have concise lambdas, I find that anything that is not 100% obvious and transparent (why should two visually identical _ represent different arguments?) would only make Julia syntax worse for most people (who not only write, but also read code). My 2 cents.

EDIT: Mmm, perhaps I indeed misunderstood, and f(->_,_) should be a syntax error, just like x->x,x.

tpapp commented 1 year ago

how the proposed -> delimiter disambiguates the boundaries of the lambda

My understanding is that it is the same as (args...) -> body... now, just without the arguments.

and f(->_,_) should be a syntax error, just like x->x,x.

But that isn't:

julia> let x = 1
       x -> x,x
       end
(var"#5#6"(), 1)
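
One way to see why this is a tuple rather than a syntax error is to inspect the parsed expression. The snippet below is only a sketch using Meta.parse from Base; it assumes the parser treats a top-level comma the same way as inside the let block above.

```julia
# Parse the text without evaluating it and look at the expression structure.
ex = Meta.parse("x -> x, x")

# The outermost node is the tuple; the lambda is only its first element,
# so the body of `->` ends at the comma.
outer_head  = ex.head
first_head  = ex.args[1].head
second_item = ex.args[2]
```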

rapus95 commented 1 year ago

I'll keep producing ideas, hoping there'll eventually be one that serves most of us. What about:

->_+_ == x->x+x
1->_+_ == error
2->_+_ == (x, y)->x+y
3->_+_ == (x, y, z)->x+y

Then I'd have my binary functions by prepending a 2, and also something that works with DataFrames.jl, while you still have your syntax available.

aplavin commented 1 year ago

I personally don't like that the idea is to dedicate a syntax to replace x with an underscore.

But you proposed to replace x. with $ :)

I don't think this kind of syntax is necessary at all in Julia; I agree with @tpapp and others above on that. Even more so given @MasonProtter's recent finding about macros. @-> _.a + _.b is basically as concise and readable as -> _.a + _.b, especially if a single-character macro name is chosen instead. Everyone reading Julia code is already used to macros being preceded by @ and affecting the following code in some way.

If this macro behavior is indeed officially supported, it may become popularized and see more usage in packages.

Because that's all that your proposal could do.

I don't propose to add -> _.a + _.b syntax! It's just when comparing -> syntax variants that use the underscore to mean a single thing vs multiple different things, I definitely prefer the former. Empirical evidence shows the same: many packages use _ to mean the same thing in the same scope, vs none where _ means different things each time.

And I'd like to have something that's compatible with DataFrames.jl and the higher order functions

DataFrames often chooses unique special syntax that has no equivalent elsewhere in Julia. The design of that and other packages isn't set in stone, and can be influenced by Julia changes. For example, at some point they may decide to replace transform(df, [:a, :b] => ByRow((a,b) -> a+b)) with transform(df, [:a, :b] => ByRow(->_.a+_.b)) to be more consistent with other tables/collections.

In higher-order data processing functions with a regular Julian interface, "underscore = the only argument" is most often convenient.

tpapp commented 1 year ago

I'll keep producing ideas in the hope there'll once be one that serves most of us.

My problem with this family of proposals is the meandering brainstorming that they degenerate into. Discussion is fine, and people should of course comment if they want to make a point, but if that leads to major changes, a new issue should be opened IMO.

Reading a comment stream which has various proposals floating around without a resolution is very confusing, especially in a discussion that goes on for years.

rapus95 commented 1 year ago

I personally don't like that the idea is to dedicate a syntax to replace x with an underscore.

But you proposed to replace x. with $ :)

Yes, as a complementary proposal to extend the original one for better synergy, not as the only feature.

MasonProtter commented 1 year ago

I think this syntax should be a macro for the following reasons:

  1. It can be a macro.
  2. It being a macro would let it do more controversial things without needing buy-in from everyone, and without overcomplicating the language base syntax. For example,
    • We can use _1, _2, _14, etc. to refer to the 1st, 2nd, and 14th arguments of the function respectively
    • Things like the proposal to use $a to mean _.a could be considered whereas it is pretty much out of the question for surface syntax.
  3. This being a macro would make it less conflicting and less problematic to have A{_} mean A{T} where T.
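
To make the numbered-placeholder idea in point 2 concrete, here is a minimal sketch of such a macro. The name @numbered is hypothetical and is not the API of any existing package. It only assumes that _1, _2, ... appear as plain identifiers and derives the arity from the highest index it sees.

```julia
# Hypothetical macro: `_1`, `_2`, ... refer to the 1st, 2nd, ... argument.
macro numbered(body)
    maxidx = Ref(0)
    # Find the largest index N used as `_N` anywhere in the body.
    function scan(ex)
        if ex isa Symbol
            m = match(r"^_(\d+)$", string(ex))
            if m !== nothing
                maxidx[] = max(maxidx[], parse(Int, m.captures[1]))
            end
        elseif ex isa Expr
            foreach(scan, ex.args)
        end
        return nothing
    end
    scan(body)
    # Declare arguments _1 ... _N, so the body can use them unchanged.
    args = [Symbol("_", i) for i in 1:maxidx[]]
    return esc(Expr(:->, Expr(:tuple, args...), body))
end

swap  = @numbered (_2, _1)    # reordering, like (x, y) -> (y, x)
third = @numbered _3          # takes three arguments, returns the last
```

Because the arguments are named explicitly, this form can reorder or repeat them, at the cost of being slightly longer than a bare underscore.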

bramtayl commented 1 year ago

I've had a somewhat similar macro in LightQuery for a few years and it didn't seem to catch on (I haven't really been maintaining it). It uses a double underscore instead of _2, but I kind of like the numbers better. Having an officially supported macro in base (or ideally, two, @-> and @|>) would be pretty nice.

MasonProtter commented 1 year ago

Ah yes, I had forgotten about the one in LightQuery.jl. As far as I recall, people didn't like that you had to write

map(@_(1 + _), v)

instead of

map(@_ 1 + _, v)

since @_(1+_) is actually longer than x->1+x, so maybe that was a barrier to adoption?

bramtayl commented 1 year ago

map(@_ 1 + _, v) is nicer for sure!

tpapp commented 1 year ago

@MasonProtter:

being a macro would let it do more controversial things without needing buy-in from everyone

Indeed, not having to put this in the core language at this point would allow fuller exploration of this syntax without the usual constraints of having stuff in Julia. So this would be a great advantage.

Thanks for making a package!