JuliaLang / julia

The Julia Programming Language
https://julialang.org/
MIT License

Allow type variables with `do` syntax #54915

Closed PatrickHaecker closed 3 weeks ago

PatrickHaecker commented 2 months ago

The do syntax should support type variables, so that something like

myfunction(bound_parameter) do free_parameter::T where T
    # T should be available here
end

or at least

myfunction(bound_parameter) do (free_parameter::T) where T
    # T should be available here
end

should behave the same as in regular functions regarding T.

There are three reasons why the do syntax should allow type variables:

This is discussed in a Question and a Suggestion. A change proposal is already provided by sgaure (without pull request).

nsajko commented 2 months ago

IMO an issue on an issue tracker should be self-contained. It's OK to link to various previous discussions for context, but you can't expect anyone to click on the provided links to get at the actual meat of the matter.

sgaure commented 2 months ago

I did make a PR to JuliaSyntax. It's a one-liner.

https://github.com/JuliaLang/JuliaSyntax.jl/issues/437

PatrickHaecker commented 2 months ago

IMO an issue on an issue tracker should be self-contained.

Thanks for the hint. I expected the heading and the first sentence to carry the most relevant information, but I agree that this was very terse. I hope it's clearer now with the explicit syntax proposal.

JeffBezanson commented 1 month ago

If this doesn't cause any problems in the parser, looks like an obvious win to me.

c42f commented 1 month ago

Which version is this proposal suggesting?

Requiring the parens seems problematic, because do with parens currently means argument destructuring:

julia> g(f, args...) = f(args...)
g (generic function with 1 method)

julia> g([1,2]) do (x, y)
           @info "Look, they're unpacked" x y
       end
┌ Info: Look, they're unpacked
│   x = 1
└   y = 2

It would not be good if the parentheses mean something different when the where was added.

c42f commented 1 month ago

The fact that argument destructuring is supported in this case is arguably quite confusing and the best long term solution may be to make parentheses irrelevant in do argument lists?

For example here's a very prolific and experienced Julia user getting confused by this: https://github.com/JuliaLang/julia/issues/47661

c42f commented 1 month ago

Making progress toward deprecating destructuring for this case would be a long-term prospect, I suspect. We could assess the current state by counting how many times it occurs in the package ecosystem.

MasonProtter commented 1 month ago

The fact that argument destructuring is supported in this case is arguably quite confusing and the best long term solution may be to make parentheses irrelevant in do argument lists?

IMO that's not viable. People write things like

map(Iterators.product(a, b)) do (x, y)
    #...
end

all the time.

StefanKarpinski commented 1 month ago

The syntax I would consider reasonable here would be:

f(a, b) do x::T, y::T where T
    # body
end

I'm not sure why you'd need to do this though...
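For reference, the constraint this syntax would express can already be written today with a long-form anonymous function. The following is a minimal sketch, not part of the proposal: `combine` is a hypothetical stand-in for `f`, and the body simply returns `T` to show that the type variable is bound.

```julia
# Hypothetical higher-order function standing in for `f(a, b) do ... end`.
combine(f, a, b) = f(a, b)

# Long-form anonymous function: both arguments must share one concrete type T.
same_type = function (x::T, y::T) where {T}
    return T
end

# Both arguments are Int, so T is bound to Int inside the body.
bound_type = combine(same_type, 1, 2)

# Int and Float64 do not share a concrete T, so no method matches.
mixed_is_method_error = try
    combine(same_type, 1, 2.0)
    false
catch err
    err isa MethodError
end
```

With the proposed syntax, the same function would be written inline after `do` instead of being bound to a name first.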

JeffBezanson commented 1 month ago

Yeah I agree the way parens work here was probably a mistake; f() do x, y and f() do (x, y) should have been the same, both 2-argument functions, with f() do ((x,y),) for destructuring. As it is I'm not sure we can do anything about this.
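To make the current behavior concrete, here is a small sketch of how the two forms differ today. The helpers `call_two` and `call_one` are hypothetical names introduced only for this illustration.

```julia
# Hypothetical helpers: one passes two positional arguments, the other one tuple.
call_two(f) = f(1, 2)
call_one(f) = f((1, 2))

# Without parentheses: a two-argument function.
sum_two = call_two() do x, y
    x + y
end

# With parentheses: a one-argument function that destructures its argument.
sum_destructured = call_one() do (x, y)
    x + y
end

# The parenthesized form corresponds to this arrow-syntax lambda.
sum_lambda = call_one(((x, y),) -> x + y)

# Passing the parenthesized form where two arguments are supplied fails,
# because the do block only defines a one-argument method.
parens_with_two_args_fails = try
    call_two() do (x, y)
        x + y
    end
    false
catch err
    err isa MethodError
end
```

So `do (x, y)` today behaves like `((x, y),) -> ...`, not like `(x, y) -> ...`, which is the asymmetry described above.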

c42f commented 1 month ago

We can possibly have the syntax @StefanKarpinski suggested - my guess would be that this is not used in practice for simple reasons of obscurity. However, it's technically breaking and we'd need to guess at whether it's actually breaking based on usage in General :-/

Another option could be to finally do https://github.com/JuliaLang/julia/pull/32071 and make this the only supported way to express where and do together, for now. It's not great, but it might do?

In either of these cases, I'd favor a long-term plan to make the current do+destructuring syntax a warning and eventually change it in "the mythical Julia 2.0". I like that syntax-appreciators like @MasonProtter like it and do understand it perfectly well! But it seems high on confusion and low on utility for the average user.

jariji commented 1 month ago

I'm concerned that a syntax warning wouldn't be actionable because all do ... syntax already has a meaning. For example, in

julia> map(Iterators.product(1:2, 10:10:20)) do (x, y)
           x,y
       end
2×2 Matrix{Tuple{Int64, Int64}}:
 (1, 10)  (1, 20)
 (2, 10)  (2, 20)

SYNTAX WARNING: Destructuring syntax will change. For this use `do ((x,y),)`.

The user is instructed to use do ((x,y),). But the user can't do that, because do ((x,y),) already means destructuring yet another level:

julia> map(Iterators.product(1:2, 10:10:20)) do ((x, y),)
           x,y
       end
ERROR: BoundsError: attempt to access Int64 at index [2]

c42f commented 1 month ago

all do ... syntax already has a meaning

Good point. To make any progress on syntax issues like this and others, we probably need to adopt something like Rust Editions which allow module-local breaking syntax changes. They allow carefully considered breaking changes without bifurcating the ecosystem:

When creating editions, there is one most consequential rule: crates in one edition must seamlessly interoperate with those compiled with other editions.

JuliaSyntax already has a system for version-aware parsing. The main thing would be to add the edition to Project.toml (presumably) and we could do something like this on an opt-in basis.

MasonProtter commented 1 month ago

I like the idea of syntax editions, but IMO, that information should be in the file itself, not the Project.toml.

One thing I like a lot about Julia is that the meaning of code is almost always self contained in the code itself, rather than being modified in disconnected configuration files.

I don't want to share a code snippet saying

map(Iterators.product(1:2, 10:10:20)) do (x, y)
    x,y
end

and have people not know what it does without me also supplying the Project.toml.

Rather, I'd like something more like

using JuliaSyntax
JuliaSyntax.@set_feature do_parens=v2

map(Iterators.product(1:2, 10:10:20)) do (x, y)
    x,y
end

or something like that.

c42f commented 1 month ago

IMO it's important to have "all or nothing" for syntax editions which are designed to "improve" syntax in the sense of making it less confusing: there needs to be some incentive to drive the ecosystem forward so that everyone is using the latest syntax, where possible. See https://github.com/JuliaLang/julia/issues/54903#issuecomment-2237718876. So we shouldn't have fine-grained options like JuliaSyntax.@set_feature do_parens=v2.

Also, that's somewhat problematic from a semantic standpoint: there might not be any module for @set_feature to act on (it would have to amount to "special syntax" recognized by the parser itself, more like a pragma than a macro). See also https://github.com/JuliaLang/julia/issues/54903 for reasons why Project.toml is a good place for this kind of thing.

o314 commented 3 weeks ago

Things may become strange IIUC, since parentheses may be peeled differently depending on whether we use a do block (right side of the call) or not (left side of the call). Here is another proposal:

f(a, b) do x, y
    #= ... =#
end
# may be <=> to
f(a, b) do x, y ->      # recycle opener - leanified
    #= ... =#
end

# THEN

# type welcome
f(a, b) do x::Int, y ->
    #= ... =#
end

# kwarg welcome too
f(a, b) do x, y; u ->
    #= ... =#
end

# destructuring ok (same old form)
f(a, b) do (x, y); u ->
    #= ... =#
end

# w type params
f(a, b) do x::T, y::T; u ->
    #= ... =#
end where {T}

# !!! special point . rhs is listof ; but do is already new line sensitive
f(a, b) do x::Int, y ->
    #= ... =#
end

LilithHafner commented 3 weeks ago

From triage:

This makes sense from a theoretical/consistency perspective, but we don't think it's worth doing practically.

In a perfect world, this would work, but the do form is already really messed up and we don't want to make it more complicated and create even harder to understand forms.

Most of the time the type is only used statically, and in that case typeof, eltype, ndims, etc. are better. When you really need the type variable, the workaround is trivial: you can always write the anonymous function directly.

The primary (though by no means exclusive) use of type variables in function declarations is for dispatch and folks almost never define multiple methods on a function declared with do syntax.

Ultimately, the practical cost of increased confusion likely outweighs the practical benefit.
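The two alternatives mentioned in the triage notes can be sketched as follows. This is an illustrative example rather than code from the discussion; `apply_first` is a hypothetical helper, and `zero` merely stands in for any use of the type.

```julia
# Hypothetical higher-order function used only for this illustration.
apply_first(f, xs) = f(first(xs))

# Option 1: keep the do block and recover the type with typeof.
via_typeof = apply_first([1.5, 2.5]) do x
    zero(typeof(x))
end

# Option 2: write the anonymous function directly, where `where` is allowed.
zero_like = function (x::T) where {T}
    return zero(T)
end
via_where = apply_first(zero_like, [1.5, 2.5])
```

Both calls return the same value here; the second form is the one to reach for when the type variable is needed in the signature itself, for example to constrain several arguments at once.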

PatrickHaecker commented 3 weeks ago

Thanks for the nuanced consideration and the well written summary.