JuliaLang / julia

The Julia Programming Language
https://julialang.org/
MIT License

Alternative syntax for `map(func, x)` #8450

kmsquire closed 8 years ago

kmsquire commented 10 years ago

This was discussed in some detail here. I was having trouble finding it, and thought it deserved its own issue.

quinnj commented 10 years ago

+1

toivoh commented 10 years ago

Or func.(args...) as syntactic sugar for

broadcast(func, args...)

But maybe I'm the only one who would prefer that? Either way, +1.
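For reference, the equivalence proposed here can be checked directly in current Julia, where the `f.(args...)` spelling was eventually adopted. This is a minimal sketch; `hyp` is a made-up helper used only for illustration.

```julia
# Hypothetical two-argument function, defined only for scalars.
hyp(a, b) = sqrt(a^2 + b^2)

xs = [3.0, 6.0, 9.0]
ys = [4.0, 8.0, 12.0]

# Dot-call syntax and an explicit broadcast call give the same result.
dotted = hyp.(xs, ys)
explicit = broadcast(hyp, xs, ys)

# A scalar argument is reused for every element.
scaled = hyp.(xs, 4.0)
```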

ihnorton commented 10 years ago

:-1: If anything, I think Stefan's other suggestion of f[...] has a nice similarity to comprehensions.

johnmyleswhite commented 10 years ago

Like @ihnorton, I'm also not super fond of this idea. In particular, I dislike the asymmetry of having both a .+ b and sin.(a).

quinnj commented 10 years ago

Maybe we don't need special syntax. With JuliaLang/julia#1470, we could do something like

call(f::Callable,x::AbstractArray) = applicable(f,x) ? apply(f,x) : map(f,x)

right? Perhaps this would be too magical though, to get auto-map on any function.
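The `call` and `apply` names in that one-liner belong to the Julia of the time. A rough sketch of the same auto-map idea in current Julia might look like the following, assuming an explicit wrapper type rather than a method on every function. `AutoMap` and `double` are hypothetical names, not part of Base.

```julia
# Hypothetical wrapper type; not part of Base.
struct AutoMap{F}
    f::F
end

# Call the wrapped function directly if a method accepts the whole array,
# otherwise fall back to mapping over the elements.
(am::AutoMap)(x::AbstractArray) = applicable(am.f, x) ? am.f(x) : map(am.f, x)
(am::AutoMap)(x) = am.f(x)

double(x::Number) = 2x                # hypothetical scalar-only function

mapped = AutoMap(double)([1, 2, 3])   # no array method, so this maps
direct = AutoMap(length)([1, 2, 3])   # length accepts arrays, so no mapping
```

The two results show why this could feel "too magical": whether the call maps depends on which methods happen to exist for the wrapped function.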

JeffBezanson commented 10 years ago

@quinnj That one line summarizes my greatest fears about allowing call overloading. I won't be able to sleep for days.

JeffBezanson commented 10 years ago

Not yet sure it's syntactically possible, but what about .sin(x)? Is that more similar to a .+ b?

I think [] is getting way too overloaded and will not work for this purpose. For example, we'll probably be able to write Int(x), but Int[x] constructs an array and so cannot mean map.

johnmyleswhite commented 10 years ago

I would be onboard with .sin(x).

StefanKarpinski commented 10 years ago

We'd have to claw back some syntax for that, but if Int(x) is the scalar version then Int[x] is reasonable by analogy to construct a vector of element type Int. In my mind the opportunity to make that syntax more coherent is actually one of the most appealing aspects of the f[v] proposal.

JeffBezanson commented 10 years ago

How does making f[v] syntax for map make the syntax more coherent? I don't understand. map has a different "shape" than the current T[...] array constructor syntax. What about Vector{Int}[...]? Wouldn't that not work?

quinnj commented 10 years ago

lol, sorry for the scare @JeffBezanson! Haha, the call overloading is definitely a little scary, every once in a while, I think about the kinds of code obfuscation you can do in julia and with call, you could do some gnarly stuff.

I think .sin(x) sounds like a good idea too. Was there consensus on what to do with multi-args?

jakebolewski commented 10 years ago

:-1: I don't think saving a couple of characters compared to using higher-order functions is worth the cost in readability. Can you imagine a file with .func() / func.() and func() interspersed everywhere?

JeffBezanson commented 10 years ago

It seems likely we'll remove the a.(b) syntax anyway, at least.

kmsquire commented 10 years ago

Wow, talk about stirring up a bees' nest! I changed the name to better reflect the discussion.

JeffBezanson commented 10 years ago

We could also rename 2-argument map to zipWith :)

ihnorton commented 10 years ago

If some syntax is really necessary, how about [f <- b] or another pun on comprehensions inside the brackets?

(@JeffBezanson you're just afraid that someone is going to write CJOS or Moose.jl :) ... if we get that feature, just put it in the "Don't do stupid stuff: I won't optimize that" section of the manual)

StefanKarpinski commented 10 years ago

Currently writing Int[...] indicates that you are constructing an array of element type Int. But if Int(x) means converting x to Int by applying Int as a function then you could also consider Int[...] to mean "apply Int to each thing in ...", oh which by the way happens to produce values of type Int. So writing Int[v] would be equivalent to [ Int(x) for x in v ] and Int[ f(x) for x in v ] would be equivalent to [ Int(f(x)) for x in v ]. Of course, then you've lost some of the utility of writing Int[ f(x) for x in v ] in the first place – i.e. that we can statically know that the element type is Int – but if we enforce that Int(x) must produce a value of type Int (not an unreasonable constraint), then we could recover that property.
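The equivalences described above can be spelled out with comprehensions that already work. This is only a sketch of the proposed reading, not of any implemented `Int[v]` behavior; `f` and `v` are illustrative.

```julia
v = [1.0, 2.0, 3.0]
f(x) = 2x                     # hypothetical scalar function

# Existing meaning: a comprehension with a declared element type.
typed = Int[f(x) for x in v]

# The reading proposed above: apply Int to each produced value.
applied = [Int(f(x)) for x in v]

# Int[v] under the proposal would correspond to this comprehension.
converted = [Int(x) for x in v]
```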

JeffBezanson commented 10 years ago

Strikes me as more vectorization/implicit-cat hell. What would Int[x, y] do? Or worse, Vector{Int}[x]?

StefanKarpinski commented 10 years ago

I'm not saying it's the best idea ever or even advocating for it – I'm just pointing out that it doesn't completely clash with the existing usage, which is itself a bit of a hack. If we could make the existing usage part of a more coherent pattern, that would be a win. I'm not sure what f[v,w] would mean – the obvious choices are [ f(x,y) for x in v, y in w ] or map(f,v,w) but there are still more choices.
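The two "obvious choices" for `f[v,w]` give results of different shape, which a small example makes concrete. The function `f` here is hypothetical.

```julia
f(x, y) = 10x + y             # hypothetical two-argument function
v = [1, 2]
w = [3, 4]

# Reading 1: every combination of elements, a 2×2 matrix.
outer = [f(x, y) for x in v, y in w]

# Reading 2: elements paired up, a 2-element vector.
paired = map(f, v, w)
```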

jakebolewski commented 10 years ago

I feel that a.(b) is hardly used. Ran a quick test and it is used in only 54 of the ~4,000 julia source files in the wild: https://gist.github.com/jakebolewski/104458397f2e97a3d57d.

JeffBezanson commented 10 years ago

I think it does completely clash. T[x] has the "shape" T --> Array{T}, while map has the shape Array{T} --> Array{S}. Those are pretty much incompatible.

To do this I think we'd have to give up T[x,y,z] as a constructor for Vector{T}. Plain old array indexing, A[I] where I is a vector, can be seen as map(i->A[i], I). "Applying" an array is like applying a function (of course matlab even uses the same syntax for them). In that sense the syntax really works, but we would lose typed-vector syntax in the process.
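The observation that indexing with a vector behaves like a map over the indices can be checked directly. The values below are illustrative.

```julia
A = [10, 20, 30, 40]
idxs = [4, 1, 1]              # a vector of indices

# Indexing with a vector of indices ...
indexed = A[idxs]

# ... gives the same result as mapping scalar indexing over them.
mapped = map(i -> A[i], idxs)
```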

johnmyleswhite commented 10 years ago

I kind of feel like debating syntax here distracts from the more important change: making map fast.

JeffBezanson commented 10 years ago

Obviously making map fast (which, by the way, needs to be part of a fairly thorough redesign of the notion of functions in julia) is more important. However going from sin(x) to map(sin, x) is very significant from a usability perspective, so to really kill vectorization the syntax is quite important.

nalimilan commented 10 years ago

> However going from sin(x) to map(sin, x) is very significant from a usability perspective, so to really kill vectorization the syntax is quite important.

Fully agreed.

toivoh commented 10 years ago

I agree with @JeffBezanson that f[x] is pretty much irreconcilable with the current typed array constructions Int[x, y] etc.

toivoh commented 10 years ago

Another reason to prefer .sin over sin. is that it would finally allow using e.g. Base.(+) to access the + function in Base (once a.(b) is removed).

kmsquire commented 10 years ago

When a module defines its own sin (or whatever) function and we want to use that function on a vector, do we do Module..sin(v)? Module.(.sin(v))? Module.(.sin)(v)? .Module.sin(v)?
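For comparison, with the `f.(x)` spelling as it exists in current Julia, the module qualifier simply goes in front of the function name. A small sketch, assuming a hypothetical module `Trig` whose `sin` works in degrees:

```julia
module Trig
# Hypothetical module with its own sin, taking degrees.
sin(x) = Base.sin(deg2rad(x))
end

v = [0.0, 90.0]

own = Trig.sin.(v)              # the module's function, element-wise
base = Base.sin.(deg2rad.(v))   # Base's function on converted input
```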

StefanKarpinski commented 10 years ago

None of these options really seem good anymore.

binarybana commented 10 years ago

I feel like this discussion misses the meat of the problem. I.e., when mapping single-argument functions over containers, I feel like the map(func, container) syntax is already clear and succinct. Instead, it is only when dealing with multiple arguments that I feel we could benefit from better syntax for currying.

Take for example the verbosity of map(x->func(x,other,args), container), or chain a filter operation to make it worse: filter(x->func2(x[1]) == val, map(x->func1(x,other,args), container)).

In these cases, I feel a shortened map syntax would not help much. Not that I think these are particularly bad, but a) I don't think a short-hand map would help much and b) I love pining after some of Haskell's syntax. ;)

IIRC, in Haskell the above can be written filter ((==val) . func2 . fst) $ map (func1 other args) container with a slight change in the order of arguments to func1.
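To make the Julia version of that pipeline concrete, here is a runnable sketch. Every definition below (`func1`, `func2`, `other`, `args`, `val`, `container`) is a hypothetical stand-in chosen only so the expression evaluates.

```julia
# Hypothetical stand-ins for the names used in the comment above.
func1(x, other, args) = (x + other, args)   # returns a tuple
func2(y) = y % 2
other, args, val = 1, "tag", 0
container = [1, 2, 3, 4]

# The nested form from the comment above.
nested = filter(x -> func2(x[1]) == val,
                map(x -> func1(x, other, args), container))

# The same pipeline, split into named steps.
step1 = map(x -> func1(x, other, args), container)
step2 = filter(t -> func2(first(t)) == val, step1)
```

Both anonymous functions exist only to fix the extra arguments, which is the currying verbosity being described.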

rfourquet commented 10 years ago

In elm .func is defined by x->x.func and this is very useful, see elm records. This should be considered before taking this syntax for map.
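In Julia terms, the Elm accessor `.x` corresponds to an anonymous function that reads a field. A minimal sketch, with a hypothetical `Point` type standing in for an Elm record:

```julia
struct Point                  # hypothetical record-like type
    x::Int
    y::Int
end

pts = [Point(1, 2), Point(3, 4)]

# What Elm's `.x` denotes, spelled as an anonymous function.
get_x = p -> p.x
xs = map(get_x, pts)
```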

StefanKarpinski commented 10 years ago

I like that.

StefanKarpinski commented 10 years ago

Although field access isn't such a big deal in Julia as in many languages.

rfourquet commented 10 years ago

Yes it feels less relevant here as fields in Julia are kind of more for "private" use. But with the ongoing discussion about overloading field access, that may become more sensible.

nalimilan commented 10 years ago

f.(x) looks like the least problematic solution, if it weren't for the asymmetry with .+. But keeping the symbolic association of . to "element-wise operation" is a good idea IMHO.

rfourquet commented 10 years ago

If current typed array construction can be deprecated, then func[v...] can be translated to map(func, v...), and the literal arrays can then be written T[[a1, ..., an]] (instead of current T[a1, ..., an]).

rfourquet commented 10 years ago

I find sin∘v quite natural also (when an array v is seen as an application from indexes to contained values), or more simply sin*v or v*[sin]' (which requires defining *(x, f::Callable)) etc.

nalimilan commented 10 years ago

Coming back to this issue with a fresh mind, I realized f.(x) can be seen as a quite natural syntax. Instead of reading it as f. and (, you can read it as f and .(. .( is then metaphorically an element-wise version of the ( function call operator, which is fully consistent with .+ and friends.

johnmyleswhite commented 10 years ago

The idea of .( being a function call operator makes me very sad.

nalimilan commented 10 years ago

@johnmyleswhite Care to elaborate? I was speaking about the intuitiveness of the syntax, or its visual consistency with the rest of the language, not about the technical implementation at all.

johnmyleswhite commented 10 years ago

To me, ( isn't part of the language's semantics at all: it's just part of the syntax. So I wouldn't want to have to invent a way for .( and ( to start differing. Does the former generate a multicall Expr instead of a call Expr?

nalimilan commented 10 years ago

No. As I said I wasn't implying at all there should be two different call operators. Just trying to find a visually consistent syntax for element-wise operations.

StefanKarpinski commented 10 years ago

To me what kills these options is the question of how to vectorize multi-argument functions. There's no single way to do it and anything that's general enough to support every possible way starts to look a lot like the multidimensional array comprehensions that we already have.

JeffBezanson commented 10 years ago

It's quite standard for multi-argument map to iterate over all arguments. If we did this I would make .( syntax for a call to map. That syntax might not be so great for various reasons, but I'd be fine with these aspects.

nalimilan commented 10 years ago

The fact that several generalizations are possible for multi-argument functions cannot be an argument against supporting at least some special cases -- just like matrix transpose is useful even if it can be generalized in several ways for tensors.

We just need to choose the most useful solution. Possible choices have already been discussed here: https://github.com/JuliaLang/julia/issues/8389#issuecomment-55953120 (and following comments). As @JeffBezanson said the current behavior of map is reasonable. An interesting criterion is to be able to replace @vectorize_2arg.

johnmyleswhite commented 10 years ago

My point is that having sin.(x) and x .+ y coexist is awkward. I'd rather have .sin(x) -> map(sin, x) and x .+ y -> map(+, x, y).

JeffBezanson commented 10 years ago

.+ actually uses broadcast.
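The distinction matters once the arguments differ in shape. A short sketch of where `map` and broadcasting agree and where broadcasting goes further, using illustrative values:

```julia
col = [1, 2, 3]
row = [10 20]                 # 1×2 matrix

# Equal shapes: map and broadcast agree.
same_map = map(+, col, [10, 20, 30])
same_dot = col .+ [10, 20, 30]

# Different shapes: broadcast extends singleton dimensions.
grid = col .+ row             # 3×2 matrix
plus_scalar = col .+ 10       # scalar reused for every element
```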

Some other ideas, out of pure desperation:

  1. Overload colon, sin:x. Doesn't generalize well to multiple arguments.
  2. sin.[x] --- this syntax is available, currently meaningless.
  3. sin@x --- not as available, but maybe possible

StefanKarpinski commented 10 years ago

I'm really not convinced that we need this.

JeffBezanson commented 10 years ago

Me neither. I think f.(x) is kind of the best option here, but I don't love it.

nalimilan commented 10 years ago

But without this how can we avoid creating all sorts of vectorized functions, in particular things like int()? This is what prompted me to start this discussion in https://github.com/JuliaLang/julia/issues/8389.

johnmyleswhite commented 10 years ago

We should encourage people to use map(func, x). It's not that much typing and it's immediately clear to anyone coming from another language.