Open timholy opened 1 year ago
IMO, the correct 2.0 solution is getting rid of all the other ones like sum(f, x)
which only exist because we don't currently have a good way of doing reductions generically.
AFAICT it's just a specialization bottleneck. We could fix that today, but the cost is longer compile times.
But that wouldn't change the fact that argmax
means two different things. Are you saying that if we got rid of the others, then the disconnect isn't so bad? It's certainly worse because of the analogy to other higher-order functions, but I'm not sure it's great even if we do follow your proposal.
In my opinion, the methods like sum(f, x)
and maximum(f, x)
are unnecessary. They are using valuable methods just as syntactical sugar for map+sum and map+maximum, respectively. As I see it, the mapping operation is orthogonal, and need not be extra methods of all these higher order functions.
@jakobnissen, see https://github.com/JuliaLang/julia/issues/48502#issuecomment-1414227664
Worth linking the previous related discussions?
Also https://github.com/JuliaLang/julia/pull/41339 with some summarized discussion of the previous discussions
Given that the point made by @oscardssmith and @jakobnissen is natural but misses the main point, maybe it's worth thinking about this way: how would you express the 1-arg method in terms of the 2-arg method? The answer is
argmax(itr) = argmax(idx -> itr[idx], CartesianIndices(itr))
Note that the itr
moves into the mapping function and only barely contributes to the collection being iterated over. To me this is a clear sign that these are actually two different functions.
Given that the point made by @oscardssmith and @jakobnissen is natural but misses the main point, maybe it's worth thinking about this way: how would you express the 1-arg method in terms of the 2-arg method? The answer is
argmax(itr) = argmax(idx -> itr[idx], CartesianIndices(itr))
Note that the
itr
moves into the mapping function and only barely contributes to the collection being iterated over. To me this is a clear sign that these are actually two different functions.
I suggested something similar, though slightly less general, in one of those posts.
That's totally a footgun!
I also tried to convince others not to introduce argmax(f, A)
but failed. See a long discussion at https://discourse.julialang.org/t/findmax-and-friends-confusing-behaviour-to-be-introduced-in-1-7/61904.
There's direct empirical evidence that it's a footgun: even popular packages sometimes implement argmax wrong, even after almost two years of argmax(f, A)
- see the end of that discourse thread.
If starting from scratch: an intuitive, unambiguous, and consistent interface could be indmax([f], X)
/keymax
to return the index of the maximum, and maximum(A, [by=f])
to return the maximizing element. Leaving argmax
to specialized optimization/minimization packages.
This has been discussed many times. I would summarize as
argmax
is defined as an operation on a function, not a collection/set, and argmax(f, s)
matches the mathematical definition, so it should staysum(f, s)
etc. are non-standard and are hacks, those were mistakesindmax
seems like a good compromise; there is no particular downside except "too many functions"Agreed that argmax(f, s)
is the better definition, even though it was added later.
I'm really surprised to see people complain about sum(f, itr)
. I think that functions like that are fantastic because new users typically don't understand what the hell map
, reduce
, foldl
and filter
are and it takes a surprisingly long time to wrap ones head around them. Supplying sum(f, itr)
is a really nice stepping stone IMO.
But even as an experienced user I find those functions really convenient and natural.
@MasonProtter If Julia had concise reduction syntax, it could be easier to use even than sum(f, itr)
. But I don't want to derail the thread - happy to continue elsewhere.
argmax
has two meanings:argmax(indexable)
returns the indexi
for whichindexable[i]
is maximumargmax(f, itr)
returns thex∈itr
which achieves the maximum value off(x)
Both of these are useful and well-defined operations. But they should be considered with respect to Julia's overall consistency: it's worth asking how
argmax(f, itr)
compares to other such functions that accept a "mapping" functionf
as an input. For example,sum(f, itr)
computessum(f(x) for x in itr)
,maximum(f, itr)
computesmaximum(f(x) for x in itr)
, and so on. One simple consequence is that these functions have a very natural property:sum(identity, itr) == sum(itr)
.So, is
argmax(identity, a) == argmax(a)
?Nope. How about "is
argmax(f, itr)
equivalent toargmax(f(x) for x in itr)
"?Again nope: if it did, it would return the index
i
such thata[i] == maximum(f, a)
.Personally, I think this is a footgun. If we ever decide to go to Julia 2.0, I think we should reconsider whether these two methods belong to the same function.
One potential solution is that we could have
argmax(itr)
become synonymous withargmax(identity, itr)
and embrace the newer meaning ofargmax
. Then, noting thatfindmax(f, a)
does act in a manner consistent with other mapping functions, we could just eliminate the original meaning ofargmax
, asargmax
seems to sometimes be used as a substitute forfindmax(A)[2]
. But I'm less fond of having to remember whether the index is the first or second output; hence we could switch to having it return aNamedTuple
(we might be able to do that even now).