nim-lang / RFCs

A repository for your Nim proposals.
136 stars 26 forks source link

Allow max/min to take a proc or `it` argument #439

Open ckp95 opened 2 years ago

ckp95 commented 2 years ago

Abstract

I propose to allow the max and min procs to take an additional argument, which specifies the criterion by which the comparisons are made. This could either be a single-argument procedure, or an it expression (similar to mapIt and friends).

Motivation

It is fairly common to want to get the maximum and minimum of a sequence of T, but by a different criterion than the usual < defined for T (or where < isn't even defined for T). Examples:

The algorithm module in the standard library already provides an analogous functionality for the sort family of functions: you can provide a cmp argument, or use sortedByIt. I suppose a third option could be to allow max and min to take a cmp function, although personally I find the cmp style to be fiddly in this context.

Other languages (e.g. Python) also allow you to pass a function to max and min.

Description

The max implemention in proc style:

proc max[T, U](x: openArray[T], op: proc(t: T): U): T =
  var currentMax = op(x[0])
  result = x[0]
  for i in 1..high(x):
    let criterion = op(x[i])
    if currentMax < criterion:
      currentMax = criterion
      result = x[i]

In it style:

# untested, adapted from sortedByIt implementation
func max[T](x: open_array[T], cmp: proc(x, y: T): int {.closure.}): T =
  result = x[0]
  for i in 1..high(x):
    if cmp(x[i], result) > 0:
      result = x[i]

template maxIt(x, op: untyped): untyped =
  var result = max(x, proc(x, y: typeof(items(x), typeOfIter)): int =
    var it {.inject.} = x
    let a = op
    it = y
    let b = op
    result = cmp(a, b))
  result

proc or it?

I'm not sure whether the criterion should be specified as a proc, or in it style, or whether to have both coexisting like is done with map and mapIt. Personally I prefer the proc style, but either or both are fine and have precedent. Thoughts?

Examples

Procedure way:

import sugar, sequtils

let raven = @["Once", "upon", "a", "midnight", "dreary"]
echo raven.max(x => x.len) # "midnight"
echo raven.min(x => x.len) # "a"

it way:

echo raven.maxIt(it.len) # "midnight"
echo raven.minIt(it.len) # "a"

Also the two-argument forms (i.e. not on an openArray) would allow for a criterion argument:

max(-10, 4, x => x.abs) # -10
min(-10, 4, x => x.abs) # 4

Other Things

While I'm on the subject, I thought of some other existing procs that might benefit from taking a criterion argument, and some new procs that could be added to sequtils on the same theme.

The maxIndex and minIndex procs in sequtils are the obvious next step (it versions omitted for brevity):

echo raven.maxIndex(x => x.len) # 3
echo raven.minIndex(x => x.len) # 2

(On an unrelated note, could we also add create argmax and argmin as aliases for maxIndex and minIndex? I thought the stdlib didn't have them for ages, until I noticed maxIndex and minIndex by accident while reading the sequtils docs.)

The find function could have variants which take a predicate argument:

echo raven.find(x => x.len < 4) # 2

It also seems useful to have first and last procs in sequtils:

echo raven.first # "Once"
echo raven.first(x => x.len > 4) # "midnight"
echo raven.last # "dreary"
echo raven.last(x => x.len == 4) # "upon"

(The return type should probably be Option[T] because the predicate might never be satisfied.)

Backward Incompatibility

This should not cause any backward-incompatible changes, since you would still be able to use max and min in the same way as before just by not passing in the criterion argument. These would be defined as additional versions of max and min, with different argument signatures, not modifying the existing definitions.

Araq commented 2 years ago

As long as these additions are not done to system.nim, I have no real objections. However, here is an idea: Instead of adding more and more things to sequtils and algorithm, let's have algorithms in their own modules:

sort.nim, map.nim, fold.nim, minmax.nim, ...

(Related to Nim v2.)

RSDuck commented 2 years ago

you can already do

type AbsInt = distinct int
proc `<=`(a, b: AbsInt): bool = abs(int(a)) <= abs(int(b))
echo int(min(AbsInt(-10), AbsInt(-12)))
MichalMarsalek commented 2 years ago

I like this very much. Recently while just playing around with something I did data.sortedByIt(criterion it)[0] (when I needed just data.minByIt(criterion it)) because I was lazy.

mratsim commented 2 years ago

maxIt shouldn't create a closure, it should inline everything, that's the main draw of mapIt/foldIt and friends.

Regarding Araq proposal I would classify like this:

yojiyama7 commented 1 year ago

I want this.

ckp95 commented 1 year ago

I forgot I made this RFC because I haven't used Nim for a while. But I see that Nim 2 is out now so I'll take another look at it and try to make a PR for these functions soon(tm).

you can already do

type AbsInt = distinct int
proc `<=`(a, b: AbsInt): bool = abs(int(a)) <= abs(int(b))
echo int(min(AbsInt(-10), AbsInt(-12)))

Yeah I guess, but having to define a new type and a < proc is a hassle. It's overkill when all I usually want is to make an ad-hoc / one-off criterion. Why do it in three lines when you could do it in one?