Raku / problem-solving

🦋 Problem Solving, a repo for handling problems that require review, deliberation and possibly debate
Artistic License 2.0
70 stars 16 forks source link

Provide `sort` with `:k` adverb (returns ordinal index) #369

Closed jubilatious1 closed 1 year ago

jubilatious1 commented 1 year ago

Similar to issue posted by @0rir (and comment by @0racle):

Provide sort with :k adverb (returns ordinal index)

Sometimes you don't need the values of a data structure returned in correct sort order--sometimes you just need the correct order (ordinal index) of those elements.

In the R-programming language, you get two different functions to accomplish these tasks, sort and order:

https://web.mit.edu/~r/current/lib/R/library/base/html/sort.html

https://web.mit.edu/~r/current/lib/R/library/base/html/order.html

Thx.


Putting my hand up as someone who has wanted min/max with :k.

The adverbs named args should be similar to first, ie. :v just returns the value (default), :k returns the index, :kv returns a Seq of the index and value, and :p returns a Pair of the index and value.

Originally posted by @0racle in https://github.com/Raku/problem-solving/issues/367#issuecomment-1499720683

jubilatious1 commented 1 year ago

Sampling/ordering code from the R-programming language, within the R-REPL:

> samp1 <- sample(LETTERS[1:10], 10)
> samp1
 [1] "C" "B" "G" "D" "A" "J" "I" "H" "E" "F"
> sort(samp1)
 [1] "A" "B" "C" "D" "E" "F" "G" "H" "I" "J"
> order(samp1)
 [1]  5  2  1  4  9 10  3  8  7  6
> samp1[order(samp1)]
 [1] "A" "B" "C" "D" "E" "F" "G" "H" "I" "J"
> 

Above, taking the order() of the samp1 vector with the code order(samp1):


Meaning you can reorder the samp1 vector ( or any other vector of identical length) based upon the index-output of order(samp1) as above, and the identity below holds:

> sort(samp1) == samp1[order(samp1)]
 [1] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
> all(sort(samp1) == samp1[order(samp1)])
[1] TRUE

https://web.mit.edu/~r/current/lib/R/library/base/html/sort.html

https://web.mit.edu/~r/current/lib/R/library/base/html/order.html

0racle commented 1 year ago

Since I was quoted, I also think this feature would be useful. Notably, APL - which Larry was influenced by - does not have a Sort primitive, but it does have a Grade primitive that does exactly this

      ⍋ 3 4 6 1 9 10 8 7 2 5
4 9 1 2 10 3 8 7 5 6

So in APL (and it's decedents) Grade is seen as a more fundamental primitive, from which you can derive Sort. All this is to say that this concept of Grade (or Order) is useful in other algorithms (not just sort), and while it's possible (*.pairs.sort(*.value)».key is one way), having a simple :k option would be nice.

Of course, unlike APL and R, Raku's indices would be 0-based, such that @xs.sort ≡ @xs[@xs.sort(:k)].

lizmat commented 1 year ago

FWIW, if a sort is done with a Callable with a single positional argument, the list with indices basically already exists internally as part of the Schwartzian transform. So it looks to me that support for :k, :kv, :p could be easily derived by forcing a Schwarrtzian transform, and post-processing the indices list.

jubilatious1 commented 1 year ago

https://unix.stackexchange.com/a/752119/227738

jubilatious1 commented 1 year ago

Reposting this comment lower down in the thread (deleting above). New (edited) comment above is (hopefully) more readable.

Sampling/ordering code from the R-programming language, within the R-REPL:

> samp1 <- sample(1:10, 10)
> samp1
 [1]  3  4  6  1  9 10  8  7  2  5
> order(samp1)
 [1]  4  9  1  2 10  3  8  7  5  6
> 

Above, taking the order() of the samp1 vector with the code order(samp1):

  • Returns 4 in the first position, telling you that the lowest numeric value in samp1 is in one-indexed position 4 of the original samp1 vector. (In this case, that value equals 1).

  • Returns 9 in the second position, telling you that the second-lowest numeric value in samp1 is in one-indexed position 9 of the original samp1 vector. (In this case, that value equals 2).

  • Returns 1 in the third position, telling you that the third-lowest numeric value in samp1 is in one-indexed position 1 of the original samp1 vector. (In this case, that value equals 3).

  • Returns 2 in the fourth position, telling you that the fourth-lowest numeric value in samp1 is in one-indexed position 2 of the original samp1 vector. (In this case, that value equals 4).

  • Returns 10 in the fifth position, telling you that the fifth-lowest numeric value in samp1 is in one-indexed position 10 of the original samp1 vector. (In this case, that value equals 5).

  • Returns 3 in the sixth position, telling you that the sixth-lowest numeric value in samp1 is in one-indexed position 3 of the original samp1 vector. (In this case, that value equals 6).

  • Returns 8 in the seventh position, telling you that the seventh-lowest numeric value in samp1 is in one-indexed position 8 of the original samp1 vector. (In this case, that value equals 7).

  • Returns 7 in the eighth position, telling you that the eighth-lowest numeric value in samp1 is in one-indexed position 7 of the original samp1 vector. (In this case, that value equals 8).

  • Returns 5 in the ninth position, telling you that the ninth-lowest numeric value in samp1 is in one-indexed position 5 of the original samp1 vector. (In this case, that value equals 9).

  • Returns 6 in the tenth position, telling you that the tenth-lowest numeric value in samp1 is in one-indexed position 6 of the original samp1 vector. (In this case, that value equals 10).


Meaning you can reorder the samp1 vector ( or any other vector of identical length) based upon the index-output of order(samp1):

> samp1[order(samp1)]
 [1]  1  2  3  4  5  6  7  8  9 10
> 

https://web.mit.edu/~r/current/lib/R/library/base/html/sort.html

https://web.mit.edu/~r/current/lib/R/library/base/html/order.html

lizmat commented 1 year ago

https://github.com/rakudo/rakudo/commit/53edf4e425 implements List.sort(:k)