Closed flysand7 closed 3 years ago
Perhaps we can have sort(a.age>b.age && a.name>b.name)
or being able to provide a function to write how the sorting should take place?
This is wrong on several levels.
First, a.age>b.age && a.name>b.name
would swap two elements if they are sorted by age and not by name OR they are sorted by name and not by age. This is wrong because if I want to sort alphabetically and within same entries by age, then I get the following situation:
Casey Muratori, 29
Jon Blow, 29
Jon Blow, 40
In this example Casey Muratori, 29
and Jon Blow, 29
are sorted by name, but not by age, implying that the elements should be swapped -- but they shouldn't since this array is already sorted.
Second point, is that if you're proposing this as a syntax for sort
function, then I wouldn't hold you back but consider what happens when people want to use this non-trivial syntax for other purposes. I'm not sure whether this is a good idea.
I propose an alternative solution, which is more flexible. Make sort
take in anonymous templatised functions of a
and b
. In this case it is possible to implement what I said as well as many other non-trivial sorting functions (ex.: alphanumeric sorting of strings (example implementation and details: https://gist.github.com/bumbread/5b0d06171fd7a9cf42f68379b4dd4c97)).
The solution will look like this:
arr.sort(fn(a,b: User)->bool {
// The users are definately sorted if names are
if(a.name != b.name) {
return a.name < b.name;
}
// If the names equal they are sorted if the ages are
return a.age < b.age;
});
You are right, that syntax (&&) would never work. Being able to provide a custom sorting function would be an elegant solution.
There is already a function to sort arrays based on a custom comparator, seen in vlib/builtin/array.v
. Here is an example:
struct User {
pub mut:
age int
typ int
}
fn main() {
mut arr := [
User{10, 1},
User{9, 1},
User{83, 0},
User{99, 0}
]
comparator := fn (a &User, b &User) int {
cond := a.age >= b.age && a.typ <= b.typ
if cond == true { return -1 }
else { return 1 }
}
arr.sort_with_compare(comparator)
println(arr)
}
The comparator I wrote sorts user by age, type. I wanted the lower types to show up first (so when you run this program, users with type 0 will show up first) and then by age, so that the final order is highest age/lowest type.
Here is the output:
[User{
age: 99
typ: 0
}, User{
age: 83
typ: 0
}, User{
age: 10
typ: 1
}, User{
age: 9
typ: 1
}]
Edit: I'm not sure what sort_with_compare expects its comparator to return, apparently -1 when a
comes before b
and 1 when b
comes before a
.
Also, make sure the args to your comparator function are references.
Found something useful. V uses C's stdlib qsort to perform sorting using this custom comparator function from my last comment. Here's some piece of documentation: https://www.cplusplus.com/reference/cstdlib/qsort/
It says: | Return value | Meaning |
---|---|---|
<0 | a comes before b | |
0 | a and b are the same (do not swap them) | |
>0 | b comes before a |
I missed the sort_with_compare function, that was exactly what I was looking for. I will check the docs to see if I missed it, otherwise will update it to make it easier to find.
A little bit off-topic: I actually think that we could still benefit from a stable sorting algorithm if it was the default (and at the same time get one C lib dependency less).
E.g. https://github.com/scandum/quadsort having the same measured performance (or slightly better) as qsort in all potentially problematic cases. It has identical algorithmic properties as quicksort with the only difference that it's stable.
There is also a fixed-size-memory variant https://github.com/scandum/blitsort but it gets slightly slower for large inputs (beginning with tens of thousands of items with default configuration) as a trade off for fixed size memory (quicksort doesn't use fixed size memory but more).
Related existing issue: https://github.com/vlang/v/issues/5201
A little bit off-topic: I actually think that we could still benefit from a stable sorting algorithm if it was the default (and at the same time get one C lib dependency less).
E.g. https://github.com/scandum/quadsort having the same measured performance (or slightly better) as qsort in all potentially problematic cases. It has identical algorithmic properties as quicksort with the only difference that it's stable.
There is also a fixed-size-memory variant https://github.com/scandum/blitsort but it gets slightly slower for large inputs (beginning with tens of thousands of items with default configuration) as a trade off for fixed size memory (quicksort doesn't use fixed size memory but more).
Related existing issue: #5201
@dumblob it's better to let the user choose which algorithm to use:
array.sort(a > b, QUICKSORT)
array.sort(a > b, MERGESORT)
Julia does it: https://docs.julialang.org/en/v1/base/sort/
@dumblob it's better to let the user choose which algorithm to use:
Definitely! Though I'm not concerned with "choice" but with default behavior (I don't think I'd want to write array.sort( a > b, QUICKSORT )
instead of array.sort( a > b )
).
Currently
array.sort()
is unstable, meaning it is unusable for performing this kind of task:A code that would perform this would have been:
However you can observe that if the second sort changes the relative positions of elements with the same age (i.e. the sorting algorithm is unstable), then the task would not be completed using this algorithm.
Current
sort
behaviourI'm trying to sort an array like this:
And the command outputs users with similar age in the wrong order: