godotengine / godot

Godot Engine – Multi-platform 2D and 3D game engine
https://godotengine.org
MIT License
89.44k stars 20.25k forks source link

Sorting Variant arrays is not stable #58878

Open miv391 opened 2 years ago

miv391 commented 2 years ago

Godot version

v4.0.alpha3.official [256069eaf]

System information

Windows 10

Issue description

Documentation does not forbid using sort method with Variant arrays. Variant arrays can be sorted, but the results are random and no error messages are shown.

As the Variants cannot be sorted in a meaningful way, I think this situation should be considered as an error and the array should not be changed at all.

I did not found out why sometimes Variant array is changed and sometimes it is not.

Output of test code: kuva No errors: kuva

Steps to reproduce

func test():
    # 1) array is changed, but not really sorted
    var arr1 = ["5", 4, 7, "3", 9]
    print(arr1)
    arr1.sort()
    print(arr1)

    # 2) array is not changed at all
    var arr2 = [{"foo": 5, "bar": 2}, 6, {"foo": 4, "bar": 2}, "hello"]
    print(arr2)
    arr2.sort()
    print(arr2)

Minimal reproduction project

No response

aaronfranke commented 2 years ago

For the first one, it appears to be placing all the strings first, then sorting the rest. So it is sorting, type-major.

For the second one, it's strange that the dictionaries aren't next to each other, so indeed it doesn't look like it's doing any sorting.

miv391 commented 2 years ago

For the first one, it appears to be placing all the strings first, then sorting the rest. So it is sorting, type-major.

It just appears to sort by type with that test data. Changing the values of the strings the behaviour changes.

    var arr3 = ["111", 4, 7, "112", 9]
    print(arr3)
    arr3.sort()
    print(arr3)

Now the array is not changed at all:

[111, 4, 7, 112, 9]
[111, 4, 7, 112, 9]
Sauermann commented 2 years ago

It looks like they are sorted according to their internal C++ pointer, which is not intuitive for users. https://github.com/godotengine/godot/blob/f488a841c7225a4d80abbc04c74079fbbac01c77/core/variant/variant_op.h#L607-L615

What about the following sort order:

  1. Variants of different types, that provide no comparison operator, are grouped according to the Variant::Type enum
  2. Variants of types, that provide a comparison operator, are sorted according to their < operator (this could get tricky, if different Variants, that have a comparison operator, are not neighbors in Variant::Type)

Or

  1. Variants of different types are grouped according to the Variant::Type enum
  2. Variants of the same type are sorted according to their < operator
miv391 commented 2 years ago

In my opinion there is no need for complicated Variant sorting algorithm. If you really want to sort an array with multiple kinds of data, it is extremely unlikely that the default sorting algorithm would be just right for your needs. My guess is that almost all cases where a programmer tries to sort a Variant array are mistakes and should give an error. For the same reason a Variant array should not be changed at all, because then the programmer might wrongly assume that sorting actually works.

If the programmer really wants to sort a Variant array using her own algorithm, sort_custom is the right tool for that.

novaplusplus commented 2 years ago

My guess is that almost all cases where a programmer tries to sort a Variant array are mistakes and should give an error.

Agreed. IMO this is definitely a case where one should be more explicit in the intent of the code, because of the potential for confusion on both the part of the programmer and engine is high in this situation

goatchurchprime commented 2 years ago

Don't forget this long running issue (which has just caught me out badly today again) https://github.com/godotengine/godot/issues/39299 That is about sorting arrays of arrays of numbers, which are compatible variants.

dalexeev commented 1 year ago

I ran into this issue while working on #78742. This is a problem in particular when using binary search on Variant arrays. The reason is here:

https://github.com/godotengine/godot/blob/cdd2313ba27d0a2600a18e849b4c5d1fd6a6e351/core/variant/array.cpp#L610-L613

https://github.com/godotengine/godot/blob/cdd2313ba27d0a2600a18e849b4c5d1fd6a6e351/core/variant/array.cpp#L635-L640

https://github.com/godotengine/godot/blob/cdd2313ba27d0a2600a18e849b4c5d1fd6a6e351/core/variant/array.cpp#L598-L608

In case two variants are incomparable using Variant::OP_LESS, we can compare their type:

struct _ArrayVariantSort {
    _FORCE_INLINE_ bool operator()(const Variant &p_l, const Variant &p_r) const {
        bool valid = false;
        Variant res;
        Variant::evaluate(Variant::OP_LESS, p_l, p_r, res, valid);
        if (!valid) {
            if (p_l.get_type() != p_r.get_type()) {
                return p_l.get_type() < p_r.get_type();
            }
            ERR_PRINT(vformat("Failed to compare %s and %s.", p_l, p_r));
            res = false;
        }
        return res;
    }
};

But there are cases where variants have the same type but are incomparable, such as objects. Can we sort them by hash?

LinqLover commented 1 year ago

Same for an array of only StringNames.

LiterallyAClown commented 2 weeks ago

In my opinion there is no need for complicated Variant sorting algorithm. If you really want to sort an array with multiple kinds of data, it is extremely unlikely that the default sorting algorithm would be just right for your needs. My guess is that almost all cases where a programmer tries to sort a Variant array are mistakes and should give an error. For the same reason a Variant array should not be changed at all, because then the programmer might wrongly assume that sorting actually works.

If the programmer really wants to sort a Variant array using her own algorithm, sort_custom is the right tool for that.

It's not a matter of having the "right" sorting order, it's a matter of having a deterministic sorting order. I just got bitten by this where I wanted to sort the keys from a dictionary to process them in a deterministic order (no matter which one) and sort gave different results every time, and I had to use sort_custom with a predicate that just returned a < b for it to work.

It's extremely counter-intuitive that sort() doesn't sort things in any deterministic order in some cases when all the values in the array are distinct.

AThousandShips commented 2 weeks ago

Fair enough but how would you accomplish that with these cases? How do you deterministically sort something that doesn't have anything deterministic to sort by? Like a raw Object doesn't have anything deterministic to sort by for individual instances.

LiterallyAClown commented 2 weeks ago

Same for an array of only StringNames.

It turns out that StringNames are compared by in memory addresses, so they can't be sorted deterministically.

AThousandShips commented 2 weeks ago

That's massively more efficient, so offering an option to sort differently would be the way to go IMO, otherwise it's a significant performance drain

LiterallyAClown commented 2 weeks ago

Fair enough but how would you accomplish that with these cases? How do you deterministically sort something that doesn't have anything deterministic to sort by? Like a raw Object doesn't have anything deterministic to sort by for individual instances.

I don't know, I'm not really used to non typed languages. Perhaps it's just one more of the many pitfalls that non typed languages create in the guise of being "easy to use".

AThousandShips commented 2 weeks ago

That's not unique to this, it's universal with cases that don't have that kind of property, it's not a choice, it's a fundamental limit, how would a typed language solve this? You would need to provide something to sort by

Now the problem of sorting different types is a separate thing, that's important to navigate and fix, but the determinism is essentially unavoidable without adding code to provide specific custom sorting