Morwenn / cpp-sort

Sorting algorithms & related tools for C++14
MIT License
617 stars 57 forks source link

Timsort: incorrect minRunLength() computing minRun to be too small #172

Closed CovetingEpiphany2152 closed 3 years ago

CovetingEpiphany2152 commented 3 years ago

https://github.com/Morwenn/cpp-sort/blob/32b874fb025a6e8f86f1db568cd38682cb9f62d9/include/cpp-sort/detail/timsort.h#L192

This line has the same issue as reported in your Timsort repo here.

Morwenn commented 3 years ago

Yeah, I planned to change it here too. I just haven't had the time to take care of it yet (I'm running the tests and benchmarks to check whether they can find any difference).

CovetingEpiphany2152 commented 3 years ago

Thanks. I would expect more insertion sorts being done to hit the minRunLength that is now usually longer (for cases with short runs like random data). As such, worst and random cases should have improvements. Not sure how significant it would be though.

Morwenn commented 3 years ago

My benchmarks didn't show much improvement. I wouldn't be surprised if it made the a bit slower because most of the library has a limit around 32 for insertion sort which came from benchmarking, but the benchmarks almost exclusively use scalar types, which is not representative of the Python-specific tradeoffs timsort was originally designed for.