intel / x86-simd-sort

C++ template library for high performance SIMD based sorting algorithms
BSD 3-Clause "New" or "Revised" License
855 stars 57 forks source link

Use uint32_t instead of size_t for object sort #126

Closed r-devulap closed 9 months ago

r-devulap commented 9 months ago

Improves performance especially when the metric to sort is a 32-bit dtype:

Benchmark                                                                            Time             CPU      Time Old      Time New       CPU Old       CPU New
-----------------------------------------------------------------------------------------------------------------------------------------------------------------
[scalarobjsort.*float vs. simdobjsort.*float],x>>/100                             +0.0933         +0.0947          1419          1551          1421          1555
[scalarobjsort.*float vs. simdobjsort.*float],x>>/1000                            -0.3962         -0.3965         15418          9309         15431          9313
[scalarobjsort.*float vs. simdobjsort.*float],x>>/10000                           -0.8120         -0.8119        576470        108403        576505        108422
[scalarobjsort.*float vs. simdobjsort.*float],x>>/100000                          -0.7364         -0.7364       7084507       1867259       7084966       1867333
[scalarobjsort.*float vs. simdobjsort.*float],x>>/1000000                         -0.5463         -0.5463      86243039      39130817      86238537      39128228
[scalarobjsort.*float vs. simdobjsort.*float],x>>/10000000                        +0.0628         +0.0628    1018919772    1082899576    1018866896    1082838013
[scalarobjsort.*float vs. simdobjsort.*float],taxicab>>/100                       -0.2089         -0.2088          2028          1604          2031          1607
[scalarobjsort.*float vs. simdobjsort.*float],taxicab>>/1000                      -0.6529         -0.6529         28076          9744         28085          9749
[scalarobjsort.*float vs. simdobjsort.*float],taxicab>>/10000                     -0.8666         -0.8666        852263        113676        852319        113696
[scalarobjsort.*float vs. simdobjsort.*float],taxicab>>/100000                    -0.8211         -0.8211      10474638       1873979      10474826       1873929
[scalarobjsort.*float vs. simdobjsort.*float],taxicab>>/1000000                   -0.6921         -0.6921     126190844      38854176     126185333      38851333
[scalarobjsort.*float vs. simdobjsort.*float],taxicab>>/10000000                  -0.2535         -0.2535    1460305765    1090189561    1460270229    1090125905
[scalarobjsort.*float vs. simdobjsort.*float],euclidean>>/100                     -0.3014         -0.3004          2273          1588          2275          1591
[scalarobjsort.*float vs. simdobjsort.*float],euclidean>>/1000                    -0.7477         -0.7476         36658          9247         36672          9254
[scalarobjsort.*float vs. simdobjsort.*float],euclidean>>/10000                   -0.8951         -0.8951       1073122        112611       1073198        112625
[scalarobjsort.*float vs. simdobjsort.*float],euclidean>>/100000                  -0.8582         -0.8581      13320127       1889451      13320036       1889454
[scalarobjsort.*float vs. simdobjsort.*float],euclidean>>/1000000                 -0.7599         -0.7599     162466510      39011741     162464089      39008496
[scalarobjsort.*float vs. simdobjsort.*float],euclidean>>/10000000                -0.4183         -0.4183    1865672145    1085333325    1865605675    1085226497
[scalarobjsort.*float vs. simdobjsort.*float],chebyshev>>/100                     -0.1406         -0.1400          1822          1566          1825          1570
[scalarobjsort.*float vs. simdobjsort.*float],chebyshev>>/1000                    -0.6403         -0.6403         25983          9346         25999          9352
[scalarobjsort.*float vs. simdobjsort.*float],chebyshev>>/10000                   -0.8596         -0.8596        808734        113518        808769        113526
[scalarobjsort.*float vs. simdobjsort.*float],chebyshev>>/100000                  -0.8172         -0.8172      10281025       1879824      10281239       1879870
[scalarobjsort.*float vs. simdobjsort.*float],chebyshev>>/1000000                 -0.6841         -0.6841     122766074      38785146     122764627      38783435
[scalarobjsort.*float vs. simdobjsort.*float],chebyshev>>/10000000                -0.2506         -0.2507    1441875046    1080508739    1441843964    1080441164
OVERALL_GEOMEAN                                                                   -0.6447         -0.6447             0             0             0             0