Open cslr opened 1 week ago
Is this radix sort based on SIMD? Do you have any performance data to compare against AVX512/AVX2 qsort?
The radix sort implementation for floating point numbers is in my Dinrhiw2 library (src/tst/conv_test.cpp). It is written in C++ so there is no SIMD/AVX512/AVX2 implementation. Code transforms first floating point numbers to integers which keep ordering of floats, radix sorts integers and then converts integers back to floating point numbers. I have compared execution times to C's qsort and code outperforms qsort when the number of floats is quite large.
Conversion code and radix sort should be rather straightforward to implement in SIMD.
Conversion code and radix sort should be rather straightforward to implement in SIMD.
SIMD qsort is an order of magnitude faster than C's qsort. I would be curious to know how a SIMD radix sort would compare with SIMD qsort. I would be open to including a SIMD version of radix sort, provided it has the performance gains.
I have implemented radix sorting for floating point numbers (float and double) which is O(N) instead of O(N*log(N)) (qsort).
Are you interested? It is faster than qsort with large values of N. (last time I tested, N was something like 10^6 if I remember correctly)