Open: stress-tess opened this issue 1 year ago
Things to consider:
- `coargsort([strings.get_lengths(), strings.to_uint_array()])`
- `segmentedSort`
  - Would it make sense to first group on string length and then sort within the groups? If so, we might want to consider a different sorting algorithm, since we'd likely be sorting much smaller chunks of data than radix sort was built/optimized for.
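To make the "group on length, then sort within the groups" idea concrete, here is a minimal pure-Python sketch. It is not Arkouda code: `length_first_argsort` is a hypothetical name, and Python's built-in `sorted` stands in for whatever per-segment sort would actually be chosen. Note that the resulting order is not lexicographic, which should be fine for groupby since it only needs equal strings to end up adjacent.

```python
from itertools import groupby


def length_first_argsort(strings):
    """Return a permutation that orders strings by (length, bytes).

    Hypothetical helper for illustration only; not part of Arkouda.
    """
    encoded = [s.encode("utf-8") for s in strings]

    # Pass 1: stable argsort on string length alone.
    by_len = sorted(range(len(encoded)), key=lambda i: len(encoded[i]))

    # Pass 2: sort each same-length segment independently. Every segment
    # is typically much smaller than the full array, which is why a
    # different algorithm than radix sort might pay off here.
    perm = []
    for _, segment in groupby(by_len, key=lambda i: len(encoded[i])):
        perm.extend(sorted(segment, key=lambda i: encoded[i]))
    return perm


words = ["pear", "fig", "apple", "kiwi", "fig", "plum"]
perm = length_first_argsort(words)
ordered = [words[i] for i in perm]
print(perm)     # [1, 4, 3, 0, 5, 2]
print(ordered)  # ['fig', 'fig', 'kiwi', 'pear', 'plum', 'apple']
```

Equal strings always have equal lengths, so they land in the same segment and come out adjacent, which is the property a sort-based groupby relies on.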
Overarching issue tracking string groupby performance. The way groupby is implemented relies heavily on `unique` and `LSDRadixSort`.
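For context on why sort cost dominates, a sort-based groupby can be sketched roughly as follows. This is an illustration under assumptions, not the actual implementation: `sort_based_groupby` and its return values are hypothetical names, and `sorted` stands in for the radix sort.

```python
def sort_based_groupby(keys):
    """Group equal keys by sorting, then scanning for boundaries.

    Hypothetical sketch; the built-in sort stands in for LSDRadixSort.
    """
    # Step 1: argsort the keys so equal keys become adjacent.
    perm = sorted(range(len(keys)), key=lambda i: keys[i])

    # Step 2: a new group starts wherever a sorted key differs from
    # its predecessor; record the key and the segment start offset.
    unique_keys, segments = [], []
    for pos, i in enumerate(perm):
        if pos == 0 or keys[i] != keys[perm[pos - 1]]:
            unique_keys.append(keys[i])
            segments.append(pos)
    return perm, unique_keys, segments


keys = ["b", "a", "b", "c", "a"]
perm, unique_keys, segments = sort_based_groupby(keys)
print(perm)         # [1, 4, 0, 2, 3]
print(unique_keys)  # ['a', 'b', 'c']
print(segments)     # [0, 2, 4]
```

The boundary scan is a single linear pass, so in this sketch nearly all of the cost sits in the sort itself, which is where any improvement for strings would have to come from.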
Note: @ronawho is assigned only for his awareness and so he can weigh in on my wild ideas and provide suggestions if he wants to. I don't expect him to write any code for this.