As described, counting sort is not an in-place algorithm; even disregarding the count array, it needs separate input and output arrays. It is possible to modify the algorithm so that it places the items into sorted order within the same array that was given to it as the input, using only the count array as auxiliary storage; however, the modified in-place version of counting sort is not stable.[3]
I do have an in-place implementation for uint32_t values; I just don't know whether the same is possible for generic values.
I'm not sure this is actually possible, but Wikipedia says it is:
https://en.wikipedia.org/wiki/Counting_sort
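For reference, here is a minimal sketch of how the in-place variant from the Wikipedia paragraph could look for generic values, assuming each item maps to a small integer key. The names `counting_sort_in_place`, `num_keys`, and `key_of` are made up for illustration and are not from any library. It uses two arrays of bucket offsets (both sized by the key range, not by the input) as the only auxiliary storage, swaps each item into its bucket, and is not stable, which matches what the article says.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Sketch: in-place, unstable counting sort for generic items.
// Assumption: key_of(item) returns an integer in [0, num_keys).
template <typename T, typename KeyFn>
void counting_sort_in_place(std::vector<T>& items, std::size_t num_keys, KeyFn key_of) {
    // start[k] is the first index of bucket k; start[num_keys] == items.size().
    std::vector<std::size_t> start(num_keys + 1, 0);
    for (const T& item : items) {
        std::size_t k = key_of(item);
        assert(k < num_keys);  // keys outside the range would corrupt the offsets
        ++start[k + 1];
    }
    for (std::size_t k = 0; k < num_keys; ++k) {
        start[k + 1] += start[k];  // prefix sum turns counts into offsets
    }

    // next[k] is the first slot of bucket k not yet known to hold a key-k item.
    std::vector<std::size_t> next(start.begin(), start.end() - 1);

    for (std::size_t k = 0; k < num_keys; ++k) {
        while (next[k] < start[k + 1]) {
            std::size_t dest = key_of(items[next[k]]);
            if (dest == k) {
                ++next[k];  // already in the right bucket
            } else {
                // Move the item to the next free slot of its own bucket and
                // re-examine whatever was swapped back into this position.
                std::swap(items[next[k]], items[next[dest]]);
                ++next[dest];
            }
        }
    }
}
```

The only requirement on `T` is that it is swappable, so this should work for structs or pairs as long as a key can be extracted. Since items with equal keys can be reordered by the swaps, it would not be suitable as the inner pass of an LSD radix sort.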