scandum / fluxsort

A fast branchless stable quicksort / mergesort hybrid that is highly adaptive.
The Unlicense
697 stars 23 forks source link

Overflow in `cross_merge` #10

Closed SaculRennorb closed 3 weeks ago

SaculRennorb commented 3 weeks ago

The source was compiled using MSCV 19.39.33523.

While integrating this library into a new project I might have found an oob in the last loop of cross_merge:

https://github.com/scandum/fluxsort/blob/93aac374c1bec5265efd3b8a593a0c2246473ea3/src/quadsort.c#L512-L515 called from https://github.com/scandum/fluxsort/blob/93aac374c1bec5265efd3b8a593a0c2246473ea3/src/fluxsort.c#L205

With specific, large input arrays (492052 elements in the repro case) this overflows the destination buffer of len(swap) - half1 = 492052 - 246026 = 246026, and by quite a bit I might add:

image

I have a bunch of tests that work perfectly fine, including ones that just try random sequences of numbers, but this specific input fails. It also doesnt seem to ony be a size issue, because other tests with 230k elements work fine. I was not able to reduce the crash input, but maybe something like quickcheck could find a more minimal case.

I purposefully did not use the def'ed version of cmp because the real usecase is more complex.

Unfortunately I do not understand the implementation well enough to point at why this happens, but i will provide the almost 500k inputs for reproduction:

#define FUNC(f) f
#define VAR long long

typedef int CMPFUNC (VAR* a, VAR* b);

#include "sort.c" // this is just quadsort and fluxsort glued together

#include <stdio.h>

int cmp(VAR* a, VAR* b) { return *a - *b; }

int verify(VAR* array, int len)
{
    for(int i = 0; i < len - 1; i++)
    {
        if(cmp(array + i, array + i + 1) > 0) {
            printf("Sorting breaks at index %d (%lld > %lld)", i, array[i], array[i + 1]);
            return 1;
        }
    }

    printf("Everything sorted properly.");
    return 0;
}

long long data[] = {
 // zip input.txt content here ------------------------------------------------------------------------------------
};
int main()
{
    fluxsort(data, sizeof(data) / sizeof(data[0]), cmp);

    return verify(data, sizeof(data) / sizeof(data[0]));
}

Neither pastebin nor gh allow 3.5 meg of text, so i attached the input as a zipped text file. And while I was at it I zipped the whole test project aswell, in case that is of any use.

tests.zip

scandum commented 3 weeks ago

Hi,

Before looking deeper into this, the problem is likely with the comparison function:

int cmp(VAR* a, VAR* b) { return *a - *b; }

This should be:

int cmp(VAR* a, VAR* b) { return *a > *b; }

This is still not a proper comparison function though, and you'd want to use something like:

int cmp(const void * a, const void * b)
{
      const int l = *(const int *)a;
      const int r = *(const int *)b;

      return l > r;
//   return (l > r) - (l < r);
}

To use the comparison function with qsort() you need to use return (l > r) - (l < r); which is a bit slower. Using l - r is only valid if all integers being compared are positive.

SaculRennorb commented 3 weeks ago

Thank you for the quick reply.

And yea, even though I was looking into it for days now I just found the same problem almost immediately after posting the issue. I was trying to figure out why this would even be a problem, since you compare the return value of cmp everywhere it is called, but I just realized that the downcast from 64bit to 32bit during the return can turn negative 64bit numbers into positive 32bit ones...

The repro case works with a proper comparison function, ill just have to sort that out. Thank you for the rubber ducky ~ (and for the solution ofcourse)