sezero / quakespasm

QuakeSpasm -- A modern, cross-platform Quake game engine based on FitzQuake.
https://sourceforge.net/projects/quakespasm/
GNU General Public License v2.0
224 stars 96 forks source link

*Sky_ClipPoly: MAX_CLIP_VERTS* error in `lim_daviddg` map #103

Closed alexey-lysiuk closed 1 month ago

alexey-lysiuk commented 2 months ago

Another limit exceeding issue caused by lim_daviddg map from Liminal Spaces Jam. Originally reported here.

Load the map, and do setpos -256 0 0 in console. It trigger the given fatal error in Sky_ClipPoly(). nump variable is 83 which greater than 62 (MAX_CLIP_VERTS - 2).

Here is bsp file only. It's the same file as in #100.

sezero commented 2 months ago

Even with bumping MAX_CLIP_VERTS to 128, when I noclip'ed around after that setpos, I immediately hit the error with nump==157.

@ericwa : Should we bump that MAX_CLIP_VERTS to 256 which seem to work?

CC'ing @vsonnier : vkquake has the same code in place.

vsonnier commented 1 month ago

Thanks @sezero for the head-up ! (Un)Fortunatly, It seems I cannot reproduce the bug on vkQuake mainly because.. Sky_ClipPoly() is never called, guarded by a lot of outer conditions, to the point I wonder if this function is ever called in practice on any map.

Even if I hack around it to force the trigger of Sky_ClipPoly() I only managed nump <= 5 in @alexey-lysiuk example.

Boosting MAX_CLIP_VERTS seems harmless enough though. In any case, I'll follow QS change.

sezero commented 1 month ago

OK @vsonnier, thanks.

@ericwa, @andrei-drexler: thoughts?

ericwa commented 1 month ago

Raising to 256 seems ok, I will check if there’s a bug on my end in Ericw-tools alpha builds leading to these excessively many-sided faces showing up.

ericwa commented 1 month ago

I will check if there’s a bug on my end in Ericw-tools alpha builds leading to these excessively many-sided faces showing up.

I checked and you would have to specify -maxedges 0 to ericw-tools qbsp to get these faces (default behaviour is to chop faces to 64 sides if they exceed that for some reason)

vsonnier commented 1 month ago

I checked and you would have to specify -maxedges 0 to ericw-tools qbsp to get these faces (default behaviour is to chop faces to 64 sides if they exceed that for some reason)

Why not clamping -maxedges arguments to max = 256 for instance, whatever the input value ? Otherwise, it is just a matter of time before another map exceeed the port hardcoded value by using -maxedges 0

Sorry if that sound silly, I don't know any thing of map building :)

sezero commented 1 month ago

Do we have a better solution that doesn't rape the stack?

ericwa commented 1 month ago

Hunk alloc + FreeToMark

vsonnier commented 1 month ago

Hunk alloc + FreeToMark

Fine, for vkQuake I've done so using dynamic allocation (in fact a combination of _alloca and malloc underneath the macros) : https://github.com/Novum/vkQuake/commit/0314204399ce563d22121dc74dc80c4376341ba9

As I said above on vkQuake Sky_ClipPoly() doesn't seem to be called often, if ever.

But what about pratical performance impact of the other QS using Hunk alloc vs. stack ?

sezero commented 1 month ago

@vsonnier: I tried applying the patch inlined below to qs, based on relevant vkquake commits. The alloca path is good. But the malloc path crashes miserably. (Tested by hard-coding on_heap to 1 or 0.) Will try investigating further, but if you see anything before me, please drop a note here.

EDIT: Patch updated according to the +2 comments in the next reply

gl_sky3.patch.txt

diff --git a/Quake/gl_sky.c b/Quake/gl_sky.c
index a347d97..c9b11f4 100644
--- a/Quake/gl_sky.c
+++ b/Quake/gl_sky.c
@@ -495,14 +495,16 @@ static void Sky_ClipPoly (int nump, vec3_t vecs, int stage)
    float   *v;
    qboolean    front, back;
    float   d, e;
-   float   dists[MAX_CLIP_VERTS];
-   int     sides[MAX_CLIP_VERTS];
-   vec3_t  newv[2][MAX_CLIP_VERTS];
    int     newc[2];
    int     i, j;

-   if (nump > MAX_CLIP_VERTS-2)
-       Sys_Error ("Sky_ClipPoly: MAX_CLIP_VERTS");
+   const int max_clip_verts = nump + 2;
+   const int on_heap = max_clip_verts > MAX_CLIP_VERTS;
+   int *sides;
+   float   *dists;
+   vec3_t  *newv_0;
+   vec3_t  *newv_1;
+
    if (stage == 6) // fully clipped
    {
        Sky_ProjectPoly (nump, vecs);
@@ -511,6 +513,10 @@ static void Sky_ClipPoly (int nump, vec3_t vecs, int stage)

    front = back = false;
    norm = skyclip[stage];
+
+   sides = (int *) (on_heap ? malloc(max_clip_verts * sizeof(int)) : alloca(max_clip_verts * sizeof(int)));
+   dists = (float *) (on_heap ? malloc(max_clip_verts * sizeof(float)) : alloca(max_clip_verts * sizeof(float)));
+
    for (i=0, v = vecs ; i<nump ; i++, v+=3)
    {
        d = DotProduct (v, norm);
@@ -532,6 +538,10 @@ static void Sky_ClipPoly (int nump, vec3_t vecs, int stage)
    if (!front || !back)
    {   // not clipped
        Sky_ClipPoly (nump, vecs, stage+1);
+       if (on_heap) {
+           free(dists);
+           free(sides);
+       }
        return;
    }

@@ -541,22 +551,26 @@ static void Sky_ClipPoly (int nump, vec3_t vecs, int stage)
    VectorCopy (vecs, (vecs+(i*3)) );
    newc[0] = newc[1] = 0;

+   // 2-dim vec3_t  newv[2][MAX_CLIP_VERTS]; as 2 arrays
+   newv_0 = (vec3_t *) (on_heap ? malloc(max_clip_verts * sizeof(vec3_t)) : alloca(max_clip_verts * sizeof(vec3_t)));
+   newv_1 = (vec3_t *) (on_heap ? malloc(max_clip_verts * sizeof(vec3_t)) : alloca(max_clip_verts * sizeof(vec3_t)));
+
    for (i=0, v = vecs ; i<nump ; i++, v+=3)
    {
        switch (sides[i])
        {
        case SIDE_FRONT:
-           VectorCopy (v, newv[0][newc[0]]);
+           VectorCopy (v, newv_0[newc[0]]);
            newc[0]++;
            break;
        case SIDE_BACK:
-           VectorCopy (v, newv[1][newc[1]]);
+           VectorCopy (v, newv_1[newc[1]]);
            newc[1]++;
            break;
        case SIDE_ON:
-           VectorCopy (v, newv[0][newc[0]]);
+           VectorCopy (v, newv_0[newc[0]]);
            newc[0]++;
-           VectorCopy (v, newv[1][newc[1]]);
+           VectorCopy (v, newv_1[newc[1]]);
            newc[1]++;
            break;
        }
@@ -568,16 +582,24 @@ static void Sky_ClipPoly (int nump, vec3_t vecs, int stage)
        for (j=0 ; j<3 ; j++)
        {
            e = v[j] + d*(v[j+3] - v[j]);
-           newv[0][newc[0]][j] = e;
-           newv[1][newc[1]][j] = e;
+           newv_0[newc[0]][j] = e;
+           newv_1[newc[1]][j] = e;
        }
        newc[0]++;
        newc[1]++;
    }

    // continue
-   Sky_ClipPoly (newc[0], newv[0][0], stage+1);
-   Sky_ClipPoly (newc[1], newv[1][0], stage+1);
+   Sky_ClipPoly (newc[0], newv_0[0], stage+1);
+   Sky_ClipPoly (newc[1], newv_1[0], stage+1);
+
+   if (on_heap)
+   {
+       free(dists);
+       free(sides);
+       free(newv_0);
+       free(newv_1);
+   }
 }

 /*
@@ -587,9 +609,6 @@ Sky_ProcessPoly
 */
 void Sky_ProcessPoly (glpoly_t *p)
 {
-   int         i;
-   vec3_t      verts[MAX_CLIP_VERTS];
-
    //draw it
    DrawGLPoly(p);
    rs_brushpasses++;
@@ -597,9 +616,20 @@ void Sky_ProcessPoly (glpoly_t *p)
    //update sky bounds
    if (!r_fastsky.value)
    {
-       for (i=0 ; i<p->numverts ; i++)
+       const int max_clip_verts = p->numverts + 2;
+       const int num_verts = p->numverts;
+       const int on_heap = max_clip_verts > MAX_CLIP_VERTS;
+       vec3_t *verts = (vec3_t *) (on_heap ?
+                malloc(max_clip_verts * sizeof(vec3_t)) :
+                alloca(max_clip_verts * sizeof(vec3_t)));
+       int i = 0;
+
+       for ( ; i < num_verts; i++) {
            VectorSubtract (p->verts[i], r_origin, verts[i]);
-       Sky_ClipPoly (p->numverts, verts[0], 0);
+       }
+       Sky_ClipPoly (num_verts, verts[0], 0);
+
+       if (on_heap) free(verts);
    }
 }
sezero commented 1 month ago

OK, as far as I can see, that + 2 needs to be taken into account back in Sky_ProcessPoly, i.e. if I allocate +2 more elements in Sky_ProcessPoly it seems to work smoothly. The relevant change for vkquake should be like the following -- @vsonnier: do you concur?

diff --git a/Quake/gl_sky.c b/Quake/gl_sky.c
index b49c60e..272548b 100644
--- a/Quake/gl_sky.c
+++ b/Quake/gl_sky.c
@@ -681,16 +681,17 @@ void Sky_ProcessPoly (cb_context_t *cbx, glpoly_t *p, float color[3])
    // update sky bounds
    if (need_bounds)
    {
-       const size_t MAX_CLIP_VERTS = p->numverts;
+       const size_t MAX_CLIP_VERTS = p->numverts + 2;
+       const int num_verts = p->numverts;

        TEMP_ALLOC (vec3_t, verts, MAX_CLIP_VERTS);

-       for (size_t i = 0; i < MAX_CLIP_VERTS; i++)
+       for (int i = 0; i < num_verts; i++)
        {
            poly_vert = &p->verts[0][0] + (i * VERTEXSIZE);
            VectorSubtract (poly_vert, r_origin, verts[i]);
        }
-       Sky_ClipPoly (MAX_CLIP_VERTS, verts[0], 0);
+       Sky_ClipPoly (num_verts, verts[0], 0);

        TEMP_FREE (verts);
    } 
vsonnier commented 1 month ago

Thanks @sezero for your thorough investigation. At first glance your change looks inconsistent, why making verts bigger than num_verts while the iteration looks like:

for (size_t i = 0; i < num_verts; i++)
{
...
    VectorSubtract (poly_vert, r_origin, verts[i]);
}

Well the reason is that VectorSubtract is actually a macro expanding to: (thank you Visual Studio)

do {
    (verts[i])[0] = (poly_vert)[0] - (r_origin)[0]; (verts[i])[1] = (poly_vert)[1] - (r_origin)[1]; (verts[i])[2] = (poly_vert)[2] - (r_origin)[2];
} while (0)

So the farther access of verts array is actually (verts[num_elements])[2] which the compiler happily take and interpret as verts[num_elements + 2] apparently ????

My C is not good enough to see how this could be legal or not. Does the compiler even issue a warning ?

And this explains those magical +2 in the rest of Sky_ClipPoly, this is because the usage of Vector-like macros.

sezero commented 1 month ago

The problem appears not in Sky_ProcessPoly, but Sky_ClipPoly where it does VectorCopy (vecs, (vecs+(i*3)) ); which results in a buffer overrun: I noticed that by running qs under valgrind.

sezero commented 1 month ago

I just applied the gl_sky3.patch from above to qs as e35ecad1847de7b2d3382ce180c145015cc1e379

@vsonnier: Waiting for your further comments / replies to apply the vkquake patch and close this.

vsonnier commented 1 month ago

Damn maybe I'm dense, but how that would even work ?

Sky_ClipPoly (int nump, vec3_t vecs, int stage) {

//so `vecs` is just a parameter, passed-by-copy...
//...

for (i = 0, v = vecs; i < nump; i++, v += 3)
{
...
}
// at that point, i = nump ?
// and so then later so we finally have :
//...
VectorCopy (vecs, (vecs + (nump * 3))); // how (vecs + (nump * 3))[n] is even a valid access, for any n ?
sezero commented 1 month ago

The vecs parameter to Sky_ClipPoly is vec3_t and compiler takes that as float *, that's why it works and that's why access in VectorCopy call is not (vecs + (nump * 3))[n] but just (vecs + (nump * 3))

sezero commented 1 month ago

Maybe we should just change Sky_ClipPoly to take vecs as float * instead of vec3_t to avoid confusions?

vsonnier commented 1 month ago

Not fully C-aware humans like me (I'm an Ada dev, imagine my trauma :)) would love this, now an appropriate coment in the code like you just explained to me is probably enough. And the QS code will stay close of the other ports that way. Your choice.

sezero commented 1 month ago

OK, applied the +2 change to vkquake. As for any documentation / other changes or improvements: leaving to you -- I will probably port back to qs.

Closing this as fixed now.

vsonnier commented 1 month ago

This is great, thank you for your help and patience !

sezero commented 1 month ago

This is great, thank you for your help and patience !

Well, my thanks to you: I based the qs patch on tour work