Closed alexey-lysiuk closed 4 months ago
Even with bumping MAX_CLIP_VERTS
to 128, when I noclip'ed around after
that setpos, I immediately hit the error with nump==157
.
@ericwa : Should we bump that MAX_CLIP_VERTS
to 256 which seem to work?
CC'ing @vsonnier : vkquake has the same code in place.
Thanks @sezero for the head-up ! (Un)Fortunatly, It seems I cannot reproduce the bug on vkQuake mainly because.. Sky_ClipPoly()
is never called, guarded by a lot of outer conditions, to the point I wonder if this function is ever called in practice on any map.
Even if I hack around it to force the trigger of Sky_ClipPoly()
I only managed nump <= 5
in @alexey-lysiuk example.
Boosting MAX_CLIP_VERTS
seems harmless enough though. In any case, I'll follow QS change.
OK @vsonnier, thanks.
@ericwa, @andrei-drexler: thoughts?
Raising to 256 seems ok, I will check if there’s a bug on my end in Ericw-tools alpha builds leading to these excessively many-sided faces showing up.
I will check if there’s a bug on my end in Ericw-tools alpha builds leading to these excessively many-sided faces showing up.
I checked and you would have to specify -maxedges 0
to ericw-tools qbsp to get these faces (default behaviour is to chop faces to 64 sides if they exceed that for some reason)
I checked and you would have to specify -maxedges 0 to ericw-tools qbsp to get these faces (default behaviour is to chop faces to 64 sides if they exceed that for some reason)
Why not clamping -maxedges
arguments to max = 256 for instance, whatever the input value ? Otherwise, it is just a matter of time before another map exceeed the port hardcoded value by using -maxedges 0
Sorry if that sound silly, I don't know any thing of map building :)
Do we have a better solution that doesn't rape the stack?
Hunk alloc + FreeToMark
Hunk alloc + FreeToMark
Fine, for vkQuake I've done so using dynamic allocation (in fact a combination of _alloca and malloc underneath the macros) : https://github.com/Novum/vkQuake/commit/0314204399ce563d22121dc74dc80c4376341ba9
As I said above on vkQuake Sky_ClipPoly() doesn't seem to be called often, if ever.
But what about pratical performance impact of the other QS using Hunk alloc vs. stack ?
@vsonnier: I tried applying the patch inlined below to qs, based on relevant vkquake commits. The alloca
path is good. But the malloc
path crashes miserably. (Tested by hard-coding on_heap
to 1 or 0.) Will try investigating further, but if you see anything before me, please drop a note here.
EDIT: Patch updated according to the +2
comments in the next reply
diff --git a/Quake/gl_sky.c b/Quake/gl_sky.c
index a347d97..c9b11f4 100644
--- a/Quake/gl_sky.c
+++ b/Quake/gl_sky.c
@@ -495,14 +495,16 @@ static void Sky_ClipPoly (int nump, vec3_t vecs, int stage)
float *v;
qboolean front, back;
float d, e;
- float dists[MAX_CLIP_VERTS];
- int sides[MAX_CLIP_VERTS];
- vec3_t newv[2][MAX_CLIP_VERTS];
int newc[2];
int i, j;
- if (nump > MAX_CLIP_VERTS-2)
- Sys_Error ("Sky_ClipPoly: MAX_CLIP_VERTS");
+ const int max_clip_verts = nump + 2;
+ const int on_heap = max_clip_verts > MAX_CLIP_VERTS;
+ int *sides;
+ float *dists;
+ vec3_t *newv_0;
+ vec3_t *newv_1;
+
if (stage == 6) // fully clipped
{
Sky_ProjectPoly (nump, vecs);
@@ -511,6 +513,10 @@ static void Sky_ClipPoly (int nump, vec3_t vecs, int stage)
front = back = false;
norm = skyclip[stage];
+
+ sides = (int *) (on_heap ? malloc(max_clip_verts * sizeof(int)) : alloca(max_clip_verts * sizeof(int)));
+ dists = (float *) (on_heap ? malloc(max_clip_verts * sizeof(float)) : alloca(max_clip_verts * sizeof(float)));
+
for (i=0, v = vecs ; i<nump ; i++, v+=3)
{
d = DotProduct (v, norm);
@@ -532,6 +538,10 @@ static void Sky_ClipPoly (int nump, vec3_t vecs, int stage)
if (!front || !back)
{ // not clipped
Sky_ClipPoly (nump, vecs, stage+1);
+ if (on_heap) {
+ free(dists);
+ free(sides);
+ }
return;
}
@@ -541,22 +551,26 @@ static void Sky_ClipPoly (int nump, vec3_t vecs, int stage)
VectorCopy (vecs, (vecs+(i*3)) );
newc[0] = newc[1] = 0;
+ // 2-dim vec3_t newv[2][MAX_CLIP_VERTS]; as 2 arrays
+ newv_0 = (vec3_t *) (on_heap ? malloc(max_clip_verts * sizeof(vec3_t)) : alloca(max_clip_verts * sizeof(vec3_t)));
+ newv_1 = (vec3_t *) (on_heap ? malloc(max_clip_verts * sizeof(vec3_t)) : alloca(max_clip_verts * sizeof(vec3_t)));
+
for (i=0, v = vecs ; i<nump ; i++, v+=3)
{
switch (sides[i])
{
case SIDE_FRONT:
- VectorCopy (v, newv[0][newc[0]]);
+ VectorCopy (v, newv_0[newc[0]]);
newc[0]++;
break;
case SIDE_BACK:
- VectorCopy (v, newv[1][newc[1]]);
+ VectorCopy (v, newv_1[newc[1]]);
newc[1]++;
break;
case SIDE_ON:
- VectorCopy (v, newv[0][newc[0]]);
+ VectorCopy (v, newv_0[newc[0]]);
newc[0]++;
- VectorCopy (v, newv[1][newc[1]]);
+ VectorCopy (v, newv_1[newc[1]]);
newc[1]++;
break;
}
@@ -568,16 +582,24 @@ static void Sky_ClipPoly (int nump, vec3_t vecs, int stage)
for (j=0 ; j<3 ; j++)
{
e = v[j] + d*(v[j+3] - v[j]);
- newv[0][newc[0]][j] = e;
- newv[1][newc[1]][j] = e;
+ newv_0[newc[0]][j] = e;
+ newv_1[newc[1]][j] = e;
}
newc[0]++;
newc[1]++;
}
// continue
- Sky_ClipPoly (newc[0], newv[0][0], stage+1);
- Sky_ClipPoly (newc[1], newv[1][0], stage+1);
+ Sky_ClipPoly (newc[0], newv_0[0], stage+1);
+ Sky_ClipPoly (newc[1], newv_1[0], stage+1);
+
+ if (on_heap)
+ {
+ free(dists);
+ free(sides);
+ free(newv_0);
+ free(newv_1);
+ }
}
/*
@@ -587,9 +609,6 @@ Sky_ProcessPoly
*/
void Sky_ProcessPoly (glpoly_t *p)
{
- int i;
- vec3_t verts[MAX_CLIP_VERTS];
-
//draw it
DrawGLPoly(p);
rs_brushpasses++;
@@ -597,9 +616,20 @@ void Sky_ProcessPoly (glpoly_t *p)
//update sky bounds
if (!r_fastsky.value)
{
- for (i=0 ; i<p->numverts ; i++)
+ const int max_clip_verts = p->numverts + 2;
+ const int num_verts = p->numverts;
+ const int on_heap = max_clip_verts > MAX_CLIP_VERTS;
+ vec3_t *verts = (vec3_t *) (on_heap ?
+ malloc(max_clip_verts * sizeof(vec3_t)) :
+ alloca(max_clip_verts * sizeof(vec3_t)));
+ int i = 0;
+
+ for ( ; i < num_verts; i++) {
VectorSubtract (p->verts[i], r_origin, verts[i]);
- Sky_ClipPoly (p->numverts, verts[0], 0);
+ }
+ Sky_ClipPoly (num_verts, verts[0], 0);
+
+ if (on_heap) free(verts);
}
}
OK, as far as I can see, that + 2
needs to be taken into account back in Sky_ProcessPoly
, i.e. if I allocate +2 more elements in Sky_ProcessPoly
it seems to work smoothly. The relevant change for vkquake should be like the following -- @vsonnier: do you concur?
diff --git a/Quake/gl_sky.c b/Quake/gl_sky.c
index b49c60e..272548b 100644
--- a/Quake/gl_sky.c
+++ b/Quake/gl_sky.c
@@ -681,16 +681,17 @@ void Sky_ProcessPoly (cb_context_t *cbx, glpoly_t *p, float color[3])
// update sky bounds
if (need_bounds)
{
- const size_t MAX_CLIP_VERTS = p->numverts;
+ const size_t MAX_CLIP_VERTS = p->numverts + 2;
+ const int num_verts = p->numverts;
TEMP_ALLOC (vec3_t, verts, MAX_CLIP_VERTS);
- for (size_t i = 0; i < MAX_CLIP_VERTS; i++)
+ for (int i = 0; i < num_verts; i++)
{
poly_vert = &p->verts[0][0] + (i * VERTEXSIZE);
VectorSubtract (poly_vert, r_origin, verts[i]);
}
- Sky_ClipPoly (MAX_CLIP_VERTS, verts[0], 0);
+ Sky_ClipPoly (num_verts, verts[0], 0);
TEMP_FREE (verts);
}
Thanks @sezero for your thorough investigation.
At first glance your change looks inconsistent, why making verts
bigger than num_verts
while the iteration looks like:
for (size_t i = 0; i < num_verts; i++)
{
...
VectorSubtract (poly_vert, r_origin, verts[i]);
}
Well the reason is that VectorSubtract
is actually a macro expanding to: (thank you Visual Studio)
do {
(verts[i])[0] = (poly_vert)[0] - (r_origin)[0]; (verts[i])[1] = (poly_vert)[1] - (r_origin)[1]; (verts[i])[2] = (poly_vert)[2] - (r_origin)[2];
} while (0)
So the farther access of verts
array is actually (verts[num_elements])[2]
which the compiler happily take and interpret as verts[num_elements + 2]
apparently ????
My C is not good enough to see how this could be legal or not. Does the compiler even issue a warning ?
And this explains those magical +2
in the rest of Sky_ClipPoly
, this is because the usage of Vector-like macros.
The problem appears not in Sky_ProcessPoly
, but Sky_ClipPoly
where it does VectorCopy (vecs, (vecs+(i*3)) );
which results in a buffer overrun: I noticed that by running qs under valgrind.
I just applied the gl_sky3.patch from above to qs as e35ecad1847de7b2d3382ce180c145015cc1e379
@vsonnier: Waiting for your further comments / replies to apply the vkquake patch and close this.
Damn maybe I'm dense, but how that would even work ?
Sky_ClipPoly (int nump, vec3_t vecs, int stage) {
//so `vecs` is just a parameter, passed-by-copy...
//...
for (i = 0, v = vecs; i < nump; i++, v += 3)
{
...
}
// at that point, i = nump ?
// and so then later so we finally have :
//...
VectorCopy (vecs, (vecs + (nump * 3))); // how (vecs + (nump * 3))[n] is even a valid access, for any n ?
The vecs
parameter to Sky_ClipPoly
is vec3_t
and compiler takes that as float *
, that's why it works and that's why access in VectorCopy
call is not (vecs + (nump * 3))[n]
but just (vecs + (nump * 3))
Maybe we should just change Sky_ClipPoly
to take vecs
as float *
instead of vec3_t
to avoid confusions?
Not fully C-aware humans like me (I'm an Ada dev, imagine my trauma :)) would love this, now an appropriate coment in the code like you just explained to me is probably enough. And the QS code will stay close of the other ports that way. Your choice.
OK, applied the +2
change to vkquake. As for any documentation / other changes or improvements: leaving to you -- I will probably port back to qs.
Closing this as fixed now.
This is great, thank you for your help and patience !
This is great, thank you for your help and patience !
Well, my thanks to you: I based the qs patch on tour work
Another limit exceeding issue caused by
lim_daviddg map
from Liminal Spaces Jam. Originally reported here.Load the map, and do
setpos -256 0 0
in console. It trigger the given fatal error inSky_ClipPoly()
.nump
variable is 83 which greater than 62 (MAX_CLIP_VERTS - 2
).Here is bsp file only. It's the same file as in #100.