libsdl-org / SDL

Simple Directmedia Layer
https://libsdl.org
zlib License
9.92k stars 1.84k forks source link

Renderer line drawing ongoing issues #5061

Closed rtrussell closed 2 years ago

rtrussell commented 2 years ago

It appears that when running on Windows with the OpenGL backend (not Direct3D) SDL_RenderDrawLine() is not plotting the final endpoint. So for example:

SDL_RenderDrawLine(renderer, 200, 300, 300, 300);

is not plotting the pixel at 300,300 at all. This worked correctly in SDL 2.0.16.

The program below will demonstrate this; it should draw a green square without any black line, but with SDL 2.0.18 and Windows 10 it plots this: SDL_square Commenting out the OpenGL hint results in correct behavior.

/**************************************************************\
* Testcase to demonstrate a possible regression in SDL 2.0.18  *
* when running with the OpenGL rendering backend on Windows 10 *
\**************************************************************/

#include <stdlib.h>
#include <stdio.h>
#include <unistd.h>
#include "SDL2/SDL.h"

#define SCREEN_WIDTH  640
#define SCREEN_HEIGHT 500

int main(int argc, char* argv[])
{
int y, running = 1;
SDL_Event ev;
SDL_Window *window;
SDL_Renderer *renderer;

if (SDL_Init(SDL_INIT_VIDEO | SDL_INIT_TIMER | SDL_INIT_EVENTS) != 0)
{
    SDL_ShowSimpleMessageBox(SDL_MESSAGEBOX_ERROR,
        "Testcase", SDL_GetError(), NULL);
    return 1;
}

    SDL_SetHint(SDL_HINT_RENDER_DRIVER, "opengl");

window = SDL_CreateWindow("Testcase",  SDL_WINDOWPOS_CENTERED,  SDL_WINDOWPOS_CENTERED, 
        SCREEN_WIDTH, SCREEN_HEIGHT, SDL_WINDOW_OPENGL | SDL_WINDOW_RESIZABLE);
if (window == NULL)
{
    SDL_ShowSimpleMessageBox(SDL_MESSAGEBOX_ERROR,
        "Testcase", SDL_GetError(), NULL);
    SDL_Quit();
    return 5;
}

renderer = SDL_CreateRenderer(window, -1, SDL_RENDERER_ACCELERATED |SDL_RENDERER_PRESENTVSYNC);
if (renderer == NULL)
{
    SDL_DestroyWindow(window);
    SDL_ShowSimpleMessageBox(SDL_MESSAGEBOX_ERROR,
        "Testcase", SDL_GetError(), NULL);
    SDL_Quit();
    return 6;
}

while (running)
{
    while (SDL_PollEvent(&ev))
        switch (ev.type)
        {
            case SDL_QUIT:
                running = 0;
        }

// Clear the window to white:
    SDL_SetRenderDrawColor(renderer, 255, 255, 255, 255);
    SDL_RenderClear(renderer);

// Draw a black vertical line from 300,200 to 300,300:

    SDL_SetRenderDrawColor(renderer, 0, 0, 0, 255);
    SDL_RenderDrawLine(renderer, 300, 200, 300, 300);

// Draw a series of green horizontal lines from x = 200 to x = 300,
// these should overwrite the previously-drawn black line:

    SDL_SetRenderDrawColor(renderer, 0, 255, 0, 255);
    for (y = 200; y <= 300 ; y++)
        SDL_RenderDrawLine(renderer, 200, y, 300, y);

    SDL_RenderPresent(renderer);
}

SDL_DestroyRenderer(renderer);
SDL_DestroyWindow(window);
SDL_Quit();

exit(0);
}
slime73 commented 2 years ago

I wonder if the best play is to change line rendering to use primitives (a quad per line via 4 vertices plus an index buffer to turn them into two triangles). It should avoid subtle line rendering differences across APIs/drivers and would give SDL's code more control - it's what LÖVE does and it seems to have worked decently there, at least.

rtrussell commented 2 years ago

I wonder if the best play is to change line rendering to use primitives

I believe Ryan has changed the line rendering in SDL 2.0.18 in an attempt to make it more consistent across backends, but unfortunately it seems to have had the opposite effect with OpenGL on Windows.

slime73 commented 2 years ago

Yeah. The current implementation is still using the graphics API's lines (which are themselves not too consistent) so reimplementing lines using primitives / geometry would avoid that inconsistency at least.

rtrussell commented 2 years ago

Polylines always seem to draw correctly (probably because they are commonly used to draw outline polygons) so I suggested to Ryan some time ago that a reliable way to draw a line inclusive of both endpoints ought to be as a polyline with two segments (one being just a single pixel). I don't know if he ever tried that.

icculus commented 2 years ago

Honestly, I'm willing to try just about anything that will result in me not being assigned this bug again after every new release. :/

1bsyl commented 2 years ago

Here's a version to draw lines based on RenderGeometry as suggested @slime73 similar to MiterJoinPolyline, with no SMOOTHING, and line with thickness 1.0f ... https://github.com/love2d/love/blob/main/src/modules/graphics/Polyline.cpp#L173

It adds other differences but it seems to solve some end points (tested software vs OpenGl on linux )...

int
SDL_RenderDrawLinesF(SDL_Renderer * renderer,
                     const SDL_FPoint * points, int count)
{
    int retval;

    CHECK_RENDERER_MAGIC(renderer, -1);

    if (!points) {
        return SDL_SetError("SDL_RenderDrawLinesF(): Passed NULL points");
    }
    if (count < 2) {
        return 0;
    }

#if DONT_DRAW_WHILE_HIDDEN
    /* Don't draw while we're hidden */
    if (renderer->hidden) {
        return 0;
    }
#endif

    if (renderer->scale.x != 1.0f || renderer->scale.y != 1.0f) {
        return RenderDrawLinesWithRectsF(renderer, points, count);
    }

    if (!(renderer->info.flags & SDL_RENDERER_SOFTWARE) && renderer->QueueGeometry) {
        SDL_bool isstack1;
        SDL_bool isstack2;
        float *xy = SDL_small_alloc(float, 2 * 2 * count, &isstack1);
        int *indices = SDL_small_alloc(int, 6 * (count - 1), &isstack2);

        if (xy && indices) {
            int i;
            float *ptr_xy = xy;
            int *ptr_indices = indices;
            const int xy_stride = 2 * sizeof (float);
            const int num_vertices = 2 * count;
            const int num_indices = 6 * (count - 1);
            const int size_indices = 4;
            int cur_indice = 0;

            const int is_looping = (points[0].x == points[count - 1].x && points[0].y == points[count - 1].y);
            SDL_FPoint before, after, p;
            if (is_looping) {
                before = points[count - 2];
                after = points[1];
            } else {
                /* mirrored of point 1, by point 0 */
                before.x = 2.0f * points[0].x - points[1].x;
                before.y = 2.0f * points[0].y - points[1].y;
                /* mirrored of before last point 1, by last pont */
                after.x = 2.0f * points[count - 1].x - points[count - 2].x;
                after.y = 2.0f * points[count - 1].y - points[count - 2].y;
            }

            p = before;

            for (i = 0; i < count; ++i) {
                SDL_FPoint q, r;
                float norm_qp;
                float norm_qr;
                float dot, coef;
                float offset;
                const float half_width = 0.5f;
                SDL_FPoint u_qp, u_qr;
                SDL_FPoint n_qp; /* normal */

                q = points[i];
                if (i == count - 1) {
                    r = after; 
                } else {
                    r = points[i + 1];
                }

                norm_qp = SDL_sqrtf((p.x - q.x) * (p.x - q.x) + (p.y - q.y) * (p.y - q.y));
                norm_qr = SDL_sqrtf((p.x - r.x) * (p.x - r.x) + (p.y - r.y) * (p.y - r.y));

                u_qp.x = (p.x - q.x) / norm_qp;
                u_qp.y = (p.y - q.y) / norm_qp;
                u_qr.x = (r.x - q.x) / norm_qr;
                u_qr.y = (r.y - q.y) / norm_qr;

                n_qp.x = - u_qp.y;
                n_qp.y = u_qp.x;

                dot = u_qr.x * n_qp.x + u_qr.y * n_qp.y;

                /* translate by q + (half_width, half_width) */
                offset = half_width;
                if (SDL_fabs(dot) < 0.05f) { /* parallel lines */
                    coef = half_width;
                    *ptr_xy++ = q.x + offset - coef * n_qp.x;
                    *ptr_xy++ = q.y + offset - coef * n_qp.y;
                    *ptr_xy++ = q.x + offset + coef * n_qp.x;
                    *ptr_xy++ = q.y + offset + coef * n_qp.y;
                } else {
                    coef = half_width / dot;
                    *ptr_xy++ = q.x + offset - coef * (u_qr.x + u_qp.x);
                    *ptr_xy++ = q.y + offset - coef * (u_qr.y + u_qp.y);
                    *ptr_xy++ = q.x + offset + coef * (u_qr.x + u_qp.x);
                    *ptr_xy++ = q.y + offset + coef * (u_qr.y + u_qp.y);
                }

                p = q;
            }
            for (i = 0; i < count - 1; ++i) {
                *ptr_indices++ = cur_indice + 0;
                *ptr_indices++ = cur_indice + 1;
                *ptr_indices++ = cur_indice + 2;
                *ptr_indices++ = cur_indice + 3;
                *ptr_indices++ = cur_indice + 1;
                *ptr_indices++ = cur_indice + 2;
                cur_indice += 2;
            }

            if (0)
            SDL_RenderDrawPointsF(renderer, xy, num_vertices);

            retval = QueueCmdGeometry(renderer, NULL,
                    xy, xy_stride, (int *) &renderer->color, 0 /* color_stride */, NULL, 0,
                    num_vertices, indices, num_indices, size_indices,
                    1.0f, 1.0f);

            SDL_small_free(xy, isstack1);
            SDL_small_free(indices, isstack2);
        }

    } else {
        retval = QueueCmdDrawLines(renderer, points, count);
    }

    return retval < 0 ? retval : FlushRenderCommandsIfNotBatching(renderer);
}
1bsyl commented 2 years ago

here's a testcase main_diff_drawline.c.txt

open 3 windows. 1/ render a scene with lines, with Software renderer 2/ render a scene with lines, with OpenGL renderer (but you can change the code to use another one) 3/ show the diff 1 vs 2, with colors

1bsyl commented 2 years ago

https://github.com/love2d/love/blob/main/src/modules/graphics/Polyline.cpp#L141 not much comment in code yet, but like the love polyline + drawing.

There are three points: p q r. q is the current line point. p is the previous point, and r is the point right after (with some virtual point at start/end, if the polyline is closed or not, eg is_looping). We replace the polyline with a triangle strip, where q is replaced by two points that are 'halfwidth' far from q, on (p,q,r) bissector. And we draw triangles based on those points. ( if p,q,r are aligned, this is a special/corner case, but simpler for the computation)

TODO: we're not handling two successive points that would be duplicates ..

rtrussell commented 2 years ago

Here's a version to draw lines based on RenderGeometry

I wonder if it would benefit from special-casing horizontal and vertical lines, which are likely to be used as building blocks for more complex shapes (filled polygons, circles, ellipses etc. as in SDL2_gfx).

1bsyl commented 2 years ago

@rtrussell I don't know if we need we special-casing horizontal and vertical lines ... But they can be in polyline, or at end or stop, so this ... and it is already covered. Maybe you can try to recompile SDL with this function to see how it goes for you !

rtrussell commented 2 years ago

Maybe you can try to recompile SDL

I'm not set up to compile SDL from source on Windows, I don't have Visual Studio or anything similar installed. If you can make me a DLL I can try it.

1bsyl commented 2 years ago

@rtrussell I cannot build a DLL neither. but here's another option:

Use this function for testing that can be compiled done inside your code. it's similar to SDL_RenderDrawLinesF ( you just need to use the default scale 1.0 1.0 and not the software renderer)

int
SDL_RenderDrawLinesF__test(SDL_Renderer * renderer,
                     const SDL_FPoint * points, int count)
{
    int retval;
    float scalex, scaley;

    if (!points) {
        return SDL_SetError("SDL_RenderDrawLinesF(): Passed NULL points");
    }
    if (count < 2) {
        return 0;
    }

    SDL_RenderGetScale(renderer, &scalex, &scaley);

    if (scalex != 1.0f || scaley != 1.0f) {
        // TODO
        return -1;
    }

    if (1) {
        float *xy = SDL_malloc(sizeof(float) * 2 * 2 * count);
        int *indices = SDL_malloc(sizeof(int) * 6 * (count - 1));

        if (xy && indices) {
            int i;
            float *ptr_xy = xy;
            int *ptr_indices = indices;
            const int xy_stride = 2 * sizeof (float);
            const int num_vertices = 2 * count;
            const int num_indices = 6 * (count - 1);
            const int size_indices = 4;
            int cur_indice = 0;

            const int is_looping = (points[0].x == points[count - 1].x && points[0].y == points[count - 1].y);
            SDL_FPoint before, after, p;
            if (is_looping) {
                before = points[count - 2];
                after = points[1];
            } else {
                /* mirrored of point 1, by point 0 */
                before.x = 2.0f * points[0].x - points[1].x;
                before.y = 2.0f * points[0].y - points[1].y;
                /* mirrored of before last point 1, by last pont */
                after.x = 2.0f * points[count - 1].x - points[count - 2].x;
                after.y = 2.0f * points[count - 1].y - points[count - 2].y;
            }

            p = before;

            for (i = 0; i < count; ++i) {
                SDL_FPoint q, r;
                float norm_qp;
                float norm_qr;
                float dot, coef;
                float offset;
                const float half_width = 0.5f;
                SDL_FPoint u_qp, u_qr;
                SDL_FPoint n_qp; /* normal */

                q = points[i];
                if (i == count - 1) {
                    r = after; 
                } else {
                    r = points[i + 1];
                }

                norm_qp = SDL_sqrtf((p.x - q.x) * (p.x - q.x) + (p.y - q.y) * (p.y - q.y));
                norm_qr = SDL_sqrtf((p.x - r.x) * (p.x - r.x) + (p.y - r.y) * (p.y - r.y));

                u_qp.x = (p.x - q.x) / norm_qp;
                u_qp.y = (p.y - q.y) / norm_qp;
                u_qr.x = (r.x - q.x) / norm_qr;
                u_qr.y = (r.y - q.y) / norm_qr;

                n_qp.x = - u_qp.y;
                n_qp.y = u_qp.x;

                dot = u_qr.x * n_qp.x + u_qr.y * n_qp.y;

                /* translate by q + (half_width, half_width) */
                offset = half_width;
                if (SDL_fabs(dot) < 0.05f) { /* parallel lines */
                    coef = half_width;
                    *ptr_xy++ = q.x + offset - coef * n_qp.x;
                    *ptr_xy++ = q.y + offset - coef * n_qp.y;
                    *ptr_xy++ = q.x + offset + coef * n_qp.x;
                    *ptr_xy++ = q.y + offset + coef * n_qp.y;
                } else {
                    coef = half_width / dot;
                    *ptr_xy++ = q.x + offset - coef * (u_qr.x + u_qp.x);
                    *ptr_xy++ = q.y + offset - coef * (u_qr.y + u_qp.y);
                    *ptr_xy++ = q.x + offset + coef * (u_qr.x + u_qp.x);
                    *ptr_xy++ = q.y + offset + coef * (u_qr.y + u_qp.y);
                }

                p = q;
            }
            for (i = 0; i < count - 1; ++i) {
                *ptr_indices++ = cur_indice + 0;
                *ptr_indices++ = cur_indice + 1;
                *ptr_indices++ = cur_indice + 2;
                *ptr_indices++ = cur_indice + 3;
                *ptr_indices++ = cur_indice + 1;
                *ptr_indices++ = cur_indice + 2;
                cur_indice += 2;
            }

            if (0)
            SDL_RenderDrawPointsF(renderer, xy, num_vertices);

            SDL_RenderSetScale(renderer, 1.0, 1.0);

            SDL_Color color;
            SDL_GetRenderDrawColor(renderer, &color.r, &color.g, &color.b, &color.a);
            retval = SDL_RenderGeometryRaw(renderer, NULL,
                    xy, xy_stride, (int *) &color, 0 /* color_stride */, NULL, 0,
                    num_vertices, indices, num_indices, size_indices);

            SDL_free(xy);
            SDL_free(indices);
        }

    } else {
//        retval = QueueCmdDrawLines(renderer, points, count);
    }

    return retval;
}
slime73 commented 2 years ago

Here's a version to draw lines based on RenderGeometry as suggested @slime73 similar to MiterJoinPolyline, with no SMOOTHING, and line with thickness 1.0f ... https://github.com/love2d/love/blob/main/src/modules/graphics/Polyline.cpp#L173

For what it's worth LÖVE's miter join lines have a couple known issues (for example there's no fallback to bevel or other join when the angle is really small, which makes the line geometry extend way past where you'd expect in those situations, and lines with duplicate points sometimes don't get rendered), so copying that implementation wholesale might add issues you don't want.

rtrussell commented 2 years ago

Use this function for testing that can be compiled done inside your code.

It's principally SDL2_gfx, not my code, that would need to be altered to carry out the test. I suppose I could edit SDL2_gfxPrimitives.c to call your routine, but it's not a trivial exercise.

1bsyl commented 2 years ago

@rtrussell maybe you do a search and replace, or if you're lucky a preprocessor define

Kartoffelbauer commented 2 years ago

Same problem on Raspberry Pi with kmsdrm backend and OpenGL ES: Lines are always one pixel too short.

rtrussell commented 2 years ago

@rtrussell maybe you do a search and replace

It's not that easy because SDL2_gfx needs SDL_RenderDrawLine() and SDL_RenderDrawLines() (not the float versions). Also, I get "_undefined reference to 'SDLRenderGeometryRaw'" because my header and library files are not up to date (deliberately, because I'm building for multiple platforms, some of which don't have anything later than SDL 2.0.10).

To make it easy to test your code I would need somebody to build me a DLL.

1bsyl commented 2 years ago

@slime73 I didn't copy paste LOVE code, just use the same method. it has very likely the corner with duplicate points. indeed, maybe a fallback to bevel for small angle seems a good idea !

@Kartoffelbauer maybe you can recompile SDL2 from source and use the previous function for testing

@rtrussell let's create an issue so that someone document the DLL creating on window !

rtrussell commented 2 years ago

Here's a version to draw lines based on RenderGeometry as suggested @slime73

I have a major worry that your approach based on RenderGeometry will completely break my app, because it is fundamentally reliant on the line being drawn responding to glLogicOp(). I'm pretty sure that glLogicOp() works only with drawing primitives, not triangles. If that is the case, switching to your approach would be a disaster for me.

slime73 commented 2 years ago

If you're reliant on a library's internal implementation-specific state for your own code, maybe it'll be better in the long run for you to move to something you have more control over? Such as direct OpenGL without SDL_Render, or an OpenGL wrapper library that you can build and modify in your own source code.

In the case of glLogicOp it's been outdated for so long (almost two decades) that I don't know how it works, but triangles are primitives as far as OpenGL is concerned so you probably don't need to worry about that specific function.

rtrussell commented 2 years ago

If you're reliant on a library's internal implementation-specific state for your own code, maybe it'll be better in the long run for you to move to something you have more control over?

Mixing SDL2 functions and direct calls to OpenGL is supported - it's the sole reason for the SDL_RenderFlush() function existing - so although there is no explicit list of which OpenGL functions may be safely called and which may not, as far as I am concerned it is entirely unreasonable to change SDL2 in such a way that what worked before no longer works.

In any case there is no acceptable workaround. Support for 'logical' plotting (AND, OR, XOR) is essential for my app because it's a programming language (which has been around for more than 40 years) which provides that functionality to the programmer.

In the case of glLogicOp it's been outdated for so long

I'm not sure what you mean by "outdated". It's rarely used in modern applications, admittedly, but It's documented and fully supported in OpenGL and in OpenGLES v1.

Ultimately the decision is down to Ryan and Sam, but changing SDL_RenderDrawLine() so fundamentally that glLogicOp() no longer affects it, when it has done from SDL 2.0.0 to 2.0.18, would break my app in a catastrophic way and I might well have no alternative but to abandon it.

slouken commented 2 years ago

We need to fix line drawing as it currently seems impossible to get it pixel perfect on every graphics implementation. However we should leave the original code in there enabled via a hint, so projects like @rtrussell's will continue working.

slime73 commented 2 years ago

Mixing SDL2 functions and direct calls to OpenGL is supported - it's the sole reason for the SDL_RenderFlush() function existing

I'm trying to help you by suggesting long term solutions that avoid needing to worry about any SDL change. SDL's render internals have changed significantly over the years, will change in the future, and different platforms exist and will grow in popularity and use which don't use a specific backend you're using.

In any case there is no acceptable workaround.

but changing SDL_RenderDrawLine() so fundamentally that glLogicOp() no longer affects it

There is nothing I'm aware of to suggest that glLogicOp does not work with triangle primitives. So far as I can tell, triangle-based lines will continue to work with glLogicOp (when legacy OpenGL is used on a platform). Does the GL spec language say otherwise? I'd be very surprised. Logic operations happen after primitives are rasterized so they shouldn't care about what the primitives are.

we should leave the original code in there enabled via a hint,

I don't believe this is needed for this specific case. I'd definitely test any new implementation in a variety of situations before declaring it ready to replace the old stuff, though.

I'm not sure what you mean by "outdated". It's rarely used in modern applications, admittedly, but It's documented and fully supported in OpenGL and in OpenGLES v1.

It doesn't really exist on many modern GPUs as far as I know, and therefore doesn't exist in some newer graphics APIs. Including OpenGL ES post-1.0. It's rarely used these days because shaders have supplanted most use cases, I think.

slouken commented 2 years ago

Here are precompiled DLLs with this change for testing: http://www.libsdl.org/tmp/x86/SDL2.dll http://www.libsdl.org/tmp/x64/SDL2.dll

rtrussell commented 2 years ago

Here are precompiled DLLs with this change for testing:

As far as I can see it's made it worse. I've confirmed that the DLL is reporting 2.0.19, but if I run the testcase that I listed at the start of the thread I'm still seeing the black line, and I'm also seeing it with the OpenGL hint commented out, which I wasn't before. If I've not done something silly, it appears to have made it consistent, but wrong!

rtrussell commented 2 years ago

It doesn't really exist on many modern GPUs as far as I know

I don't know how it's working under the hood, but since it's still documented in the OpenGL 4 docs, and there's no mention of it being deprecated, I can't see that there is a problem with it being used.

I do note that it says "specify a logical pixel operation for rendering" (not just drawing) so it may be that my concern about it not affecting triangles was misplaced.

slouken commented 2 years ago

Here are precompiled DLLs with this change for testing:

As far as I can see it's made it worse. I've confirmed that the DLL is reporting 2.0.19, but if I run the testcase that I listed at the start of the thread I'm still seeing the black line, and I'm also seeing it with the OpenGL hint commented out, which I wasn't before. If I've not done something silly, it appears to have made it consistent, but wrong!

I can reproduce this as well.

1bsyl commented 2 years ago

With the LOVE method I get lots of artifacts. I believe it's ok with thick line, but with a 1 pixel width, it's lot of diff. So I try another thing:

It was quite fast to implement and it gives much better result for me

It would be nice to test:

int
SDL_RenderDrawLinesF(SDL_Renderer * renderer,
                     const SDL_FPoint * points, int count)
{
    int retval;

    CHECK_RENDERER_MAGIC(renderer, -1);

    if (!points) {
        return SDL_SetError("SDL_RenderDrawLinesF(): Passed NULL points");
    }
    if (count < 2) {
        return 0;
    }

#if DONT_DRAW_WHILE_HIDDEN
    /* Don't draw while we're hidden */
    if (renderer->hidden) {
        return 0;
    }
#endif

    if (renderer->scale.x != 1.0f || renderer->scale.y != 1.0f) {
        return RenderDrawLinesWithRectsF(renderer, points, count);
    }

    if (!(renderer->info.flags & SDL_RENDERER_SOFTWARE) && renderer->QueueGeometry) {
        SDL_bool isstack1;
        SDL_bool isstack2;
        /* +1 worst case for bevel */
        float *xy = SDL_small_alloc(float, 4 * 2 * count, &isstack1);
        int *indices = SDL_small_alloc(int, 
                  (4) * 3 * (count - 1)
                + (2) * 3 * (count)
                , &isstack2);

        if (xy && indices) {
            int i;
            float *ptr_xy = xy;
            int *ptr_indices = indices;
            const int xy_stride = 2 * sizeof (float);
            int num_vertices = 0;
            int num_indices = 0;
            const int size_indices = 4;
            int cur_indice = -4;

            const int is_looping = (points[0].x == points[count - 1].x && points[0].y == points[count - 1].y);
            SDL_FPoint p;

            for (i = 0; i < count; ++i) {
                SDL_FPoint q;

                q = points[i];
                q.x += 0.5f;
                q.y += 0.5f;

                *ptr_xy++ = SDL_floorf(q.x);
                *ptr_xy++ = SDL_floorf(q.y);
                *ptr_xy++ = SDL_floorf(q.x) + 1.0f;
                *ptr_xy++ = SDL_floorf(q.y);
                *ptr_xy++ = SDL_floorf(q.x) + 1.0f;
                *ptr_xy++ = SDL_floorf(q.y) + 1.0f;
                *ptr_xy++ = SDL_floorf(q.x);
                *ptr_xy++ = SDL_floorf(q.y) + 1.0f;

                num_vertices += 4;

#define ADD_TRIANGLE(i1, i2, i3)                    \
                *ptr_indices++ = cur_indice + i1;   \
                *ptr_indices++ = cur_indice + i2;   \
                *ptr_indices++ = cur_indice + i3;   \
                num_indices += 3;                   \

                /* Draw the first, only if not a closed polyline */
                if (i || is_looping == 0) {
                    ADD_TRIANGLE(4, 5, 6)
                    ADD_TRIANGLE(4, 6, 7)
                }

                /* first point */
                if (i == 0) {
                    p = q;
                    cur_indice += 4;
                    continue;
                }

                /* draw segments */
                if (p.y == q.y) {
                    if (p.x < q.x) {
                        ADD_TRIANGLE(1, 4, 7)
                        ADD_TRIANGLE(1, 7, 2)
                    } else {
                        ADD_TRIANGLE(5, 0, 3)
                        ADD_TRIANGLE(5, 3, 6)
                    }
                } else if (p.x == q.x) {
                    if (p.y < q.y) {
                        ADD_TRIANGLE(2, 5, 4)
                        ADD_TRIANGLE(2, 4, 3)
                    } else {
                        ADD_TRIANGLE(6, 1, 0)
                        ADD_TRIANGLE(6, 0, 7)
                    }
                } else {
                    if (p.y < q.y) {
                        if (p.x < q.x) {
                            ADD_TRIANGLE(1, 5, 4)
                            ADD_TRIANGLE(1, 4, 2)
                            ADD_TRIANGLE(2, 4, 7)
                            ADD_TRIANGLE(2, 7, 3)
                        } else {
                            ADD_TRIANGLE(4, 0, 5)
                            ADD_TRIANGLE(5, 0, 3)
                            ADD_TRIANGLE(5, 3, 6)
                            ADD_TRIANGLE(6, 3, 2)
                        }
                    } else {
                        if (p.x < q.x) {
                            ADD_TRIANGLE(0, 4, 7)
                            ADD_TRIANGLE(0, 7, 1)
                            ADD_TRIANGLE(1, 7, 6)
                            ADD_TRIANGLE(1, 6, 2)
                        } else {
                            ADD_TRIANGLE(6, 5, 1)
                            ADD_TRIANGLE(6, 1, 0)
                            ADD_TRIANGLE(7, 6, 0)
                            ADD_TRIANGLE(7, 0, 3)
                        }
                    }
                }

                p = q;
                cur_indice += 4;
            }

            retval = QueueCmdGeometry(renderer, NULL,
                    xy, xy_stride, (int *) &renderer->color, 0 /* color_stride */, NULL, 0,
                    num_vertices, indices, num_indices, size_indices,
                    1.0f, 1.0f);

            SDL_small_free(xy, isstack1);
            SDL_small_free(indices, isstack2);
        }

    } else {
        retval = QueueCmdDrawLines(renderer, points, count);
    }

    return retval < 0 ? retval : FlushRenderCommandsIfNotBatching(renderer);
}
slouken commented 2 years ago

It looks like lots of overdraw there. Does it work correctly with blend modes?

1bsyl commented 2 years ago

@slouken no! there are no overdraws, only non intersecting triangles, (unless the polyline cross itself). the only potential overdraws was the first point when the polyline is a loop, which I tested and fixed

1bsyl commented 2 years ago

actually, it the line does a very small angle this can count as an overdraw indeed? , not sure is this is correct or not

slouken commented 2 years ago

@rtrussell, I updated the pre-built DLLs, can you see if this works for your project?

slouken commented 2 years ago

actually, it the line does a very small angle this can count as an overdraw indeed? , not sure is this is correct or not

It's easier to test with SDL_BLENDMODE_ADD and look at the output pixels in an image editing program.

rtrussell commented 2 years ago

@rtrussell, I updated the pre-built DLLs, can you see if this works for your project?

On a quick check it's looking good. The testcase passes, and it's correctly responding to glLogicOp(). I'll run it through some more tests and report back.

rtrussell commented 2 years ago

Everything I've thrown at it has worked. I've not noticed any overdraws, even when using exclusive-or plotting which is a tough test. Impressive.

rtrussell commented 2 years ago

The bad news is that there seems to be a massive performance hit. Using SDL_RenderDrawLines() to draw a 500-pixel-diameter circle, consisting of 360 segments, seems to be taking something like 20 times longer with SDL 2.0.19 than 2.0.16.

If it can't be made faster, that suggests to me that the old method should continue to be used on platforms which have always worked reliably using it (e.g. Emscripten).

1bsyl commented 2 years ago

testdraw2, with manually edition, to only call DrawLine() : testdraw2 1000. before: 1000 fps, now: 700 fps testdraw2 10000. before: 356 fps, now: 74 fps testdraw2 30000. before: 140 fps, now: 25 fps not sure if this is CPU or GPU time though ...

Maybe I'll give a try later to draw less triangles and to rely on the top-left rule

rtrussell commented 2 years ago

I don't know whether I've chosen a particularly unfortunate benchmark, but running this program here (Windows 10, OpenGL backend) gives an approximately 20:1 slowdown in SDL 2.0.19 compared with SDL 2.0.16:

#include <stdlib.h>
#include <stdio.h>
#include <unistd.h>
#include <math.h>
#include "SDL2/SDL.h"

#define SCREEN_WIDTH  640
#define SCREEN_HEIGHT 512

int main(int argc, char* argv[])
{
int i, running = 1;
int start, finish;
SDL_Event ev;
SDL_Window *window;
SDL_Renderer *renderer;
SDL_Point pt[361];

if (SDL_Init(SDL_INIT_VIDEO | SDL_INIT_TIMER | SDL_INIT_EVENTS) != 0)
{
    SDL_ShowSimpleMessageBox(SDL_MESSAGEBOX_ERROR,
        "Testcase", SDL_GetError(), NULL);
    return 1;
}

    SDL_SetHint(SDL_HINT_RENDER_DRIVER, "opengl");

window = SDL_CreateWindow("Testcase",  SDL_WINDOWPOS_CENTERED,  SDL_WINDOWPOS_CENTERED, 
        SCREEN_WIDTH, SCREEN_HEIGHT, SDL_WINDOW_OPENGL | SDL_WINDOW_RESIZABLE);
if (window == NULL)
{
    SDL_ShowSimpleMessageBox(SDL_MESSAGEBOX_ERROR,
        "Testcase", SDL_GetError(), NULL);
    SDL_Quit();
    return 5;
}

renderer = SDL_CreateRenderer(window, -1, SDL_RENDERER_ACCELERATED |SDL_RENDERER_PRESENTVSYNC);
if (renderer == NULL)
{
    SDL_DestroyWindow(window);
    SDL_ShowSimpleMessageBox(SDL_MESSAGEBOX_ERROR,
        "Testcase", SDL_GetError(), NULL);
    SDL_Quit();
    return 6;
}

for (i = 0; i <= 361; i++)
    {
    pt[i].x = 320 + 250.0 * cos(0.0174532 * i);
    pt[i].y = 256 + 259.0 * sin(0.0174532 * i);
    }

while (running)
{
    while (SDL_PollEvent(&ev))
        switch (ev.type)
        {
            case SDL_QUIT:
                running = 0;
        }

// Clear the window to white:
    SDL_SetRenderDrawColor(renderer, 255, 255, 255, 255);
    SDL_RenderClear(renderer);

// Draw 1000 green circles, diameter 500 pixels:
    SDL_SetRenderDrawColor(renderer, 0, 255, 0, 255);
    start = SDL_GetTicks();
    for (i = 0; i < 1000; i++)
        SDL_RenderDrawLines(renderer, pt, 361);
    finish = SDL_GetTicks();
    SDL_Log("Elapsed time %i ms\n", finish-start);

    SDL_RenderPresent(renderer);
}

SDL_DestroyRenderer(renderer);
SDL_DestroyWindow(window);
SDL_Quit();

exit(0);
}
slime73 commented 2 years ago

How does it perform if an explicit renderer backend isn't chosen? I think automatic batching is turned off by default when the SDL_HINT_RENDER_DRIVER hint is set, right?

(I'm not suggesting a 20x slowdown in this situation isn't a problem, I'm just wondering if the performance difference changes depending on batching settings. It might also be interesting to test different backends in general. I kind of suspect the old OpenGL lines code was hitting a driver path that was effectively automatically batched inside the driver code on certain drivers, which might not happen when an array of triangles is rendered.)

rtrussell commented 2 years ago

I think automatic batching is turned off by default when the SDL_HINT_RENDER_DRIVER hint is set, right?

Correct. If I add:

SDL_SetHint("SDL_RENDER_BATCHING", "1");

and a SDL_RenderFlush() before reading the finish time it improves performance in both cases, but the ratio is still about 20:1. This is a far greater difference than I would expect; can somebody else check in case my measurements are way off for some reason?

Edit: And if I remove the hints altogether (so the Direct3D backend is used) the ratio is similar.

1bsyl commented 2 years ago

Indeed, I also get 10ms before, and 200 ms after, with OpenGL if I switch to gles2, it's 70 ms

1bsyl commented 2 years ago

If I comment out what is in SDL_RENDERCMD_GEOMETRY for openGL: it is still 90ms (time in SDL_render.c, in QueueGeometry + set some state) for gles2: it is still 60s (time in SDL_render.c, in QueueGeometry + set some state)

so there are some time preparing the triangles data, which is bigger than just sending the points to the back-end. then in opengl, it requires many call:

          {       
1290                     size_t j;
1291                     float currentColor[4];
1292                     data->glGetFloatv(GL_CURRENT_COLOR, currentColor);
1293                     data->glBegin(GL_TRIANGLES);
1294                     for (j = 0; j < count; ++j)
1295                     {
1296                         const GLfloat x = *(verts++);
1297                         const GLfloat y = *(verts++);
1298                     
1299                         const GLfloat r = *(verts++);
1300                         const GLfloat g = *(verts++);
1301                         const GLfloat b = *(verts++);
1302                         const GLfloat a = *(verts++);
1303    
1304                         data->glColor4f(r, g, b, a);
1305                 
1306                         if (texture) {
1307                             GLfloat u = *(verts++);
1308                             GLfloat v = *(verts++);
1309                             data->glTexCoord2f(u,v);
1310                         }
1311                         data->glVertex2f(x, y);
1312                     }
1313                     data->glEnd();
1314                     data->glColor4f(currentColor[0], currentColor[1], currentColor[2], currentColor[3]);
1315                 }
1bsyl commented 2 years ago

compared to opengl drawline:

1261                 data->glBegin(GL_LINE_STRIP);
1262                 for (i = 0; i < count; ++i, verts += 2) {
1263                     data->glVertex2f(verts[0], verts[1]);
1264                 }
1265                 data->glEnd();
slime73 commented 2 years ago

Oh I (incorrectly) assumed the GL triangle rendering code was using glDrawArrays. I wonder if that would help at all. And maybe finding a way to remove the need for that glGetFloatv but I suspect that's not part of the bottleneck.

If you have access to a CPU profiler of any kind they can be really helpful for figuring out the best places to change stuff, too.

1bsyl commented 2 years ago

Yes gles2 is indeed using glDrawArrays, not opengl, maybe that would improve performance

1bsyl commented 2 years ago

I give a try and did a lot of copy paste from the GLES2 backend to add the VBO to opengl (see #5089) So now the lines with triangles draws at 75 ms (but the old GL_LINE version draws slower now also because we throw more code inside ...)

1bsyl commented 2 years ago

some double-check:
"but the old GL_LINE version draws slower now also because we throw more code inside ..."

Actually It became slower in 2.0.18 (eg 35 ms), because we call SDL_cosf / sinf / atanf, per point in GL_QueueDrawLines to fix this issue. So now, we are more: 35 ms with draw Line with GL_LINES, and 75 ms (with the VBO PR #5083), if we use GL_TRIANGLES

slime73 commented 2 years ago

I think OpenGL drivers tend to perform better with client-side vertex arrays (glVertexAttribPointer(..., directpointer)/glColorPointer/etc plus glDrawArrays/glDrawElements, with no vertex buffer object) than with VBOs+glBufferData/glBufferSubData, because the efficient ways to update VBOs are only part of OpenGL 4.4+.

slime73 commented 2 years ago

We'll probably get better RenderGeometry performance overall by passing the indices data through to the GPU via glDrawElements instead of manually expanding the triangle list on the CPU and using glDrawArrays, too, especially if vertex data copies can be eliminated.

0x1F9F1 commented 2 years ago

Something I'm struggling to understand is: What are the actual issues that OpenGL faces when drawing lines, compared to the other renderers?

If I remove the trig stuff added to the OpenGL renderer, it seems like the only issue I face (at least on my regular NVIDIA gpu) is with lines being one pixel too short, which is the same issue the other backends face (but all handle slightly differently for whatever reason): https://github.com/libsdl-org/SDL/blob/b7885abc449250fc733a756239d670394d01fe62/src/render/direct3d/SDL_render_d3d.c#L1215 https://github.com/libsdl-org/SDL/blob/b7885abc449250fc733a756239d670394d01fe62/src/render/direct3d11/SDL_render_d3d11.c#L2133 https://github.com/libsdl-org/SDL/blob/b7885abc449250fc733a756239d670394d01fe62/src/render/metal/SDL_render_metal.m#L1124 https://github.com/libsdl-org/SDL/blob/b7885abc449250fc733a756239d670394d01fe62/src/render/software/SDL_drawline.c#L201 It wouldn't surprise me if the other renderers (PSP, VitaGxm) also have the same behaviour, but just don't correct it.

This behaviour also corresponds to what the OpenGL spec expects, which states: "When pa and pb lie on fragment centers, this characterization of fragments reduces to Bresenham’s algorithm with one modification: lines produced in this description are “half-open,” meaning that the final fragment (corresponding to pb) is not drawn.".

The two main solutions to counteract half-open lines seems to be either to extend the line (Metal), or draw an extra point at the end (D3D, Software). If all renderers follow the same diamond-exit rule behaviour (which I suspect they do, apart from maybe the Software renderer), would it make more sense to just apply the half-pixel offset + line extension before calling QueueDrawLines?

See: https://www.khronos.org/registry/OpenGL/specs/gl/glspec46.core.pdf#subsection.14.5.1 https://microsoft.github.io/DirectX-Specs/d3d/archive/D3D11_3_FunctionalSpec.htm#3.4.3%20Aliased%20Line%20Rasterization%20Rules