adamdruppe / arsd

This is a collection of modules that I've released over the years. Most of them stand alone, or have just one or two dependencies in here, so you don't have to download this whole repo.
http://arsd-official.dpldocs.info/arsd.html
530 stars 127 forks source link

simpledisplay + core.thread.Fiber + text = segfault #280

Closed quickfur closed 3 years ago

quickfur commented 3 years ago

Spent all day today trying to trace a segfault, turns out that simpledisplay for some reason is incompatible with core.thread.Fiber. Here's a reduced failing case:

import core.thread : Fiber;
import arsd.simpledisplay;

void drawChar(ScreenPainter d, dchar ch)
{
    import std.conv : to;

    d.pen = Pen(Color.black);
    d.drawRectangle(Point(400, 300), 15, 15);

    version(noText) {} else
    {
        d.pen = Pen(Color.yellow, 2);
        d.drawText(Point(400, 300), ch.to!string);
    }
}

void main()
{
    auto window = new SimpleWindow(Size(800, 600), "TEST");

    version(useFiber)
    {
        dchar key;
        auto fiber = new Fiber({
            for (;;)
            {
                drawChar(window.draw, key);
                Fiber.yield();
            }
        });
    }

    window.draw.clear();
    window.eventLoop(0,
        delegate(dchar ch) {
            if (ch == 'q')
                window.close();
            else
            {
                version(useFiber)
                {
                    key = ch;
                    fiber.call();
                }
                else
                    drawChar(window.draw, ch);
            }
        },
    );
}

Compile without any -versions to see the expected behaviour.

When compiled with -version=useFiber on Linux, I get a segfault. The segfault is caused by the call to drawText, as proven by the crash going away when compiled with -version=noText. Traced the problem to somewhere in the call to Xutf8TextExtents in NativeScreenPainterImplementation.textWidth.

In my non-reduced code, I also get a segfault somewhere in the font-handling code, but in a different place. No idea why using Fibers causes it to fail; moving the Fiber code outside the Fiber makes the problem go away. Very strange.

This is a major showstopper for me, as I'm trying to integrate simpledisplay with one of my Fiber-heavy projects.

adamdruppe commented 3 years ago

Without seriously looking at it, you are probably just running out of stack space. The default stack size arg in Fiber's constructor is small, like 4 KB. I frequently assume char[4096] buffer = void; is going to work fine. Bit me a few other times with fibers too, it'll run out of space very quickly.

To prove this try passing a larger number to the Fiber constructor. Like 40000 or something. If it works then it prolly proves this. Maybe I can tweak the stack buffer sizes in there if so.

quickfur commented 3 years ago

Whoa you're right!! Increasing the Fiber stack size to 32KB makes this particular problem go away. Probably need a bit more for my real project. :D Thanks a ton for the quick response!

P.S. I tried 16KB but apparently that's not enough. You got some pretty big buffers going! :-P

adamdruppe commented 3 years ago

So looking at the source it doesn't look like it is one of my buffers.... the Windows version has a buffer for utf16 conversion but the Linux version just calls into that Xutf8yadayada function. So it must be something inside that implementation and I don't think I can change that......... (of course I could just not use that function but like that's a lot of extra work)

so yeah i can at least add a note to the docs to warn ppl

quickfur commented 3 years ago

Yeah a note in the docs would be good.

Now I'm running into a different problem, but it's probably my own fault. Was trying to allocate ScreenPainter on the heap and manage it manually, probably a bad idea. :-D

adamdruppe commented 3 years ago

Yeah, the painter is a refcounted thing internally with this(this) and a ~this to actually blit from buffer to screen, so best to just use it like that. I'm not even sure I expose the functions to manage it without invoking that destructor...

quickfur commented 3 years ago

I was trying to be sneaky and doing d = new ScreenPainter; *d = window.draw;, but it probably screws up the dtor in a major way. :-D I'll just have to keep an on-stack instance somewhere up the call stack and use a temporary copy where I can't do that. Probably no biggie.

adamdruppe commented 3 years ago

what are you trying to do anyway? the painter is actually pretty small memory wise...

quickfur commented 3 years ago

I'm probably going about it completely backwards, but basically, I have a bunch of code running in a Fiber that needs to output stuff to the screen. It was written in a way that sends a string to the backend code piecemeal (usually single characters, but it's on my TODO list to fix this), and at the end calls a flush method to draw it all on the screen. So basically, I'd like to reuse the same ScreenPainter to accumulate all of these draws and only flush them at the end when flush is called.

Rewriting this to have implicit flush when ScreenPainter goes out of scope pretty much will require rewriting the entire code. :-D Likely not a bad thing but I was hoping to just get things working with a few shunts here and there, not rewrite everything from scratch.

adamdruppe commented 3 years ago

Well maybe some magic with .destroy(*painter) ... idk. if you're just printing chars you could just accumulate them in a buffer then just draw when the flush is called. buffering the strings will give better reslts anyway than trying to draw piecemeal

quickfur commented 3 years ago

yeah that makes sense... the old code actually already uses a buffer... but it spools the buffer to the backend char-by-char, which is kinda stupid. I should short-circuit that. :-D

I did try it with .destroy(*painter) but it was messy. But might reach for that as a last resort, if all else fails.

quickfur commented 3 years ago

After the Fiber stack size tweak, this issue seems to have been fixed now.

Actually, there were other issues with introducing another Fiber, so I switched to threads instead. It's actually possible to cache painters and reuse them between GUI API calls, flushing when the flush method is called, so that part wasn't the real problem, it was something else in my code that I messed up.

So all in all, this issue can be closed.