openfl / lime

A foundational Haxe framework for cross-platform development
https://lime.openfl.org/
MIT License
761 stars 375 forks source link

App stops responding when calling FileDialog.save() when targeting HashLink on Windows (ThreadPool vs BackgroundWorker) #1849

Closed joshtynjala closed 1 month ago

joshtynjala commented 1 month ago

Code to reproduce:

import lime.ui.FileDialog;

class Main {
    public static function main():Void {
        var dialog = new FileDialog();
        dialog.browse(SAVE);
    }
}

It doesn't reproduce in Lime 8.1. Similarly, if I change FileDialog to revert back to using BackgroundWorker instead of ThreadPool, the issue goes away. So it seems like an issue with ThreadPool in Lime 8.2.

It does not seem to reproduce when targeting HashLink on macOS. Just Windows (didn't test Linux, though).

Related commit by @player-03: openfl/lime@d4588434b9c9477f8c5c0cfd5f6ec8610761b31f

player-03 commented 1 month ago

Can't reproduce it on Linux either.

Let's try with simpler code and some trace statements. Does it still happen here, and if so, what's the last thing it traces?

import lime._internal.backend.native.NativeCFFI;
import lime.system.CFFI;
import lime.system.ThreadPool;

class Main {
    public static function main():Void {
        var worker = new ThreadPool();
        worker.maxThreads = 1;
        worker.onComplete.add(path -> trace(path));

        worker.run(function(_, worker) {
            trace("Showing first dialog");
            var result = CFFI.stringValue(NativeCFFI.lime_file_dialog_save_file(null, null, null));
            trace("First dialog done");
            worker.sendComplete(result);
        });

        worker.run(function(_, __) {
            trace("Showing second dialog");
            var result = CFFI.stringValue(NativeCFFI.lime_file_dialog_save_file(null, null, null));
            trace("Second dialog done");
            worker.sendComplete(result);
        });
    }
}
joshtynjala commented 1 month ago

@player-03 I see one trace:

src/Main.hx:19: Showing second dialog

Then, after maybe 5-10 seconds, the dialog stops responding. However, for short time, I can move the dialog and interact with it.

Interestingly, the other day, it stopped responding almost instantly. Today, with both my original code and your new code, it takes several seconds before it stops responding.

player-03 commented 1 month ago

I have no clue how it's getting to the second dialog before the first, but I'm pretty confident that's a red herring. Look at this:

https://github.com/openfl/lime/blob/5c8538efcb52d44fd6fc00803847ef70b9c9ab92/src/lime/system/BackgroundWorker.hx#L105-L117

I'd completely forgotten this, but the 8.1.0 version of BackgroundWorker only uses threads in C++ and Neko. That's probably why it's not freezing in HL.

Next step: add hl to all those #if (cpp || neko) statements, and see if it freezes then.

joshtynjala commented 1 month ago

Next step: add hl to all those #if (cpp || neko) statements, and see if it freezes then.

If I make FileDialog use BackgroundWorker and add || hl, it does indeed freeze.

player-03 commented 1 month ago

So this was hidden from us all along.

joshtynjala commented 1 month ago

We could use new ThreadPool(#if (windows && hl) SINGLE_THREADED #end) to patch this one specific case.

Since I'm hoping to do a release in the next week or two, this might be our best option for now. I'll give it a try, and if it works, I'll commit it.

Maybe calling UI code from background threads is a bad idea in general, and we should remove the references to ThreadPool.

I tried searching HashLink commits on GitHub that involve threads, and it seems that there are several thread-related fixes in HL 1.14. We're still bundling HL 1.12 in Lime. So I might try testing HL 1.14 to see if there's any difference. If we discover that HL 1.14 fixes it, at least we'll know that we can remove our workaround in the future.

joshtynjala commented 1 month ago

We could use new ThreadPool(#if (windows && hl) SINGLE_THREADED #end) to patch this one specific case.

Confirmed that this works as a workaround. Committed and pushed to develop.

So I might try testing HL 1.14 to see if there's any difference.

Unfortunately, HashLink 1.14 still stopped responding for me. It seemed like it may have taken a bit longer to happen, though.