floooh / sokol

minimal cross-platform standalone C headers
https://floooh.github.io/sokol-html5
zlib License
6.82k stars 475 forks source link

emscripten main loop hack #843

Closed voidware closed 6 months ago

voidware commented 1 year ago

Hi,

Just got the latest code. A few minor changes needed here, no problems. Seems working fine. thanks.

So;

I'd really like to eliminate my Sokol hacks in sokol_app.h so to use the official codebase. Firstly, thanks for fixing the emsc_event->code[0] keyboard problem. One less hack of mine needed!

This is an emscripten build and I thought I'd try to eliminate another hack. I've had this hack in my code for over a year and it works fine, but the official code just produces a blank screen with no errors.

_SOKOL_PRIVATE void _sapp_emsc_run(const sapp_desc* desc) {
...

    /* start the frame loop */

    // this is the sokol version 
    ///emscripten_request_animation_frame_loop(_sapp_emsc_frame, 0);

    // this is my version
    emscripten_set_main_loop(_sapp_emsc_frame_void, 0, true);
}

and;

_SOKOL_PRIVATE void _sapp_emsc_frame_void()
{
    // XX HACK!
    _sapp_emsc_frame(0, 0);
}

I'm at a loss to explain why the official main loop doesn't work and mine does because i can't see anything wrong with it.

Any ideas? Thanks.

floooh commented 1 year ago

Don't know tbh, I just tried the reverse by putting your code into sokol_app.h and that worked (even though it passes zero to the time parameter of the _sapp_emsc_frame() function (which I would expect to cause problems, e.g. divisions by zero).

Is your _sapp_emsc_frame() function otherwise identical? I would check next to put some printf() into the frame callback to check whether this is actually called for each frame.

PS: compiling and linking also works without problems? At first I had linker problems when adding emscripten_set_main_loop because of this problem in emsdk 3.1.38: https://github.com/emscripten-core/emscripten/issues/19363, I worked around by removing the NO_FILESYSTEM linker option.

voidware commented 1 year ago

Hi. Thanks for your reply.

Standard Sokol emsc flow, i get one frame called and that's it.

Here's what happens;

int main(int argc, char* argv[]) {
    Sp_desc desc = sokol_main(argc, argv);
    _sapp_emsc_run(&desc);
    return 0;
}

then;

_SOKOL_PRIVATE void _sapp_emsc_run(const sapp_desc* desc) {
...               
    emscripten_request_animation_frame_loop(_sapp_emsc_frame, 0);
}

This renders a single frame, and then returns from _sapp_emsc_run which returns to main, which return.

Then everything stops!

However

Doing this;

_SOKOL_PRIVATE void _sapp_emsc_run(const sapp_desc* desc) {
...               
    emscripten_set_main_loop(_sapp_emsc_frame_void, 0, true);
}

Does not return! and does not return to main.

So;

Is it something i'm doing that means returning to main halts everything (although it doesn't crash). or should there be something else stopping return to main ?

Also, FYI, I do have (and use) a filesystem which means i've removed the NO_FILESYSTEM link option.

floooh commented 1 year ago

Does your _sapp_emsc_frame() return EM_TRUE? E.g. it should look like this:

_SOKOL_PRIVATE EM_BOOL _sapp_emsc_frame(double time, void* userData) {
    // ....
    return EM_TRUE;
}

Looking at git-blame, the old frame function which used emscripten_set_main_loop didn't have a return value, but now with the request-animation-frame version, the return value indicates whether the frame loop should continue or not.

For instance see here: if _sapp.quit_ordered is set, the function will return EM_FALSE, and the loop stops:

https://github.com/floooh/sokol/blob/177e2c79bee74acebaeeb8d5b29f7f168858ff78/sokol_app.h#L5804-L5810

(...so the other thing to check would be if _sapp.quit_ordered or before that _sapp.quit_requested is set already in the first frame for some reason.

voidware commented 1 year ago

Thanks for the suggestion.

I tried that, but it wasn't the problem. I commented the whole thing out in sokol_app.h. No change

Then;

I had a go at building the sokol-sample html5/imgui-emsc.cc using the same build options I'm using for my app.

IMGDIR=b:/tools/imgui
SOKDIR=b:/tools/sokol
OPTS="-std=c++17 -Os -DNDEBUG"
EMS="-s USE_WEBGL2=1"
EMS+=" -s ASYNCIFY=1"
EMS+=" -s ALLOW_MEMORY_GROWTH=1"
LDFLAGS=" -lidbfs.js"
em++ imgui.cpp $1.cc -o web/index.html -I$IMGDIR -I$IMGDIR/misc/single_file -I$SOKDIR/util $OPTS -DSOKOL_IMPL -DSOKOL_GLES3 -I$SOKDIR $EMS -fno-rtti -fno-exceptions $LDFLAGS
cp shell-async.html web/index.html

I thought perhaps stuff like ASYNCIFY might be connected.

The imgui-emsc app works fine (as you'd expect).

So then i went in and changed the code of imgui-emsc to use sokol_app, adding sokol_main and so on.

Also works fine!

As you see from the script above, I'm pointing it to the same imgui and sokol I'm using for my app (any hacks included).

This is really weird; build my app and blank screen, build imgui-emscc, works fine!

Put my hack, emscripten_set_main_loop(_sapp_emsc_frame_void, 0, true) into sokol_app.h - both work fine!

Now I'm stumped.

floooh commented 1 year ago

@voidware any new info on this? Can the ticket be closed?

guillaumeblanc commented 11 months ago

Hi,

I'm glad to find an existing discussion on the topic, cos I'm also struggling to make sense of what I'm experimenting. I'll share my findings.

Using saap with SOKOL_NO_ENTRY, I noticed that saap_run returns (ie is not blocking) even before init cb is called. From my understanding, emscripten_request_animation_frame_loop is not a blocking function. It registers the emsc frame callback, and returns. The frame cb is called by the browser, later.

I think it works with sokol examples, because of the -NO_EXIT_RUNTIME=1 linker option, which means the program will continue to execute even after main (and saap_run) has exited.

emscripten_set_main_loopwith simulate_infinite_loop at true fixes the issue, which completely makes sense.

I couldn't make NO_EXIT_RUNTIME work, but that's another discussion. I'm not really happy with NO_EXIT_RUNTIME anyway, as I have objects on the stack (thanks to SOKOL_NO_ENTRY) I need to keep alive.

I'll test emscripten_set_main_loop, thanks for the tips.

Hope it helps, Guillaume

floooh commented 11 months ago

Hmm, looking at https://github.com/emscripten-core/emscripten/blob/c65a321b603de497aa0168f542a88e95e88264a4/src/settings.js#L83-L93, NO_EXIT_RUNTIME=1 is actually the default setting (because it's derived from EXIT_RUNTIME setting, the NO_ is just the inversion. I think I'll go ahead and just remove the NO_EXIT_RUNTIME setting from fips, since it is redundant.

I can confirm that the sokol samples have trouble when the exit runtime is enabled though (most likely because the samples keep state in global variables and this might be zeroed out when main() returns and the exit runtime runs.

The behaviour that the main() function returns immediately is entirely normal on Emscripten, otherwise the browser event loop would appear to be stuck and the tab would be killed after a few seconds (unless tricks like ASYNCIFY are used), and from that perspective it's also correct that the exit runtime is disabled by default, because global state must stay alive after main() returns.

Also any actions on exit don't make a lot of sense anyway when running in the browser, because the user can close the tab at any time, and this will just stop the application without it having a chance to run any "exit code" (there's a beforeunload event, but this isn't reliable (see https://developer.mozilla.org/en-US/docs/Web/API/Window/beforeunload_event#usage_notes) - it's only useful for the "Leave site?" popup to prevent unsaved data loss.

guillaumeblanc commented 11 months ago

Hi,

Removing NO_EXIT_RUNTIME (as it's the default behavior) is a good idea indeed.

I'm starting understanding my main blocker: -sASSERTIONS=1 seems to force EXIT_RUNTIME (at least throw assertions when crt functions are called after exit). For now I need to force -sASSERTIONS=0 even in debug.

The behaviour that the main() function returns immediately is entirely normal on Emscripten

This is not what I was expecting after reading this comment about sapp_run : https://github.com/floooh/sokol/blob/master/sokol_app.h#L970

Also any actions on exit don't make a lot of sense anyway

The code executed after sapp_run is actually destructors from objects on the stack, which works ok on other platforms.

Cheers, Guillaume

floooh commented 11 months ago

This is not what I was expecting after reading this comment about sapp_run :

Ooops, yeah that comment is most likely a leftover from when sokol_app.h was using emscripten_set_main_loop() with the simulate_infinite_loop arg. This is throwing a JS exception in order to exit the emscripten_set_main_loop() function so that execution is passed back to the browser, but without crawling back up the C call stack.

which works ok on other platforms.

Be aware though that on iOS, it also isn't guaranteed that exit code is called. There are situations where the OS will simply terminate the application (even without calling the iOS specific applicationWillTerminate method).

floooh commented 11 months ago

@guillaumeblanc I have clarified sapp_run() behaviour in that outdated comment section:

https://github.com/floooh/sokol/blob/d10614a5a25ba934b4cc3f4b1e43498931df89a4/sokol_app.h#L952-L993

...I also wonder if it would make sense to add a config-boolean to sapp_desc to select between emscripten_request_animation_frame_loop() and emscripten_set_main_loop() with the simulate_infinite_loop enabled, the latter would prevent any code being executed after sapp_run() (and thus no destructors or exit runtime being called).

guillaumeblanc commented 11 months ago

Hi,

Fixing the documentation was the most important part. It's pretty clear now.

Why do you say that "An application also should not depend on the cleanup-callback being called" ? I didn't experience that. It becomes pretty difficult to implement a portable solution if it's the case.

Using emscripten_set_main_loop would make emscripten behavior closer to other platforms, with 2 code paths instead of 3. Why did you switch from emscripten_set_main_loop to emscripten_request_animation_frame_loop ?

Thanks a lot !! Guillaume

floooh commented 11 months ago

An application also should not depend on the cleanup-callback being called

There's (at least) two platforms where this can happen:

In general, you should put your cleanup code into the sokol_app.h cleanup callback, because that is the most portable solution. sokol_app.h just cannot guarantee that the cleanup function will be called on all platforms (generally in cases where the 'system' simply terminates the application, without giving the application an opportunity to react).

As for why emscripten_request_animation_loop: IIRC the main reason was because the frame callback of that function gives me the timestamp of the current frame which I'm directly using as input to the frame duration computation.

(note that emscripten_set_main_loop() doesn't change anything about cleanup code, with the simulate_infinite_loop param it will simply never call any code following the call - its only purpose is to avoid that any cleanup or exit-runtime code is accidentally called at startup).

guillaumeblanc commented 11 months ago

All clear.

I agree that cleanup cb is the most portable solution, I switched to it.

If you ask me, I'd switch to emscripten_set_main_loop without adding a new option. It would make sapp_run behavior closer to other platforms, the documentation simpler and the api less error-prone. You can get timestamp with performance.now() I suppose.

Hope it helps. Thanks a lot ! Guillaume

floooh commented 11 months ago

My hope with getting the timestamp from the requestAnimationFrame() callback is that it contains less jitter then calling performance.now() at the start of my own callback. Ideally, the rAF timestamp would be the time of when the last frame was presented, similar to native swapchain APIs, but that may be a little much to hope for.

Also, there this an interesting property of the timestamp that even though it has limited precision on some browsers (e.g. on Safari it's only 1ms granularity), taking the average over a couple of frames gives an exact result (e.g. 16.667ms on 60Hz displays or 8.333ms on 120Hz displays), I don't know if this is an intended feature of the RAF timestamp, or just the regular sideeffect of filtering a noisy random signal though (but since Safari only returns full milliseconds, e.g. 16.0 or 17.0, but simply rounding a jittery signal would always tend towards 17.0, I'm thinking it's intentional.

Dvad commented 6 months ago

My experience on this topic:

I am using the PROXY_TO_PTHREAD option which is usually recommended by emscripten: https://emscripten.org/docs/porting/pthreads.html#blocking-on-the-main-browser-thread Does anybody have any experience with this option?

I can not make the current main loop setup (using emscripten_request_animation_frame_loop) work with that option on. Whatever combination of EXIT_RUNTIME and other option I use. Switching to emscripten_set_main_loop() works well.

To add to the discussion about the timestamp. From: https://emscripten.org/docs/api_reference/html5.h.html#c.emscripten_request_animation_frame_loop it is specified that the time passed to the callback is actually performance.now() called just before calling the callback. Does that imply there should not be much more jitter with the set_main_loop + performance.now() approach?

On the other end MDN states hand wavily:

The timestamp value is also similar to calling performance.now() at the start of the callback function, but it is never the same value.

I wonder if the jitter is something that could be measured.

floooh commented 6 months ago

My current preferred solution would be to add a new config bool to sapp_desc called html5_use_emscripten_set_main_loop which would then switch to a different main loop code path, but keep the default behaviour on emscripten_request_animation_frame(). I can try to squeeze this in after the big sokol-gfx render pass cleanup is merged, or if anybody wants to pick this up as PR I'd be fine with that too.

Here's a different PR for reference (which is missing the selection between rAF and set_main_loop though):

https://github.com/floooh/sokol/pull/915

voidware commented 6 months ago

option on sapp_desc;

Yes please. That would be splendid. Thanks.

Dvad commented 6 months ago

What about something like this? https://github.com/floooh/sokol/pull/997

Edit: Oops, I just realized you were talking about a runtime bool and not compiled time option as in https://github.com/floooh/sokol/pull/997. I wonder if the ability to choose at runtime option is useful though. Specially since some of the other compile time options can make the whole app freeze if not using set_main_loop .

floooh commented 6 months ago

@Dvad I would really prefer a runtime bool because it keeps the number of builds down (e.g. I can add a sample which uses set-main-loop method without having to mess around in the build system and/or build a separate sokol library).

floooh commented 6 months ago

Ok, PR https://github.com/floooh/sokol/pull/997 has been merged. I also added a separate config option for the simulate_infinite_loop arg to sapp_desc (not sure if it's useful but in any case it doesn't hurt).

voidware commented 6 months ago

Thanks for this. I can report everything is working here spiffingly.

Also the other breaking changes i needed to fix. Thanks for the excellent docs on migration. All good.