pharo-project / pharo

Pharo is a dynamic reflective pure object-oriented language supporting live programming inspired by Smalltalk.
http://pharo.org

MacOS Monterey: On startup, image is unresponsive #10981

Closed tinchodias closed 1 year ago

tinchodias commented 2 years ago

Bug description

The image doesn't react to left clicks and the window title bar is grayed out, as if the OS window didn't grab focus. This is fixed when you switch to another MacOS window and come back to Pharo.

According to chats, this only happens in MacOS Monterey.

To Reproduce

Version information:

tinchodias commented 2 years ago

A solution / workaround was to add a 20ms wait before the return statement in OSSDL2Driver>>#createWindowWithAttributes:osWindow:
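A minimal sketch of such a wait, using Pharo's standard Delay class. The thread does not show the exact line that was added to createWindowWithAttributes:osWindow:, so this only demonstrates the wait itself in a Playground; the variables before and elapsed are illustrative.

```smalltalk
| before elapsed |
"Assumption: the workaround boils down to a plain Delay wait like this one."
before := Time millisecondClockValue.
(Delay forMilliseconds: 20) wait.   "suspends the active process for about 20ms"
elapsed := Time millisecondClockValue - before.
elapsed
```

While the active process is suspended in the wait, other ready processes get a chance to run.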

tinchodias commented 2 years ago

I don't have a very good explanation yet for the delay, but I observed that

tesonep commented 2 years ago

@tinchodias Do you know which are the 5 events?

tinchodias commented 2 years ago

@tesonep

They are:

a SDL_CommonEvent type: 4352
a SDL_CommonEvent type: 4352
a SDL_CommonEvent type: 4352
a SDL_CommonEvent type: 4352
a SDL_WindowEvent type: 512 windowID: 1
a SDL_WindowEvent type: 512 windowID: 1
a SDL_WindowEvent type: 512 windowID: 1

Logged by inserting this code:

    [
        | event |
        event := SDL_Event new.
        "Drain the SDL2 event queue, printing each mapped event to stdout"
        [ (SDL2 pollEvent: event) > 0 ] whileTrue: [
            Stdio stdout print: event mapped; lf.
        ]
    ] value.

in line 40 of OSSDL2Driver>>#createWindowWithAttributes:osWindow:

tinchodias commented 2 years ago

The "mapped event" is then visited and, in general, converted to an OSWindow event and delivered. I don't know what these SDL_CommonEvent and SDL_WindowEvent are converted to.

tinchodias commented 2 years ago

Browsing the code a bit more, I realized that when the "mapped event" doesn't have a windowID, it is forwarded to OSSDL2Driver>>#sendEventWithoutWindow:, which visits and converts it again. That converted event can be nil, and then it's ignored. This is what happens with the first 4 polled events. To check it, I extended my print to:

    [ 
        | event |
        event := SDL_Event new. 
        [ (SDL2 pollEvent: event) > 0 ] whileTrue: [
            | mappedEvent convertedEvent |
            mappedEvent := event mapped.
            mappedEvent windowID ifNil:[
                convertedEvent := mappedEvent accept: OSSDL2Driver current ].

            Stdio stdout
                print: mappedEvent;
                nextPutAll: ';';
                print: mappedEvent isUserInterrupt;
                nextPutAll: ';';
                print: mappedEvent isUserInterruptKillAll;
                nextPutAll: ';';
                print: mappedEvent windowID;
                nextPutAll: ';';
                print: convertedEvent;
                nextPutAll: ';';
                lf.
        ]
    ] value.

And get:

a SDL_CommonEvent type: 4352;false;false;nil;nil;
a SDL_CommonEvent type: 4352;false;false;nil;nil;
a SDL_CommonEvent type: 4352;false;false;nil;nil;
a SDL_CommonEvent type: 4352;false;false;nil;nil;
a SDL_WindowEvent type: 512 windowID: 1;false;false;1;nil;
a SDL_WindowEvent type: 512 windowID: 1;false;false;1;nil;
a SDL_WindowEvent type: 512 windowID: 1;false;false;1;nil;

tinchodias commented 2 years ago

Searching for "SDL_Event 4352", I found a Stack Overflow question from 2019, where the developer fixed the problem by calling a custom "flush_event_queue" function after window creation.

https://stackoverflow.com/questions/57051494/sdl-setwindowsize-resizes-window-but-sdl-getwindowsize-reports-old-size-in-u

tinchodias commented 2 years ago

If I understand correctly, 4352 in hex is 16r1100 (4352 hex >>> '16r1100'), so these 4 mysterious events are:

    SDL_AUDIODEVICEADDED = 0x1100, /**< A new audio device is available */

source: https://github.com/libsdl-org/SDL/blob/main/include/SDL_events.h#L154

No idea why we would receive 4 of these, why that could cause an issue with focus, or why polling them early would fix the issue.
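The conversion can be checked in a Playground. This is only a sketch of the arithmetic; eventType and windowEventType are illustrative names, and the 512 value is the type printed for the SDL_WindowEvent entries above (16r200, which SDL_events.h defines as SDL_WINDOWEVENT).

```smalltalk
| eventType windowEventType |
eventType := 4352.
windowEventType := 512.
eventType = 16r1100.               "true: matches SDL_AUDIODEVICEADDED = 0x1100"
windowEventType = 16r200.          "true: matches SDL_WINDOWEVENT = 0x200"
eventType printStringBase: 16      "'1100'"
```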

tinchodias commented 2 years ago

I didn't know the priority of the process that creates the SDL window: it's 79. The SDL2 loop has 60. (Printed with Stdio stdout print: Processor activeProcess priority; lf.)

As I understand it, the 20ms wait in the priority-79 process at startup gives way to the SDL2 loop, which consumes the 0x1100 events without doing anything (they are ignored), so that is not problematic.
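The scheduling effect described here can be sketched in isolation. This is an analogy only, not the actual startup code: it forks a process one priority level below the active one (standing in for the priority-60 SDL2 loop below the priority-79 startup process) and shows that it only gets to run once the higher-priority process waits. The log collection and its symbols are illustrative.

```smalltalk
| log |
log := OrderedCollection new.
"The forked process is ready but cannot run while the active process keeps the CPU."
[ log add: #lowPriorityRan ] forkAt: Processor activeProcess priority - 1.
log add: #beforeWait.
"Waiting suspends the active process, so the lower-priority one is scheduled."
(Delay forMilliseconds: 20) wait.
log add: #afterWait.
log
```

In a Playground, log would typically end up with the lower-priority entry between the other two, since nothing else lets that process run before the wait.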

Just to mention it, @tesonep, no idea if important info.

BTW, I wonder if Windows and Linux also get these events.

MarcusDenker commented 2 years ago

What is the plan here, do we add the workaround as a first step?

akevalion commented 2 years ago

Hello, I have macOS Monterey, and I do not have that problem.

MarcusDenker commented 2 years ago

It happens for me sometimes. I just do "CMD-Tab" and I can work... but for someone who does not know the trick, it makes the system unusable.

tinchodias commented 2 years ago

Okay, I'm preparing a PR

yannij commented 2 years ago

After upgrading to MacOS 12.1, I started to see the problem. I use the same PharoVM and Pharo9 image before and after the OS upgrade (Intel CPU). I've just read the workaround (to shift focus to another window, then back to Pharo), and it worked for me just now. My previous workaround, which worked 98% of the time (except today, when I used the newly discovered workaround), was to let the Pharo image open on the laptop screen, then click on a browser window. After that it is safe to move the Pharo window to an HDMI-connected monitor. I used to move the window immediately to the HDMI monitor, and mouse clicks would fail to be recognized more than half the time.

tinchodias commented 2 years ago

@yannij Have you tried the 20ms wait I proposed in the PR? @MarcusDenker I created the PR. @akevalion If you want to try the PR anyway, it can help to confirm it doesn't hurt!

yannij commented 2 years ago

The 20ms wait worked. I was able to drag the window to the HDMI monitor even before the first redraw, and the click worked fine. I'd assumed that the workaround was a VM change. Being able to change this in the image is good.

tesonep commented 2 years ago

I have been checking, and those events, as Martin said, are sent when a new audio device is connected. In Monterey, it looks like the OS is sending those events when the app is launched. I think the wait could be a good workaround.

yannij commented 2 years ago

I started my image today, and the clicks were being lost again. That's about 1 bad start in 4 image restarts. I switched to another OS window and back to regain the click function.

tinchodias commented 2 years ago

@yannij thank you for the report. Maybe some milliseconds more were needed?

tinchodias commented 1 year ago

I suffer this problem again, maybe 100% of the time. I'll pay more attention to check. Anybody else? I'm on MacOS Ventura, MacBook Pro 2018.

tinchodias commented 1 year ago

It seems that raising the workaround wait from 20ms to 30ms already makes it disappear...

tinchodias commented 1 year ago

One year after my report, I can reproduce this bug with the same methodology. Mac Ventura 13.1.

BUT! This time I have another solution to propose, slightly better... at least without any magic number of milliseconds hardcoded: in OSWorldRenderer>>#doActivate, do not tell the window to focus immediately, but in a forked process ([osWindow focus] fork.).

I tried, without success, to find the real cause of the problem: I've improved event logging and browsed the SDL2 codebase and forums. I should try to reproduce it minimally with a program in plain C, and then report it as an SDL2 bug.
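To illustrate why forking defers the call, here is a sketch that does not touch OSWorldRenderer or the real window. A block forked at the same priority is only queued, so the code after the fork runs first and the forked block runs once the current process yields or waits. The log collection and its symbols are illustrative stand-ins for the focus request and the rest of the activation.

```smalltalk
| log |
log := OrderedCollection new.
"Stand-in for [ osWindow focus ] fork: the block is queued, not run immediately."
[ log add: #focusRequested ] fork.
"Stand-in for whatever the activating process does after requesting focus."
log add: #restOfActivation.
"Give same-priority processes, including the forked one, a chance to run."
Processor yield.
log
```

The first entry recorded is the one from the code after the fork, which is the reordering that the proposal relies on.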

tinchodias commented 1 year ago

When I log all SDL2 events polled from the main loop, the image starts with:

a SDL_WindowEventID(#SDL_WINDOWEVENT_MOVED)
a SDL_WindowEventID(#SDL_WINDOWEVENT_SHOWN)
a SDL_WindowEventID(#SDL_WINDOWEVENT_EXPOSED)
a SDL_WindowEventID(#SDL_WINDOWEVENT_FOCUS_GAINED)    <-- 
a SDL_WindowEventID(#SDL_WINDOWEVENT_ENTER)

But SDL_WINDOWEVENT_FOCUS_GAINED doesn't appear when the image starts unresponsive (that is, when you move the mouse during startup).

Second, when the image starts OK and you move the cursor to another window and click, you see in the log:

SDL_WindowEvent(#SDL_WINDOWEVENT_LEAVE)
SDL_WindowEvent(#SDL_WINDOWEVENT_FOCUS_LOST)

But when the image starts unresponsive, you don't see SDL_WINDOWEVENT_FOCUS_LOST... it's as if it never gained the focus.

That's why I tried postponing the request to grab the focus, and it worked, on my machine at least.

I can create a PR tomorrow.

tinchodias commented 1 year ago

I'm preparing the PR.

BTW, this new process to focus is forked during startup, when UIManagerSessionHandler>>#startup: sends a MorphicUIManager as argument to UIManager class>>default:, which activates it. Maybe some expert on the startup process has a better idea of how to postpone the focus.

tinchodias commented 1 year ago

For the record, the closest SDL2 issue I've found is this one, for X11:

2013 - bug report: https://stackoverflow.com/questions/18234136/focusin-focusout-not-generated
2014 - bug report: https://stackoverflow.com/questions/26863470/sdl2-input-focus?rq=1
2015 - fix: https://github.com/libsdl-org/SDL/commit/f001a00b081e864fd2e7eef1e6f7313b9db916e5

tinchodias commented 1 year ago

Now I made another guess that's even simpler, and it fixed it on my machine: move the osWindow focus expression to the end of OSWorldRenderer>>#doActivate (no need to fork...). I'll update the PR so you can check.

BTW, osWindow focus is just an FFI call to SDL_RaiseWindow.