Closed elmarco closed 3 years ago
I think I've got a theory of how this could possibly happen: it's a reentrancy issue.
do_set_selection()
calls wl_display_roundtrip()
right before destroying the popup surface:
That will, of course, cause pending events to be dispatched. If one of those events is another enter event for the same keyboard, do_set_selection()
will get invoked again, set the selection again, destroy the surface, and exit. Then the outer do_set_selection()
resumes only to find that the popup surface has disappeared from under it.
The fix for that would be to clear self->popup_surface->on_focus
before the roundtrip.
Is there any way we would be able to learn whether that indeed fixes the crashes? At the retrace.fedoraproject.org page you've linked, it says there were not occurrences of this reported since 2021-03-25?
I went ahead and pushed a two-part fix. The second part is tweaking the check to if (self->popup_surface != NULL)
, which is imperfect because it silences the issue (I'd much rather learn about unexpected scenarios like this one from bug reports and fix them), but at least it guarantees this crash won't happen for the users whether it was caused by this reentrancy issue or not.
No way to check if this indeed fixes it but to upload a new version, as we don't have a reproducer. I am not sure why retrace.fedoraproject.org says the last occurence is 03-25, since the weekly/monthly charts shows more recent occurences (although none for the past 2 months or so). Are you planning a release? Otherwise, I can backport the patches and submit to Fedora.
I am, in fact, planning a 2.1 release, since there are enough fixes and improvements for a minor release. But I want to figure out the wl-clipboard-x11 story before that. So that may take a few days or few months depending on how much I procrastinate and how relatively interesting other projects are 🙂
As for whether you should ship the fix (or really, the current master branch) in Fedora without waiting for an official release, it's up to you and Fedora policies. But I don't expect there to be any breakage since the changes from 2.0 were all very small, and it would be good to get some more pre-release testing this way.
On the other hand, you'd do duplicate work if the release actually happens soon, and it may not be that great to do pre-release testing on unsuspecting users 🙂 So, again, up to you & Fedora policies.
There are regular report of wl-copy crashes in Fedora:
https://retrace.fedoraproject.org/faf/problems/bthash/?bth=a7b6a74e0bf37dfda7981e7a227426639637e171&bth=8cac417a30fc6dfa56d448f4342d75aa3cf46b2d
I suppose
popup_surface_destroy()
is being called with a non-initialized/NULLpopup_surface
https://bugzilla.redhat.com/show_bug.cgi?id=1910755