Closed Drakulix closed 3 months ago
And of course just as I am about to sign off, I notice that xdg-popups (and it looks like override-redirect X11 window as well) at least don't render correctly. Another thing for the todo list for tomorrow.
Git history cleaned up as good as possible.
Almost all deadlocks so far are in parts of the code that try to lock the shell and also either access the pointer or keyboard handles, which use locks internally. Conceptionally seats are part of the shell, but since they need the full state accessible for grabs, they are frequently cloned out of the shell, which can lead to cases, where both locks are taken in a different order by a surface thread and the main thread, locking both indefinitely (and thus also all other surface-threads).
Even without a change like this, Smithay has had several issues with deadlocks in the KeyboardHandle
. So some improvement to how locks are used there could definitely be welcome.
Using cosmic-workspaces on this branch seems to result in an unresponsive desktop other than cursor movement (can't interact with or close cosmic-workspaces, and opening a terminal doesn't show any change either).
Using cosmic-workspaces on this branch seems to result in an unresponsive desktop other than cursor movement (can't interact with or close cosmic-workspaces, and opening a terminal doesn't show any change either).
Should be fixed by latest commit.
I've been trying this for a couple hours and I have found it to crash a fair amount. I did provide logs but Im not sure they show anything
ran cosmic-comp > /tmp/cosmic-comp-stdout.txt 2> /tmp/cosmic-comp-stderr.txt
directly from cli and these are the logs from it
A) first issues i've been having are crashes when using gamescope to play games, cosmic-comp-stdout.txt cosmic-comp-stderr.txt
B) cosmic-comp will sometimes fail to load outright, causing a locked up black screen that requires the user to turn the keyboard into RAW
mode using sysrq, then swap tty, kill the server, and try again, no logs get spat out (and it doesnt really happen when redirecting the output)
C) Displays would sometimes act extremely strange, wrong resolution and location cosmic-comp-stdout.txt cosmic-comp-stderr.txt
D) crash when using wlr to toggle second display off and on cosmic-comp-stdout.txt cosmic-comp-stderr.txt
Appreciating the detailed feedback, but no the logs aren't super helpful.
For one, for these experimental branches it is worth enabling debug logging, by increasing the default log level to debug: https://github.com/pop-os/cosmic-comp/blob/master/Cargo.toml#L49 (release_max_level_debug
).
And to capture crashes you also want to set RUST_BACKTRACE=1
.
(You probably also want to do debug builds to get better backtraces, though that might impact performance quite a bit. As a workaround we have a fastdebug
profile, so building with --profile fastdebug
is also an option.)
A) first issues i've been having are crashes when using gamescope to play games
Looking into it.
B) cosmic-comp will sometimes fail to load outright, causing a locked up black screen that requires the user to turn the keyboard into
RAW
mode using sysrq, then swap tty, kill the server, and try again, no logs get spat out (and it doesnt really happen when redirecting the output)
I am testing this branch on my dev system for a couple of days and have definitely noticed this. In my testing it happened more consistently, if I had multiple monitors attached during launch, can you confirm this?
C) Displays would sometimes act extremely strange, wrong resolution and location
Could you elaborate this more with some examples? It might also be worth while to upload your output-configuration from .local/state/cosmic-comp/outputs.ron
and compare the current state with the config file and what you would expect.
D) crash when using wlr to toggle second display off and on.
Great, this should be easy to reproduce, looking into it.
Also I am pretty sure there is a memory-leak somewhere causing crashes after just running this branch for a while.
can't recompile atm but will when I can
when using gamescope getting the crashes only occurred when actually resizing gamescope I did go and verify that it was indeed stable enough when just playing but repeatedly resizing the window (I was able to replicate it using multiple wev windows for tiling to occur) I was playing genshin impact when it occured, I will try other applications when I get the chance.
as for the strange resolution issues, I was actually able to replicate it by just running cosmic-comp
on old versions too, they stopped occurring when I used cosmic-settings to reset the resolution, the error never occurred when using cosmic-session to start so I was mistaken about the cause.
Testing this branch on one computer, I noticed drag and drop surfaces not showing (in cosmic-workspaces, nautilus, etc.).
Testing this branch on one computer, I noticed drag and drop surfaces not showing (in cosmic-workspaces, nautilus, etc.).
fixed.
Looking at 845e376d3df982032ded363f1140f0470663c6aa, the way insert
/insert_if_missing
doesn't make data visible in other threads does seem error prone (not that that's a new issue)...
I guess we're still looking for a better way to manage data like that in Smithay and wayland-rs.
With the 9fc1372 build, I was seeing issues when interacting with tabs in Firefox (from https://github.com/pop-os/packaging-firefox/commit/5a2511492b335bfce0b827104e061f5d88027201).
The behaviour seen was as follows:
The behaviour seemed to be triggered at anytime tabs were interacted with: switching tabs, closing tabs, looking through all the tabs in the drop-down that appears when tabs over flow.
In terms of hardware, I'm using a Pang12. The system has Ryzen 7 6800U with Radeon 680M integrated graphics. The laptop did not have any external displays, and was running cosmic-term, virtual machine manager (without any active VMs), and slack alongside Firefox. I am using tiling, and I am using stacks.
With da1f597, attempting to switch to a virtual console and back to a running cosmic-comp session, I am met with a black screen. cosmic-comp itself is still running. This compares with 254c583 where I can successfully cycle through virtual consoles and return to the composited session without issue.
Here is an unreasonable large PR, that really got out of hand.
Depends on https://github.com/Smithay/smithay/pull/1393
The main purpose of all these changes is to render each output on a separate thread. However it does a lot more (list in no particular order or claim of completeness):
This does run unreasonably stable for the amount of testing done, but is absolutely not recommended for production. I think I catched the most egregious deadlocks, but expect crashes and deadlocks. For some reason Xwayland seems to have even more problems with this build in my testing.
Almost all deadlocks so far are in parts of the code that try to lock the
shell
and also either access the pointer or keyboard handles, which use locks internally. Conceptionally seats are part of the shell, but since they need the full state accessible for grabs, they are frequently cloned out of the shell, which can lead to cases, where both locks are taken in a different order by a surface thread and the main thread, locking both indefinitely (and thus also all other surface-threads).While I have been able to work around most of these cases with relative ease, I am not convinced yet, that this is a problem worth ignoring. I made attempts to fix this in smithay (and be able to pass a subtype of
State
to the input-logic), but this spawned a new discussion on structuring wayland-rs. If this problem remains it might introduce unreasonable debugging-churn for development, as deadlocks might be frequently accidentally introduced while developing. I don't know how we can figure this out without just giving it a shot.Another solution to bypass this problem rather than fixing this in smithay would be to keep generation of
Elements
for rendering in the main thread (and still drive the whole scheduling logic from the main thread) and then pass those off to a surface-specific render+commit thread. Element-generation is a major contributor to time spend on a single frame, so this doesn't really give us the same benefits to be able to quite easily drive multiple high-refresh rate outputs. On the other side it would make it much more realistic to make the render+commit-thread realtime in the future.Before merge this needs a better git history for reviewing (which I'll attempt tomorrow) and a lot more testing.