emersion / xdg-desktop-portal-wlr

xdg-desktop-portal backend for wlroots
MIT License

After sharing a few times (~3) it no longer works (using systemd unit) #32

Closed cristobaltapia closed 4 years ago

cristobaltapia commented 4 years ago

Hi,

This is working pretty well in general, but the main problem is that after starting and stopping screen sharing a few times, it no longer works. I have tested this in Chromium and Firefox. I am using the systemd unit, where I have edited the ExecStart line as follows, so that it works with the browsers and uses the output I want:

ExecStart=/usr/lib/xdg-desktop-portal-wlr -p BGRx -o DP-1
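For anyone who would rather not edit the packaged unit file directly, the same ExecStart change can probably be made with a systemd user drop-in. This is only a minimal sketch: the function name write_xdpw_override and the drop-in directory are assumptions, while the unit name and command line are the ones that appear in this thread.

```shell
#!/bin/sh
# Sketch: write a systemd drop-in that overrides ExecStart for xdpw.
# write_xdpw_override is a hypothetical helper; its argument is the drop-in
# directory (assumed to be the per-user location shown in the usage note).
write_xdpw_override() {
    dropin_dir=$1
    mkdir -p "$dropin_dir" || return 1
    # The empty "ExecStart=" line clears the packaged command before the
    # replacement is set; without it systemd would see two commands.
    cat > "$dropin_dir/override.conf" <<'EOF'
[Service]
ExecStart=
ExecStart=/usr/lib/xdg-desktop-portal-wlr -p BGRx -o DP-1
EOF
}
```

A possible invocation would be write_xdpw_override "$HOME/.config/systemd/user/xdg-desktop-portal-wlr.service.d", followed by systemctl --user daemon-reload and a restart of the unit. A drop-in survives package updates, which a direct edit under /usr/lib does not.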

The status of xdg-desktop-portal.service shows the following error:

~$ systemctl status --user xdg-desktop-portal.service 
● xdg-desktop-portal.service - Portal service
     Loaded: loaded (/usr/lib/systemd/user/xdg-desktop-portal.service; static; vendor preset: enabled)
     Active: active (running) since Wed 2020-04-29 08:57:28 CEST; 2h 12min ago
   Main PID: 5898 (xdg-desktop-por)
     CGroup: /user.slice/user-1000.slice/user@1000.service/xdg-desktop-portal.service
             └─5898 /usr/lib/xdg-desktop-portal

Apr 29 08:57:28 tapia-laptop systemd[5544]: Starting Portal service...
Apr 29 08:57:28 tapia-laptop systemd[5544]: Started Portal service.
Apr 29 08:57:58 tapia-laptop xdg-desktop-por[5898]: Failed to get application states: GDBus.Error:org.freedesktop.portal.Error.Failed: Could not get window list

Could you tell me, how to further debug this?

Thanks!

danshick commented 4 years ago

If this is what others have reported, it is a known issue and I think it is an upstream bug in pipewire. We've just identified a backtrace in a child process of pipewire, and I'll work on sharing it upstream to find a resolution. For now, try running systemctl --user stop pipewire, then kill any active xdpw instances and try again. Pipewire will restart automatically.

Let me know if that fixes it. You may also have a crashed instance of pipewire-media-session if you run coredumpctl. I'd be curious to know if that's the case.
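The workaround above could be scripted roughly as follows. This is a sketch, not a tested recipe: the helper names run and reset_screencast_stack are made up for illustration, and using pkill -f to find xdpw instances is an assumption about how the process was started. By default it only prints the commands; set DRY_RUN=0 to actually run them.

```shell
#!/bin/sh
# Sketch of the workaround: stop pipewire, kill xdpw, look for core dumps.
# DRY_RUN=1 (the default here) prints each command instead of running it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

reset_screencast_stack() {
    # Stop pipewire; per the comment above it restarts automatically.
    run systemctl --user stop pipewire
    # Kill running xdpw instances (assumption: matching the full command
    # line with pkill -f finds them).
    run pkill -f xdg-desktop-portal-wlr
    # List recorded core dumps to see whether pipewire-media-session crashed.
    run coredumpctl list
}

reset_screencast_stack
```

If xdpw runs as a systemd user unit, stopping that unit instead of using pkill may be the cleaner choice.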

danshick commented 4 years ago

@wtay over in #pipewire took a look and thinks this might be a fix. For anyone experiencing this issue, if you're willing, can you pull this commit of pipewire and give it a shot? And please report back if this fixes the issue.

https://gitlab.freedesktop.org/pipewire/pipewire/-/commit/0380c6d91e87edf5102c6bada15b07f4f7befe4a

danshick commented 4 years ago

For completeness, I'm attaching a backtrace related to this issue. backtrace.log

danshick commented 4 years ago

@cristobaltapia FYI

Okay, I've now tested dozens of casts, starting and stopping, in parallel and serially. It seems rock solid now.

I think this is fixed, but I'll leave the issue open for a couple of days in case we have any more people experiencing issues. If you are running pipewire master and you get a pipewire-media-session segfault, please let us know and attach a backtrace to your comment if you can. Thanks!

Quick note for anyone wanting to test, pipewire-git in AUR is currently broken. Needed to update the _pick lines to:

--removed--

I've also commented on the AUR package, so it'll probably be fixed soon. Just read the PKGBUILD before you use it.

Edit: The PKGBUILD is now fixed.

cristobaltapia commented 4 years ago

@danshick "Rock solid" seems like an accurate description to me ;) I have tested this with the new pipewire patch and it is working perfectly. Right on time for a video-workshop I was planning to give next week at the university.

Thanks!

danshick commented 4 years ago

Arch has updated the official pipewire package to version 0.3.4, so you will no longer need to use the pipewire-git AUR package to use this fix on Arch.

I've not seen any more complaints about this bug, so I'm gonna close it. If you think you are still experiencing intermittent failures that require a pipewire restart, you have segfaults in pipewire-media-session, and you're using version 0.3.3 or newer, comment here and I'll reopen this.

cristobaltapia commented 4 years ago

Hi @danshick. Yesterday I experienced some problems when sharing my screen using Chromium. To be precise, I was using Webex Teams, and started it with the option --app=... so that it would look like a native app (since the "nativefied" version does not work because of the known problem with Electron). I had to restart pipewire to share the screen the first time. Then, after a while (over an hour), it stopped sharing the screen. I opened Firefox in another window to test if it would work there, and it did! And then it also worked in Webex Teams again (so I didn't have to manually restart pipewire).

I tried to reproduce this with chromium --app=https://mozilla.github.io/webrtc-landing/gum_test.html and at some point it failed to capture the screen (at least the preview did not work). Can you reproduce this as well?

wiktor-k commented 4 years ago

@cristobaltapia, are you sure you're using pipewire >= 0.3.4?
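A quick way to check a version string against 0.3.4 might look like the sketch below. The function name version_ge is hypothetical, and it relies on sort -V, which is a GNU coreutils extension, so treat that as an assumption about the system. The installed version has to be supplied by hand (for example from the package manager); 0.3.4 is used as a placeholder default.

```shell
#!/bin/sh
# Succeeds when version $1 is greater than or equal to version $2.
# "sort -V" compares version strings; "-C" checks silently whether the input
# is already in order, so success means $2 sorts before or equal to $1.
version_ge() {
    printf '%s\n%s\n' "$2" "$1" | sort -V -C
}

# Placeholder: pass the installed pipewire version as the first argument.
installed=${1:-0.3.4}
if version_ge "$installed" 0.3.4; then
    echo "pipewire $installed is at least 0.3.4"
else
    echo "pipewire $installed is older than 0.3.4"
fi
```

A plain string comparison would get cases like 0.3.10 versus 0.3.4 wrong, which is why a version-aware sort is used here.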

cristobaltapia commented 4 years ago

No, I went back to the git version, because pipewire 0.3.4 actually gave me a problem (which I forgot to report here due to time issues). But I haven't updated it since last week, after the patch mentioned above was merged. Correction: apparently I was not using the git version. I was using 0.3.4-1 from the arch repos.

cristobaltapia commented 4 years ago

Otherwise it really always works. That is why it was so strange yesterday.

danshick commented 4 years ago

That's what I use. 0.3.4 is new enough to have the fix for this issue. Might be something new. Do you see anything strange in your logs (journalctl) or any coredumps (coredumpctl) from interesting players (xdp, xdpw, pipewire)?

Also, which Firefox are you using and what version is it?
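To narrow a large journal down to crash reports from those players, a filter along these lines may help. It is a sketch under assumptions: filter_crash_lines is a made-up name, the keywords "segfault" and "dumped core" are a guess at what is worth looking for, and the two input lines are shortened samples rather than real log output.

```shell
#!/bin/sh
# Reads journal text on stdin and keeps only crash-looking lines that mention
# pipewire or the portal. "xdg-desktop-por" is the truncated process name
# used in the journal for both xdp and xdpw.
filter_crash_lines() {
    grep -E 'segfault|dumped core' | grep -E 'pipewire|xdg-desktop-por'
}

# Shortened sample input: one crash line and one ordinary warning.
printf '%s\n' \
    'kernel: xdg-desktop-por[123723]: segfault at 7f167f95d000 in libpipewire-0.3.so.0.304.0' \
    'pipewire[123442]: [W] prepare_packet() old version detected' \
    | filter_crash_lines
```

On a real system the input would come from something like journalctl -b --no-pager piped into filter_crash_lines.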

cristobaltapia commented 4 years ago

I can see this with journalctl | grep pipewire:

A lot of these lines:

May 06 13:08:29 tapia-laptop pipewire[123442]: 'CHECK_OUT_PORT(this, SPA_DIRECTION_OUTPUT, port_id)' failed at ../pipewire/src/modules/module-client-node/v0/client-node.c:796 impl_node_port_reuse_buffer()

Sometimes these lines:

May 06 13:08:27 tapia-laptop pipewire[123442]: [W][000028717.844909][impl-node.c:337 suspend_node()] node 0x55fc4047eaa0: error unset format input: Input/output error
May 06 13:08:27 tapia-laptop pipewire[123442]: [W][000028717.913336][module-protocol-native.c:375 client_new()] server 0x55fc4043f440: no peersec: Protocol not available
May 06 13:08:28 tapia-laptop pipewire[123442]: [W][000028717.923245][impl-client.c:608 pw_impl_client_update_permissions()] client 0x55fc4047fd60: invalid global -1508526656
May 06 13:08:28 tapia-laptop pipewire[123442]: [W][000028717.935036][connection.c:313 prepare_packet()] old version detected

And also this

May 06 13:08:29 tapia-laptop pipewire[123442]: [W][000028719.765579][impl-node.c:337 suspend_node()] node 0x55fc40557e80: error unset format input: Input/output error
May 06 13:08:29 tapia-laptop kernel: xdg-desktop-por[123723]: segfault at 7f167f95d000 ip 00007f167f66413b sp 00007f167f463a98 error 6 in libpipewire-0.3.so.0.304.0[7f167f656000+40000]
                                                       #0  0x00007f167f66413b n/a (libpipewire-0.3.so.0 + 0x3013b)
                                                       #1  0x00007f167ec0e12f n/a (libpipewire-module-client-node.so + 0x1c12f)
                                                       #5  0x00007f167f68d590 n/a (libpipewire-0.3.so.0 + 0x59590)
                                                       #3  0x00007f167ec0992b n/a (libpipewire-module-client-node.so + 0x1792b)
                                                       #4  0x00007f167ec0f573 n/a (libpipewire-module-client-node.so + 0x1d573)
                                                       #5  0x00007f167ec016f1 n/a (libpipewire-module-client-node.so + 0xf6f1)
                                                       #6  0x00007f167ec49adf n/a (libpipewire-module-protocol-native.so + 0x21adf)
                                                       #7  0x00007f167ec4a038 n/a (libpipewire-module-protocol-native.so + 0x22038)
May 06 13:08:30 tapia-laptop pipewire[123442]: [W][000028720.500096][module-protocol-native.c:375 client_new()] server 0x55fc4043f440: no peersec: Protocol not available
May 06 13:08:31 tapia-laptop pipewire[123442]: [W][000028721.608977][module-protocol-native.c:375 client_new()] server 0x55fc4043f440: no peersec: Protocol not available
May 06 13:08:31 tapia-laptop pipewire[123442]: [W][000028721.609353][module-protocol-native.c:375 client_new()] server 0x55fc4043f440: no peersec: Protocol not available
May 06 13:08:31 tapia-laptop pipewire[123442]: [W][000028721.609667][impl-client.c:608 pw_impl_client_update_permissions()] client 0x55fc4047f9b0: invalid global -1508526656
May 06 13:08:31 tapia-laptop pipewire[123442]: [W][000028721.610236][impl-client.c:608 pw_impl_client_update_permissions()] client 0x55fc405546f0: invalid global -1508526656
May 06 13:08:31 tapia-laptop pipewire[123442]: [W][000028721.618586][connection.c:313 prepare_packet()] old version detected
May 06 13:08:31 tapia-laptop pipewire[123442]: [W][000028721.619073][connection.c:313 prepare_packet()] old version detected
May 06 13:08:31 tapia-laptop pipewire[123442]: 'CHECK_OUT_PORT(this, SPA_DIRECTION_OUTPUT, port_id)' failed at ../pipewire/src/modules/module-client-node/v0/client-node.c:796 impl_node_port_reuse_buffer()
May 06 13:08:31 tapia-laptop pipewire[123442]: 'CHECK_OUT_PORT(this, SPA_DIRECTION_OUTPUT, port_id)' failed at ../pipewire/src/modules/module-client-node/v0/client-node.c:796 impl_node_port_reuse_buffer()
May 06 13:08:33 tapia-laptop pipewire[123442]: 'CHECK_OUT_PORT(this, SPA_DIRECTION_OUTPUT, port_id)' failed at ../pipewire/src/modules/module-client-node/v0/client-node.c:796 impl_node_port_reuse_buffer()
May 06 13:08:33 tapia-laptop pipewire[123442]: 'CHECK_OUT_PORT(this, SPA_DIRECTION_OUTPUT, port_id)' failed at ../pipewire/src/modules/module-client-node/v0/client-node.c:796 impl_node_port_reuse_buffer()
May 06 13:08:34 tapia-laptop pipewire[123442]: [W][000028724.000272][impl-node.c:337 suspend_node()] node 0x55fc4056d8d0: error unset format input: Input/output error
May 06 13:08:34 tapia-laptop pipewire[123442]: [W][000028724.002977][impl-node.c:337 suspend_node()] node 0x55fc40570050: error unset format input: Input/output error
May 06 13:08:34 tapia-laptop pipewire[123442]: [W][000028724.061301][module-protocol-native.c:375 client_new()] server 0x55fc4043f440: no peersec: Protocol not available
May 06 13:08:34 tapia-laptop pipewire[123442]: [W][000028724.062406][impl-client.c:608 pw_impl_client_update_permissions()] client 0x55fc40555880: invalid global -1508526656
May 06 13:08:34 tapia-laptop pipewire[123442]: [W][000028724.065600][connection.c:313 prepare_packet()] old version detected

and this:

 No such file or directory"p pipewire[123447]: [E][000028615.299921][media-session.c:1629 core_error()] error id:4 seq:4 res:-2 (No such file or directory): can't create device: No such file or directory
May 06 13:06:45 tapia-laptop kernel: CPU: 3 PID: 123447 Comm: pipewire-media- Tainted: G        W  OE     5.4.38-1-lts #1
May 06 13:06:45 tapia-laptop kernel: CPU: 3 PID: 123447 Comm: pipewire-media- Tainted: G        W  OE     5.4.38-1-lts #1
May 06 13:06:45 tapia-laptop kernel: CPU: 3 PID: 123447 Comm: pipewire-media- Tainted: G        W  OE     5.4.38-1-lts #1
May 06 13:06:45 tapia-laptop kernel: CPU: 3 PID: 123447 Comm: pipewire-media- Tainted: G        W  OE     5.4.38-1-lts #1
May 06 13:06:45 tapia-laptop pipewire[123442]: [E][000028615.412201][alsa-pcm.c:33 spa_alsa_open()] open failed: Device or resource busy
May 06 13:06:45 tapia-laptop pipewire[123442]: [W][000028615.412236][adapter.c:174 find_format()] adapter 0x55fc404f9ec0: no format given
May 06 13:06:45 tapia-laptop pipewire[123442]: [E][000028615.412262][module-adapter.c:237 create_object()] usage: node.name=<string> 
May 06 13:06:45 tapia-laptop pipewire[123442]: [E][000028615.412275][private.h:218 pw_core_resource_errorv()] resource 0x55fc40492910: id:24 seq:84 res:-22 (Invalid argument) msg:"usage: node.name=<string> "
May 06 13:06:45 tapia-laptop pipewire[123442]: [E][000028615.412655][alsa-pcm.c:33 spa_alsa_open()] open failed: Device or resource busy
May 06 13:06:45 tapia-laptop pipewire[123442]: [W][000028615.412674][adapter.c:174 find_format()] adapter 0x55fc404fd190: no format given
May 06 13:06:45 tapia-laptop pipewire[123442]: [E][000028615.412687][module-adapter.c:237 create_object()] usage: node.name=<string> 
May 06 13:06:45 tapia-laptop pipewire[123442]: [E][000028615.412698][private.h:218 pw_core_resource_errorv()] resource 0x55fc40492910: id:25 seq:85 res:-22 (Invalid argument) msg:"usage: node.name=<string> "
May 06 13:06:45 tapia-laptop pipewire[123447]: [E][000028615.413479][core.c:71 core_event_error()] core 0x55da119ceea0: proxy 0x55da11a2ed20 id:24: seq:84 res:-22 (Invalid argument) msg:"usage: node.name=<string> "
May 06 13:06:45 tapia-laptop pipewire[123447]: [E][000028615.413507][media-session.c:1629 core_error()] error id:24 seq:84 res:-22 (Invalid argument): usage: node.name=<string>
May 06 13:06:45 tapia-laptop pipewire[123447]: [E][000028615.413519][core.c:71 core_event_error()] core 0x55da119ceea0: proxy 0x55da11a2fbc0 id:25: seq:85 res:-22 (Invalid argument) msg:"usage: node.name=<string> "
May 06 13:06:45 tapia-laptop pipewire[123447]: [E][000028615.413531][media-session.c:1629 core_error()] error id:25 seq:85 res:-22 (Invalid argument): usage: node.name=<string>
May 06 13:06:57 tapia-laptop pipewire[123442]: [W][000028627.520449][module-protocol-native.c:375 client_new()] server 0x55fc4043f440: no peersec: Protocol not available
May 06 13:07:00 tapia-laptop pipewire[123442]: [W][000028630.572811][module-protocol-native.c:375 client_new()] server 0x55fc4043f440: no peersec: Protocol not available
May 06 13:07:10 tapia-laptop pipewire[123442]: [W][000028640.058929][module-protocol-native.c:375 client_new()] server 0x55fc4043f440: no peersec: Protocol not available
May 06 13:07:10 tapia-laptop pipewire[123442]: [W][000028640.060090][impl-client.c:608 pw_impl_client_update_permissions()] client 0x55fc404800e0: invalid global -1508526656
May 06 13:07:10 tapia-laptop pipewire[123442]: [W][000028640.069387][connection.c:313 prepare_packet()] old version detected
May 06 13:07:10 tapia-laptop pipewire[123442]: [W][000028640.069738][module-protocol-native.c:375 client_new()] server 0x55fc4043f440: no peersec: Protocol not available
May 06 13:07:10 tapia-laptop pipewire[123442]: [W][000028640.070239][impl-client.c:608 pw_impl_client_update_permissions()] client 0x55fc40553ca0: invalid global -1508526656
May 06 13:07:10 tapia-laptop pipewire[123442]: [W][000028640.071634][connection.c:313 prepare_packet()] old version detected
May 06 13:07:14 tapia-laptop pipewire[123442]: [W][000028644.272528][impl-node.c:337 suspend_node()] node 0x55fc40557580: error unset format input: Input/output error
May 06 13:07:14 tapia-laptop pipewire[123442]: [W][000028644.378778][module-protocol-native.c:375 client_new()] server 0x55fc4043f440: no peersec: Protocol not available
May 06 13:07:14 tapia-laptop pipewire[123442]: [W][000028644.379366][impl-client.c:608 pw_impl_client_update_permissions()] client 0x55fc40481d50: invalid global 0
May 06 13:07:14 tapia-laptop pipewire[123442]: [E][000028644.379384][private.h:218 pw_core_resource_errorv()] resource 0x55fc40422c80: id:0 seq:5 res:-13 (Permission denied) msg:"no permission to call method 2 on 0 (requires 00000040, have 00000000)"[123442]: [W][000028644.390384][connection.c:313 prepare_packet()] old version detected
May 06 13:07:48 tapia-laptop pipewire[123442]: [E][000028678.843310][impl-link.c:100 pw_impl_link_update_state()] (47.0 -> 82.0) negotiating -> error (no output format)
May 06 13:07:48 tapia-laptop pipewire[123442]: [W][000028678.848430][module-protocol-native.c:375 client_new()] server 0x55fc4043f440: no peersec: Protocol not available
May 06 13:07:54 tapia-laptop pipewire[123442]: [W][000028684.649013][impl-node.c:337 suspend_node()] node 0x55fc40557500: error unset format input: Input/output error
May 06 13:08:24 tapia-laptop pipewire[123442]: [W][000028714.763831][module-protocol-native.c:375 client_new()] server 0x55fc4043f440: no peersec: Protocol not available
May 06 13:08:24 tapia-laptop pipewire[123442]: [W][000028714.764607][impl-client.c:608 pw_impl_client_update_permissions()] client 0x55fc40463e90: invalid global -1508526656
May 06 13:08:24 tapia-laptop pipewire[123442]: [W][000028714.767240][connection.c:313 prepare_packet()] old version detected
May 06 13:08:24 tapia-laptop pipewire[123442]: [W][000028714.767835][connection.c:313 prepare_packet()] old version detected

Edit: Firefox 75 (fedora-firefox-wayland)

danshick commented 4 years ago

You've definitely got a segfault in there. What do the newest entries look like when you run coredumpctl?

cristobaltapia commented 4 years ago

I get this from yesterday:

Wed 2020-05-06 13:08:30 CEST 123722  1000  1000  11 present   /usr/lib/xdg-desktop-portal-wlr

But that is the only one. I also had problems at least once before that and once after that. Could this segfault be specifically related to me restarting the systemd units of pipewire, xdg-desktop-portal and xdg-desktop-portal-wlr?

danshick commented 4 years ago

I don't know for sure, but it could be because of restarting pipewire during a stream. Can you run 'coredumpctl debug 123722' and then run 'bt full' from gdb? It almost certainly doesn't have debugging symbols compiled in, but it might give me a hint.

Have you found a reliable way to trigger it?
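If an interactive gdb session is inconvenient, the same backtrace could probably be collected in batch mode. This is a sketch: the helper names run and collect_backtrace and the core file name are made up for illustration, while the PID and executable path are the ones from this thread. By default it only prints the commands; set DRY_RUN=0 to run them.

```shell
#!/bin/sh
# Sketch: dump a stored core file and print backtraces for all threads.
# DRY_RUN=1 (the default here) prints each command instead of running it.
# Note that the printed form drops the quoting around the gdb command.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

collect_backtrace() {
    pid=$1
    exe=$2
    core="core.$pid"   # hypothetical output file name
    # Write the stored core dump for this PID to a file.
    run coredumpctl --output="$core" dump "$pid"
    # -batch makes gdb exit once the -ex commands have run.
    run gdb -batch -ex 'thread apply all bt full' "$exe" "$core"
}

collect_backtrace 123722 /usr/lib/xdg-desktop-portal-wlr
```

Without debugging symbols the output will still be mostly "??" frames, but it covers every thread and is easy to redirect into a file for attaching to a comment.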

cristobaltapia commented 4 years ago

[tapia@tapia-laptop]:~$ coredumpctl debug 123722
           PID: 123722 (xdg-desktop-por)
           UID: 1000 (tapia)
           GID: 1000 (tapia)
        Signal: 11 (SEGV)
     Timestamp: Wed 2020-05-06 13:08:29 CEST (1 day 4h ago)
  Command Line: /usr/lib/xdg-desktop-portal-wlr -p BGRx -o DP-1
    Executable: /usr/lib/xdg-desktop-portal-wlr
 Control Group: /user.slice/user-1000.slice/user@1000.service/xdg-desktop-portal-wlr.service
          Unit: user@1000.service
     User Unit: xdg-desktop-portal-wlr.service
         Slice: user-1000.slice
     Owner UID: 1000 (tapia)
       Boot ID: c171270b37e14914978d9d0fe299393d
    Machine ID: 81a7f722a0ba49e19b273b89c2b0f42f
      Hostname: tapia-laptop
       Storage: /var/lib/systemd/coredump/core.xdg-desktop-por.1000.c171270b37e14914978d9d0fe299393d.123722.1588763309000000000000.lz4
       Message: Process 123722 (xdg-desktop-por) of user 1000 dumped core.

                Stack trace of thread 123723:
                #0  0x00007f167f66413b n/a (libpipewire-0.3.so.0 + 0x3013b)
                #1  0x00007f167ec0e12f n/a (libpipewire-module-client-node.so + 0x1c12f)
                #2  0x00007f167f9e2be8 n/a (libspa-support.so + 0x5be8)
                #3  0x00007f167f9e179e n/a (libspa-support.so + 0x479e)
                #4  0x00007f167f9e1ccc n/a (libspa-support.so + 0x4ccc)
                #5  0x00007f167f68d590 n/a (libpipewire-0.3.so.0 + 0x59590)
                #6  0x00007f167f60246f start_thread (libpthread.so.0 + 0x946f)
                #7  0x00007f167f8903d3 __clone (libc.so.6 + 0xff3d3)

                Stack trace of thread 123722:
                #0  0x00007f167f8814fc __read (libc.so.6 + 0xf04fc)
                #1  0x00007f167f9e0ab4 n/a (libspa-support.so + 0x3ab4)
                #2  0x00007f167f9e2e8f n/a (libspa-support.so + 0x5e8f)
                #3  0x00007f167ec0992b n/a (libpipewire-module-client-node.so + 0x1792b)
                #4  0x00007f167ec0f573 n/a (libpipewire-module-client-node.so + 0x1d573)
                #5  0x00007f167ec016f1 n/a (libpipewire-module-client-node.so + 0xf6f1)
                #6  0x00007f167ec49adf n/a (libpipewire-module-protocol-native.so + 0x21adf)
                #7  0x00007f167ec4a038 n/a (libpipewire-module-protocol-native.so + 0x22038)
                #8  0x00007f167f9e1ccc n/a (libspa-support.so + 0x4ccc)
                #9  0x0000557bee81f665 n/a (xdg-desktop-portal-wlr + 0x3665)
                #10 0x00007f167f7b8023 __libc_start_main (libc.so.6 + 0x27023)
                #11 0x0000557bee81f95e n/a (xdg-desktop-portal-wlr + 0x395e)

GNU gdb (GDB) 9.1
Copyright (C) 2020 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-pc-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /usr/lib/xdg-desktop-portal-wlr...
(No debugging symbols found in /usr/lib/xdg-desktop-portal-wlr)
[New LWP 123723]
[New LWP 123722]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/usr/lib/libthread_db.so.1".
Core was generated by `/usr/lib/xdg-desktop-portal-wlr -p BGRx -o DP-1'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x00007f167f66413b in ?? () from /usr/lib/libpipewire-0.3.so.0
[Current thread is 1 (Thread 0x7f167f464700 (LWP 123723))]
(gdb) bt full
#0  0x00007f167f66413b in ?? () from /usr/lib/libpipewire-0.3.so.0
No symbol table info available.
#1  0x00007f167ec0e12f in ?? () from /usr/lib/pipewire-0.3/libpipewire-module-client-node.so
No symbol table info available.
#2  0x00007f167f9e2be8 in ?? () from /usr/lib/spa-0.2/support/libspa-support.so
No symbol table info available.
#3  0x00007f167f9e179e in ?? () from /usr/lib/spa-0.2/support/libspa-support.so
No symbol table info available.
#4  0x00007f167f9e1ccc in ?? () from /usr/lib/spa-0.2/support/libspa-support.so
No symbol table info available.
#5  0x00007f167f68d590 in ?? () from /usr/lib/libpipewire-0.3.so.0
No symbol table info available.
#6  0x00007f167f60246f in start_thread () from /usr/lib/libpthread.so.0
No symbol table info available.
#7  0x00007f167f8903d3 in clone () from /usr/lib/libc.so.6
No symbol table info available.

cristobaltapia commented 4 years ago

I haven't had time today to try to reproduce this reliably. I will report back as soon as I have (if I can).

danshick commented 4 years ago

This crash is pretty deep in the linked pipewire code from the looks of it. I'll see if I can trigger it through long-duration use with debugging symbols compiled in.