Closed totaam closed 9 years ago
Connected using Windows 8.1 and a Win32 0.15.0 beta build 9445 against a Fedora 21 trunk 9445 build:
Launching the server with
xpra start :13 --no-daemon --bind-tcp=0.0.0.0:2200 --html=on --start-child=firefox --start-child=xterm
Connecting with
Xpra_cmd.exe attach tcp:ip:port --encodings=h264
works as expected.
Reconnecting with just
Xpra_cmd.exe attach tcp:ip:port
works for a short time. Interacting with Firefox on the remote machine causes the following traceback to be printed a few times on the client, followed by a segmentation fault (core dump) on the server:
Traceback (most recent call last):
  File "xpra\client\ui_client_base.pyc", line 1975, in _draw_thread_loop
  File "xpra\client\ui_client_base.pyc", line 2021, in _do_draw
  File "xpra\client\client_window_base.pyc", line 423, in draw_region
  File "xpra\client\window_backing_base.pyc", line 473, in draw_region
  File "xpra\client\window_backing_base.pyc", line 264, in paint_rgb24
  File "xpra\client\window_backing_base.pyc", line 175, in process_delta
Exception: expected 5976 bytes for 83x24 with rowstride=249 but received 26 (34 compressed)
2015-05-18 11:23:42,654 internal error: read connection SocketConnection(('10.0.11.124', 57327) - ('10.0.32.138', 2200)) reset: [Errno 10054] An existing connection was forcibly closed by the remote host
Switching to a 0.15.0 9445 build on the server and launching with the same parameters does not disconnect me, but the session appears to freeze after a second or so and I get the following tracebacks:
server side:
2015-05-18 11:36:28,543 error processing damage data: failed to get buffer from pixel object: <type 'memoryview'> (returned -1)
Traceback (most recent call last):
  File "/usr/lib64/python2.7/site-packages/xpra/server/source.py", line 1734, in encode_loop
    fn_and_args[0](*fn_and_args[1:])
  File "/usr/lib64/python2.7/site-packages/xpra/server/window_source.py", line 1187, in make_data_packet_cb
    packet = self.make_data_packet(damage_time, process_damage_time, wid, image, coding, sequence, options)
  File "/usr/lib64/python2.7/site-packages/xpra/server/window_source.py", line 1524, in make_data_packet
    ret = encoder(coding, image, options)
  File "/usr/lib64/python2.7/site-packages/xpra/server/window_source.py", line 1595, in webp_encode
    return webp_encode(coding, image, self.rgb_formats, self.supports_transparency, q, s, options)
  File "/usr/lib64/python2.7/site-packages/xpra/server/picture_encode.py", line 62, in webp_encode
    cdata = enc_webp.compress(image.get_pixels(), w, h, stride=stride/4, quality=quality, speed=speed, has_alpha=alpha)
  File "xpra/codecs/webp/encode.pyx", line 342, in xpra.codecs.webp.encode.compress (xpra/codecs/webp/encode.c:1839)
AssertionError: failed to get buffer from pixel object: <type 'memoryview'> (returned -1)
and client side:
2015-05-18 11:36:28,759 error processing draw packet
Traceback (most recent call last):
  File "xpra\client\ui_client_base.pyc", line 1975, in _draw_thread_loop
  File "xpra\client\ui_client_base.pyc", line 2021, in _do_draw
  File "xpra\client\client_window_base.pyc", line 423, in draw_region
  File "xpra\client\window_backing_base.pyc", line 473, in draw_region
  File "xpra\client\window_backing_base.pyc", line 264, in paint_rgb24
  File "xpra\client\window_backing_base.pyc", line 175, in process_delta
Exception: expected 7488 bytes for 117x16 with rowstride=468 but received 26 (34 compressed)
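The numbers in these client exceptions can be checked by hand. The sketch below assumes, going only by the logged values, that the client expects rowstride times height bytes for an uncompressed region. The function name `expected_rgb_size` is made up for illustration and is not xpra's API.

```python
def expected_rgb_size(height, rowstride):
    """Bytes needed for `height` rows of `rowstride` bytes each (assumed rule)."""
    return rowstride * height

# (width, height, rowstride, expected bytes) copied from the client log lines above
logged = [
    (83, 24, 249, 5976),
    (117, 16, 468, 7488),
]

for width, height, rowstride, expected in logged:
    # rowstride / width gives the bytes per pixel implied by the log line
    bytes_per_pixel = rowstride // width
    assert expected_rgb_size(height, rowstride) == expected
    print("%dx%d: %d bytes per pixel, %d bytes expected" % (width, height, bytes_per_pixel, expected))
```

In both cases the 26 bytes that actually arrived are far short of a full region, so the payload itself looks wrong rather than merely truncated.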
- Using the 0.15.0 server build, connecting with
--encodings=h264
connects with solid black windows and the following tracebacks on the server:
2015-05-18 11:40:38,854 error processing damage data:
Traceback (most recent call last):
  File "/usr/lib64/python2.7/site-packages/xpra/server/source.py", line 1734, in encode_loop
    fn_and_args[0](*fn_and_args[1:])
  File "/usr/lib64/python2.7/site-packages/xpra/server/window_source.py", line 1187, in make_data_packet_cb
    packet = self.make_data_packet(damage_time, process_damage_time, wid, image, coding, sequence, options)
  File "/usr/lib64/python2.7/site-packages/xpra/server/window_source.py", line 1524, in make_data_packet
    ret = encoder(coding, image, options)
  File "/usr/lib64/python2.7/site-packages/xpra/server/window_video_source.py", line 1261, in video_encode
    ret = self._video_encoder.compress_image(csc_image, quality, speed, options)
  File "xpra/codecs/enc_x264/encoder.pyx", line 520, in xpra.codecs.enc_x264.encoder.Encoder.compress_image (xpra/codecs/enc_x264/encoder.c:5861)
AssertionError
- Of note when connecting with
--encodings=h264
:
- The client does not print any tracebacks or errors when connecting.
- Interacting with Firefox still works (for example, using
ctrl + t
to open a tab and the window title changes), but it's impossible to see anything.
0.15 should not be using memoryview by default, where did you get this build? What build command was used? As usual, having "xpra info" would help clarify things.
Is reconnecting necessary to get the first crash? I'll try to get a gdb backtrace tomorrow - assuming I can reproduce. Feel free to beat me to it.
The failure in the x264 encoder is hard to diagnose because 0.15.x doesn't have the debug code - I may have to backport it, unless you can reproduce with trunk? (you may need to build it with "--without-memoryview" to get the same behaviour as 0.15.x)
The builds are from the trunk or 0.15.0 tagged repositories I have on one of my Fedora 21 test VMs.
I use the following command to build:
LDFLAGS=-Wl,-rpath=/usr/lib64/xpra PKG_CONFIG_PATH=$PKG_CONFIG_PATH:/usr/lib64/xpra/pkgconfig ./setup.py install
Re-building with:
LDFLAGS=-Wl,-rpath=/usr/lib64/xpra PKG_CONFIG_PATH=$PKG_CONFIG_PATH:/usr/lib64/xpra/pkgconfig ./setup.py install --without-memoryview
- Connecting with
--encodings=h264
works
- Connecting with
--encodings=h264,png
works
- Connecting with
--encodings=h264,png,webp
works
- Connecting with
--encodings=webp
works
- Connecting without a specified encoder causes a seg-fault.
Switching VMs to another Fedora 21 machine using your latest Fedora 21 build from the beta repo works fine. It looks like it's an issue with my build environment.
I'll put my money on webp issues. You can confirm this by enabling just webp, or by enabling everything but webp.
You need to make sure that you have the same webp at build time and at runtime. This is actually a known problem with webp, see #848.
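One way to see which libwebp is picked up at runtime is to ask the library for its version and compare that with the version present at build time. This is a rough sketch, assuming libwebp is installed as a shared library that `ctypes` can locate. `WebPGetEncoderVersion` is libwebp's own C function, which returns the version packed as 0xMMmmrr. The helper names are illustrative and are not part of xpra.

```python
import ctypes
import ctypes.util

def unpack_webp_version(packed):
    """Split libwebp's packed version integer (0xMMmmrr) into (major, minor, revision)."""
    return (packed >> 16) & 0xFF, (packed >> 8) & 0xFF, packed & 0xFF

def runtime_webp_version():
    """Return the version of the libwebp found at runtime, or None if it cannot be loaded."""
    name = ctypes.util.find_library("webp")
    if name is None:
        return None
    try:
        lib = ctypes.CDLL(name)
    except OSError:
        return None
    lib.WebPGetEncoderVersion.restype = ctypes.c_int
    return unpack_webp_version(lib.WebPGetEncoderVersion())

if __name__ == "__main__":
    print(runtime_webp_version())
```

Note that `find_library` searches the default linker paths, so a private copy installed under a directory such as /usr/lib64/xpra may need to be loaded by its full path instead.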
Relaunched with
xpra start :13 --no-daemon --bind-tcp=0.0.0.0:2200 --html=on --start-child=firefox --start-child=xterm --start-child=xterm --encodings=h264,png,jpeg,rgb
- connecting without specifying an encoder causes a crash with the following client output:
2015-05-18 14:10:17,647 error processing draw packet
Traceback (most recent call last):
  File "xpra\client\ui_client_base.pyc", line 1975, in _draw_thread_loop
  File "xpra\client\ui_client_base.pyc", line 2021, in _do_draw
  File "xpra\client\client_window_base.pyc", line 423, in draw_region
  File "xpra\client\window_backing_base.pyc", line 473, in draw_region
  File "xpra\client\window_backing_base.pyc", line 264, in paint_rgb24
  File "xpra\client\window_backing_base.pyc", line 175, in process_delta
Exception: expected 4964 bytes for 1241x1 with rowstride=4964 but received 26 (34 compressed)
Relaunching with
xpra start :13 --no-daemon --bind-tcp=0.0.0.0:2200 --html=on --start-child=firefox --start-child=xterm --start-child=xterm --encodings=h264,png,jpeg
- Connecting without specifying an encoder no longer causes a crash.
Relaunching with
xpra start :13 --no-daemon --bind-tcp=0.0.0.0:2200 --html=on --start-child=firefox --start-child=xterm --start-child=xterm --encodings=h264,png,jpeg,webp
- Connecting without specifying an encoder also does not crash.
Interestingly it looks like enabling rgb is what's causing segfaults here
EDIT: Changed wording for consistency.
EDIT2: Added client output on crash.
EDIT3: I am bad at copy-paste. Fixed.
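The elimination reasoning in this comment can be written down as a small set operation: keep the encodings that appear in every crashing run and in no clean run. This is a minimal sketch using the three server `--encodings` lists from the relaunches above. The function name `suspect_encodings` is hypothetical.

```python
def suspect_encodings(runs):
    """Encodings present in every crashing run and absent from every clean run."""
    crashing = [set(encs) for encs, crashed in runs if crashed]
    clean = [set(encs) for encs, crashed in runs if not crashed]
    if not crashing:
        return set()
    suspects = set.intersection(*crashing)
    for encs in clean:
        suspects -= encs
    return suspects

# (server --encodings value, did connecting without a client-side encoder crash?)
runs = [
    (["h264", "png", "jpeg", "rgb"], True),
    (["h264", "png", "jpeg"], False),
    (["h264", "png", "jpeg", "webp"], False),
]

print(sorted(suspect_encodings(runs)))  # prints ['rgb']
```

With only three runs this narrows the search rather than proving anything; a run with rgb alone would be the natural follow-up.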
It looks like the client is still a win32 system of some sort? (the log output is wrapping at 80 characters) Have you tried connecting from the same Fedora machine? Does it make any difference? (probably needs to have mmap turned off to trigger the bug)
The first 2 command lines in comment:6 are identical. But you said "Relaunching with.." which seems to imply that maybe it should be a different command? Why would it work better the second time? Was anything else changed? Do I need firefox to trigger it? Any page in particular?
(general advice: always best to trim down the command lines and remove things that aren't relevant to the bug. ie: if html is not used, take it off, if sound or clipboard aren't relevant then turn them off, if you don't need two xterms then don't start two - if you need two during testing then test again afterwards without, etc..)
Assuming that the problem is with rgb, please try with "-z 0" and "-z 9" to see if it triggers it more easily. Please also provide the "-d encoding" server log, and the "-d paint,delta" client log around the time of the problem.
FWIW: I have tried many times, with different client OS, no crash whatsoever. I did hit this bug: #861, but that's a different issue. (and I tested before doing that fix) Are you sure that your build environment is configured properly? (all the dependencies like libwebp-xpra are installed, etc) Was there anything at all in the server log?
What makes you think that this has something to do with rgb? (the second one did not crash, and it had rgb enabled) Have you tried just with rgb? With h264 + rgb?
When you say it "crashes", is it the server or the client? Where is the crash message? (or does it just print this stacktrace and continue?) (xpra info is still missing.)
Raising priority again..
Updated my server to trunk 9459 and re-built
Using the following commands to setup a server session:
xpra start :13 --no-daemon --bind-tcp=0.0.0.0:2200 --start-child=firefox --start-child=xterm --start-child=xterm
Also, I'm connecting from Windows 8.1 using the 9445 beta Win32 build from http://xpra.org/beta. I'd use Fedora to connect but we don't have any hardware Fedora 19/20/21 machines yet. They're not cooperating with our cloning solution, but that's a problem for another time. In addition, I have my trusty old Cent6.4 machine that I can use as well.
"The first 2 command lines in comment:6 are identical."
My bad, I copy and pasted wrong, the working server start didn't have RGB. I'll edit the comment to fix it.
For starters, I am pretty sure this issue has something to do with my build environment, and even then only when the server paints with RGB. If I specify the client to connect with encodings other than RGB, then I can use the session with no problems. If I use a server from your beta repository (same server operating system - Fedora 21, just a different VM) then RGB encoding works fine, even after connecting with the same client, with the same server and client start commands.
That being said, setting up a session with only RGB:
xpra start :13 --no-daemon --bind-tcp=0.0.0.0:2200 --start-child=firefox --start-child=xterm --start-child=xterm --encodings=rgb
And connecting with:
Xpra_cmd.exe attach tcp:10.0.32.137:2200
This connects and the server does not seg-fault (or print any errors, actually). However, all my windows are black and I cannot interact with anything, and the client floods the CMD window with tracebacks before I disconnect:
2015-05-20 12:58:22,726 error processing draw packet
Traceback (most recent call last):
  File "xpra\client\ui_client_base.pyc", line 1975, in _draw_thread_loop
  File "xpra\client\ui_client_base.pyc", line 2021, in _do_draw
  File "xpra\client\client_window_base.pyc", line 423, in draw_region
  File "xpra\client\window_backing_base.pyc", line 473, in draw_region
  File "xpra\client\window_backing_base.pyc", line 264, in paint_rgb24
  File "xpra\client\window_backing_base.pyc", line 175, in process_delta
Exception: expected 630736 bytes for 499x316 with rowstride=1996 but received 26 (34 compressed)
2015-05-20 12:58:22,726 invalid img data <type 'str'>: <memory at 0x7f3a466c0640>
2015-05-20 12:58:22,726 draw error
Traceback (most recent call last):
  File "xpra\client\ui_client_base.pyc", line 2021, in _do_draw
  File "xpra\client\client_window_base.pyc", line 423, in draw_region
  File "xpra\client\window_backing_base.pyc", line 473, in draw_region
  File "xpra\client\window_backing_base.pyc", line 264, in paint_rgb24
  File "xpra\client\window_backing_base.pyc", line 175, in process_delta
Exception: expected 630736 bytes for 499x316 with rowstride=1996 but received 26 (34 compressed)
2015-05-20 12:58:22,727 error processing draw packet
Traceback (most recent call last):
  File "xpra\client\ui_client_base.pyc", line 1975, in _draw_thread_loop
  File "xpra\client\ui_client_base.pyc", line 2021, in _do_draw
  File "xpra\client\client_window_base.pyc", line 423, in draw_region
  File "xpra\client\window_backing_base.pyc", line 473, in draw_region
  File "xpra\client\window_backing_base.pyc", line 264, in paint_rgb24
  File "xpra\client\window_backing_base.pyc", line 175, in process_delta
Exception: expected 630736 bytes for 499x316 with rowstride=1996 but received 26 (34 compressed)
2015-05-20 12:58:22,732 server requested disconnect: client request
2015-05-20 12:58:22,776 Connection lost
Connecting with a CentOS 6.4 beta client (05/18 build date) gives me the same errors, for what it's worth. I'll try in different OSs if/when I get a chance.
I will also attach the logs you requested. I set up a session with
xpra start :13 --no-daemon --bind-tcp=0.0.0.0:2200 --html=on --start-child=firefox --start-child=xterm --start-child=xterm -d encoding > xpra863encoding.txt 2>&1
and connected from my Cent6.4 machine with:
xpra attach tcp:10.0.32.138:2200 -d paint,delta > xpra863deltapaint.txt 2>&1
Connecting, and clicking on the close button on Firefox, causing it to load the homepage causes a seg-fault on the server.
- I retested this with just
--start-child=xterm --start-child=xterm
and it allows me to interact with the Xterms for a similar amount of time before the server seg-faults again. So, it looks like Firefox isn't required.
Also, using "-z 0" and "-z 9" doesn't seem to have any effect. I'm not sure if I'm using them correctly...I'm just starting the server with
xpra start :13 --no-daemon --bind-tcp=0.0.0.0:2200 --html=on --start-child=firefox --start-child=xterm --start-child=xterm --encodings=rgb -z 9
For what it's worth: Selecting just
h264
as an encoder works; it's only using
RGB
with other encoders that's causing issues, and even then only in my build environment.
xpra863encoding(1).txt
(255.9 KiB) Connecting and loading the Firefox home page (the Fedora start page) before the server seg-faults
xpra863deltapaint.txt
(314.3 KiB) Same steps as the other log: connecting and loading the Fedora Firefox start page before a server seg-fault, this time from the Cent6.4 client's perspective.
xpra info is still missing.
"If I use a server from your beta repository..."
Then we need to figure out what is different between those two build environments and the packages that they produce. Having xpra info will help, also
ls -l /usr/lib64/xpra/pkgconfig/
, how you build and install the RPM, etc. And maybe PM me one of those problematic packages.
"I'd use Fedora to connect but we don't have any hardware Fedora 19/20/21 machines yet."
I'm confused, I thought you had a Fedora VM you used as server? comment:4 says "I have on one of my Fedora 21 test VMs".
Exception: expected 630736 bytes for 499x316 with rowstride=1996 but received 26 (34 compressed)
2015-05-20 12:58:22,726 invalid img data <type 'str'>: <memory at 0x7f3a466c0640>
It is the memoryview stuff that is causing this. It is only enabled by default in trunk. I'll look into it.
Although I am glad to see trunk getting some testing (any version getting some testing), the focus should be on 0.15 at this point. (I know that I did ask you to run trunk to get a log message previously).
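A possible reading of the two log lines quoted above, offered as an assumption rather than a confirmed diagnosis: the text `<memory at 0x7f3a466c0640>` is exactly 26 characters long, which matches the "received 26" in the exception. That is what one would see if a memoryview were converted with `str()` instead of being copied out as bytes. The sketch below uses only built-in Python and does not reproduce xpra's code.

```python
# Stand-in for a 499x316 region with rowstride=1996 (1996 * 316 = 630736 bytes)
pixels = bytes(bytearray(630736))
view = memoryview(pixels)

as_text = str(view)        # the object's description, e.g. '<memory at 0x...>'
as_bytes = view.tobytes()  # the actual pixel data

print(len(as_text), len(as_bytes))
print(len("<memory at 0x7f3a466c0640>"))  # the string seen in the client log
```

The length of `as_text` depends on the address printed, so it varies between runs, while `as_bytes` always has the full 630736 bytes.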
"I will also attach the logs you requested."
Thanks, that's very useful. I already fixed a bug that I found in there: #865. Any errors in the logs like this one should always be investigated and reported as bugs, whether there is visual corruption on screen or not.
"Connecting, and clicking on the close button on Firefox, causing it to load the homepage causes a seg-fault on the server."
Which close button? Close tab?
"allows me to interact with the Xterms for a similar amount of time before the server seg-faults again"
How do you make it crash? Close? Resize? It could well be related to #865, in which case the crash should be gone with latest trunk.
863xprainfo.txt
(105.1 KiB) Requested xpra info.
I uploaded the Xpra info.
I started the server with
xpra start :13 --no-daemon --bind-tcp=0.0.0.0:2200 --html=on --start-child=firefox --start-child=xterm --start-child=xterm --encodings=rgb
(I use only RGB) because then I can keep the server running long enough to get the requested Xpra Info
and then connected from Windows 8.1 with:
Xpra_cmd.exe attach tcp:10.0.32.138:2200
I should specify our testing environment:
We have a number of Fedora 20/21 VMs (2 Fedora 20 and as many Fedora 21 VMs as we need) but they aren't connected to any display; they just sit on a KVM server somewhere in our server room, so we can't use them as clients at all.
That being said, we do have a number of hardware machines that we can use as clients (or servers) to test with, but they aren't playing well with Fedora at the moment, and I haven't had the time to sit down and really investigate the issue we have with them. Other than those machines (including a couple of Mac Minis, one with Intel graphics and one with nvidia), I have my laptop (a MacBook of some sort that I run Windows on) and a low-power Cent6.4 machine. Also whatever machines Alex has access to.
Firefox close button
When Firefox detects that it recovers from a crash, or can't open a new tab it gives you two options. One is restore your previous tabs, and another button marked "close" that just opens a new session. That's the button that I was referring to...all it does is launch a new session.
ls -l /usr/lib64/xpra/pkgconfig/
:
total 36
-rw-r--r--. 1 root root 405 Apr 11 07:22 libavcodec.pc
-rw-r--r--. 1 root root 422 Apr 11 07:22 libavfilter.pc
-rw-r--r--. 1 root root 443 Apr 11 07:22 libavformat.pc
-rw-r--r--. 1 root root 270 Apr 11 07:22 libavutil.pc
-rw-r--r--. 1 root root 299 Apr 11 07:22 libpostproc.pc
-rw-r--r--. 1 root root 307 Apr 11 07:22 libswresample.pc
-rw-r--r--. 1 root root 300 Apr 11 07:22 libswscale.pc
-rw-r--r--. 1 root root 311 Apr 4 09:32 vpx.pc
-rw-r--r--. 1 root root 255 Jan 18 22:49 x264.pc
Finally, I will try to recompile without memoryview and leave a comment in a bit.
"they aren't connected to any display; they just sit on a KVM server somewhere in our server room, so we can't use them as clients at all."
That's not true: you could use an xpra session to launch another xpra client connecting to another session (or use VNC or whatever).
I am in serious need of a recap here. There is more than one issue I think, and so many versions and combinations that it is making my head spin.
- which client / server branches trigger "invalid img data"
- which servers can segfault (and preferably reproduce by resizing glxgears or xterm rather than needing firefox)
And whether this affects all builds or just yours / mine.
Assuming that some of these bugs are still present (and assuming that the option is relevant for the version tested), please try:
- compressor options, as per #866#comment:5
- disable av-sync (for 0.16)
- build with / without memoryview (default is without for 0.15 and earlier, with for 0.16)
If there are remaining issues, maybe we should split them into new tickets to clarify things. The original ticket description is about using
--encodings=h264
, which works fine for me. Not rgb. The most important thing is to check that 0.15 runs OK; we can worry about 0.16 later.
Okay, recap time:
Firstly,
--encodings=h264
is working flawlessly, so as far as I can tell, everything within the scope of this ticket has been fixed (unless the encoding issues I'm seeing are directly caused by the fix). Then again, it's your Trac, so I'll defer to you if you'd prefer to spin off other tickets.
Using
xpra start :13 --no-daemon --bind-tcp=0.0.0.0:2200 --start-child=xterm
and launching glxgears, I've tested every permutation I can think of using Fedora 21 as a server and Win8.1 and Cent6.4 as clients:
- 0.15.0 branch:
  - Fedora 21 server built from source
    - Cent6.4 beta client installed via yum from xpra.org/beta
    - Cent6.4 client built from 0.15.0 source
    - Win8.1 beta client build from xpra.org/beta
  - Fedora 21 server installed via yum from xpra.org/beta
    - Cent6.4 beta client installed via yum from xpra.org/beta
    - Cent6.4 client built from 0.15.0 source
    - Win8.1 beta client build from xpra.org/beta
- Trunk:
  - Fedora 21 server built from trunk source
    - Cent6.4 trunk client built from source
The only time I'm getting encoding issues is with my 0.15.0 branch server built from source. In all other instances it works fine. This includes trunk server built from source and trunk client built from source, which are working fine.
In addition, I no longer see "invalid img data" in either the 0.15.X branch or trunk.
Just for clarity:
- Server is no longer seg-faulting as of r9533
- Building from 0.15.X causes the AssertionError that was seen in comment:2
  - No error prints client side.
  - However, glxgears appears to stop drawing. Resizing it will get it to redraw once or twice, but then it stops redrawing entirely.
- Using the 0.15.0 beta server from xpra.org/beta works fine with no errors
- Building from trunk works fine with no errors
  - connecting from Win8.1 client from xpra.org/beta
  - connecting from Cent6.4 client built from 0.15.X source
Switching compressors seems to have no noticeable effect.
Building
--with-memoryview
and
--without-memoryview
has no noticeable effect in the 0.15.X branch or trunk.
OK, sounds good. The only thing I can think of is that you're hitting a compilation bug, maybe related to this I found in your xpra info:
server.build.cython=0.22.beta0
Can you try updating to 0.22 final to see if that helps?
If not, maybe you can PM me a download link of the compressed virtual machine image that you use so that I can run it here?
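Spotting a pre-release build dependency like the one quoted above can be automated. This is a minimal sketch, assuming the info output is a series of `key=value` lines shaped like `server.build.cython=0.22.beta0`. Both helper names are hypothetical, and the pre-release test is only a heuristic.

```python
import re

def parse_info(text):
    """Turn 'key=value' lines into a dict, skipping lines without '='."""
    info = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep:
            info[key.strip()] = value.strip()
    return info

def is_prerelease(version):
    """Heuristic: any letters in the version string (alpha, beta, rc) mark a pre-release."""
    return re.search(r"[A-Za-z]", version) is not None

info = parse_info("server.build.cython=0.22.beta0")
cython_version = info["server.build.cython"]
print(cython_version, is_prerelease(cython_version))  # prints: 0.22.beta0 True
```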
Built Cython from their latest release
http://cython.org/#download
in Fedora 21 and Cent6.4, and I'm still getting assert errors, and glxgears totally stops painting after a second or two.
For what it's worth, using other encodings is fine
vp9,vp8,jpeg,webp,png,h264
, so I'm only getting these errors when I have
rgb
enabled.
I'll bother Smo to see if we can get you a compressed image of the machine.
Update:
- Built up a new Fedora 21 VM and installed all the necessary dependencies, etc etc.
Building on the new VM with trunk or 0.15.X works fine. Looks like the issue is confined to that specific machine. If you still want the image to the broken machine let me know.
"If you still want the image to the broken machine let me know."
I think it would be useful to get to the bottom of this issue, so that we can prevent it in the future. And maybe add a sanity test for the rgb encoder.
@smo: can you help?
I have removed the image from our system and compressed it. I can upload it if you like, but I'm closing this ticket for now.
Issue migrated from trac ticket # 863
component: encodings | priority: critical | resolution: fixed
2015-05-16 06:00:10: antoine created the issue