raspberrypi / firmware

This repository contains pre-compiled binaries of the current Raspberry Pi kernel and modules, userspace libraries, and bootloader/GPU firmware.

mmal: Get component port state debug information #377

Closed julianscheel closed 7 years ago

julianscheel commented 9 years ago

Is there a mechanism for mmal similar to OMX_GetDebugInformation for openmax, which would let us see the state of the active mmal components and their ports? I am looking into an issue where mmal_port_flush() does not seem to work properly on the image_fx component - my internal refcounting says that one buffer is still in use by the input port. It would be quite helpful to be able to see the internal port state.

popcornmix commented 9 years ago

Yes, /opt/vc/bin/mmal_vc_diag should do what you want. Source: https://github.com/raspberrypi/userland/blob/master/interface/mmal/vc/mmal_vc_diag.c

julianscheel commented 9 years ago

Thanks, this looks very helpful :) While talking about it: Is it required to lock mmal component creation/destruction and port configuration with a mutex when it is possibly done from different threads? In VLC the decoder, filter and vout are running in different threads.

popcornmix commented 9 years ago

I believe mmal and openmax are thread safe, so you don't need additional locking. Obviously you need to protect your own structures (e.g. when accessed from a mmal callback).

julianscheel commented 9 years ago

Ok, this is what I assumed. I'm currently seeing a deadlock when I do aggressive start/stop loops. See this backtrace: http://pastie.org/private/1hnlmgt0guk8dqd8t1oklq

Thread 6 is the teardown of the decoder module, which has just called mmal_component_disable, and Thread 4 is trying to send a picture to the deinterlace input port. It looks like mmal_vc_send_message and mmal_vc_sendwait_message might be blocking each other. Could this be? Maybe @6by9 has some thoughts, too?

julianscheel commented 9 years ago

Here is a slight variation of the deadlock: http://pastie.org/private/glaqerdbule6gg2r3unlq

In this case Thread 5 and 4 seem to block each other - Thread 5 being the deinterlace filter calling mmal_port_send_buffer and Thread 4 the vout calling mmal_port_format_commit. This is due to an aspect ratio change happening right before stopping the playback.

julianscheel commented 9 years ago

And the next hit, this time the flush on image_fx does not seem to finish as expected. My internal refcounting says that 1 buffer is held by the output port still. mmal_vc_diag mmal-stats seems to confirm this:

# mmal_vc_diag mmal-stats
component       port        buffers     fps delay
ril.image_fx            0 [in ]rx   32          27.3    86637
ril.image_fx            0 [in ]tx   32          27.3    86636
ril.image_fx            0 [out]rx   65          54.6    126535
ril.image_fx            0 [out]tx   64          53.9    80008
ril.video_render        0 [in ]rx   135         39.6    507479
ril.video_render        0 [in ]tx   135         39.6    507478

Related stacktrace is here: http://pastie.org/private/vwpmjsqqptmvazvj3jfkdw This time I can't see any mmal calls which might block each other.

Any thoughts?
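
The rx/tx comparison in the mmal-stats output above can also be done mechanically. The following is a minimal Python sketch, assuming that rx counts buffers a port has received and tx counts buffers it has handed on (this reading matches the output above but is not confirmed against the mmal_vc_diag source here). The function name outstanding_buffers and the regular expression are illustrative and not part of any Raspberry Pi tool.

```python
import re

# Matches rows such as: "ril.image_fx   0 [out]rx   65   54.6   126535"
ROW = re.compile(
    r"^(?P<component>\S+)\s+(?P<port>\d+)\s+"
    r"\[(?P<dir>in|out)\s*\](?P<kind>rx|tx)\s+(?P<buffers>\d+)"
)


def outstanding_buffers(stats_text):
    """Return {(component, port, direction): rx - tx} for ports where rx != tx."""
    counts = {}
    for line in stats_text.splitlines():
        match = ROW.match(line.strip())
        if match is None:
            continue  # header, prompt line, or anything else we do not recognise
        key = (match["component"], int(match["port"]), match["dir"])
        counts.setdefault(key, {})[match["kind"]] = int(match["buffers"])
    return {
        key: c["rx"] - c["tx"]
        for key, c in counts.items()
        if "rx" in c and "tx" in c and c["rx"] != c["tx"]
    }
```

Fed the table above, this would report a difference of 1 on the ril.image_fx output port and nothing for the other ports, which is the same conclusion as the manual reading.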

6by9 commented 9 years ago

Locking around component create/destroy: it should all be internally safe for different threads to be creating and destroying components simultaneously. If multiple threads are talking to the same component, you can also use mmal_component_acquire and mmal_component_release to take an additional ref count on a component from the other threads, so it only actually destroys when both threads have released. See https://github.com/raspberrypi/userland/blob/master/interface/mmal/mmal_component.h
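
The acquire/release idea described above can be illustrated with a toy model. This is a minimal Python sketch of reference counting in general, not the MMAL implementation: the class name RefCounted and the on_destroy callback are hypothetical, and the real calls are mmal_component_acquire and mmal_component_release in C.

```python
import threading


class RefCounted:
    """Toy model: the resource is destroyed only when the last holder releases it."""

    def __init__(self, on_destroy):
        self._lock = threading.Lock()
        self._refs = 1  # the creator holds the first reference
        self._on_destroy = on_destroy

    def acquire(self):
        with self._lock:
            if self._refs == 0:
                raise RuntimeError("acquire on an already destroyed object")
            self._refs += 1

    def release(self):
        with self._lock:
            if self._refs == 0:
                raise RuntimeError("release without a matching acquire")
            self._refs -= 1
            last = self._refs == 0
        if last:
            # Run the destructor outside the lock so it cannot block other callers.
            self._on_destroy()
```

In this model a second thread calls acquire() before it starts using the object, and the destroy callback only runs after both the creator and that thread have called release().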

I thought it was possible to do the same refcounting with pools as well, but can't see it at the moment (may have been on a different dev branch - I'm sure I've used it. You can call mmal_pool_destroy on a pool before the last buffer is released, and it only actually gets destroyed when that buffer is finally released. Perhaps I'm misremembering).

Do remember that port_disable will flush back any buffers owned by the components, so ensure that your callback isn't blocked when you make the call. (Doesn't appear to be an issue from your callstacks)

On the VideoCore side, there is a single thread that brokers all MMAL requests from the ARM. A couple of longer running tasks are offloaded to worker threads, but otherwise most things are run sequentially. It may be that something has managed to wedge that broker thread, or otherwise bork VCHI. That really needs the ability to see inside the GPU. @popcornmix did we ever get anywhere with ramdumps and offline analysis for Pi? I know we had it on other platforms but memory says there was some issue with doing it on the Pi.

julianscheel commented 9 years ago

@6by9 Thanks for the feedback. If it would help for analysing things on the GPU I could prepare a SD card image containing the test-case.

6by9 commented 9 years ago

Does it need to be a full SD card image, or just a test app? I'd need to rebuild the firmware anyway to use a debug build, so if it can just be an app on top of a known rpi-update release hash then it would be easier.

popcornmix commented 9 years ago

@6by9 I've not tried to get the ramdump stuff going on Pi. Can't think why it wouldn't work. Did this use vcdbg for analysis or something else?

6by9 commented 9 years ago

@popcornmix - switching to email.

julianscheel commented 9 years ago

@6by9 let me try building VLC in recent raspbian and provide it as tarball. Was a bit tricky last time I did... Making up a minimal test case is probably a hard task in this case.

6by9 commented 9 years ago

If it needs to be a full SD card image then I can cope, just that it's easier to have an environment I know I can hack around.

julianscheel commented 9 years ago

Build is running on a clean raspbian install - I should be able to provide you with prebuilt binaries +source tomorrow.

julianscheel commented 9 years ago

I was able to finally build things properly for raspbian (using a cross toolchain linking against nfs-exported raspbian root actually). Actually the firmware from today breaks everything for mmal+vlc, because the codec component won't emit the initial format changed event at all. With mmal_vc_diag you can see that it's fed with buffers but the output port will never send one out. Is it a problem with the new firmware that we do not push any output buffers to the output port before receiving the first format change buffer? So far those status update buffers did not rely on having picture output buffers ready on the port... I can reproduce this issue on my system, btw.

If I downgrade to what I have on my systems currently (firmware from Feb 4) things are better but still odd and different to how it behaves on my system. After the first few frames have been decoded the whole system stalls for several seconds - no mmal events occur in that time and then it resumes spontaneously. It keeps running smooth until a mmal component create/destroy (ie deinterlace plugin reset on aspect ratio change) occurs. Then it stalls again.

Hoping you might see more than I do, I created a package anyway: http://jusst.de/files/vlc-raspbian.tar.xz

Use it like this:

cd /home/pi
wget http://jusst.de/files/vlc-raspbian.tar.xz
tar -xvf vlc-raspbian.tar.xz

export VLC_PLUGIN_PATH=/home/pi/vlc-raspbian/install/usr/lib/vlc/plugins
export LD_LIBRARY_PATH=/home/pi/vlc-raspbian/install/usr/lib
export PATH=/home/pi/vlc-raspbian/install/usr/bin:$PATH

vlc -vvv somefile.mp4

Additionally I packaged my aspect-ratio change source - you should ideally run this on another PC which has VLC installed (version shouldn't matter much) - it will stream a continuous mpeg-ts with an aspect ratio change every second to 239.2.2.2:10101 (feel free to change the address in the script as you like)

wget http://jusst.de/files/aspect.tar.xz
tar -xvf aspect.tar.xz
cd aspect
./stream-aspect-short.sh

You can then play the stream on the pi using:

with deinterlace: vlc -vvv udp://@239.2.2.2:10101
without deinterlace: vlc -vvv --deinterlace 0 udp://@239.2.2.2:10101

Be aware it is a mpeg2 encoded stream so you need to have the mpeg2 license activated. The case without deinterlace runs smooth after the initial stall is over. The one with deinterlace stalls over and over again because the deinterlace filter is recreated.

And finally some notes for building: The vlc directory contains my full build in the build-cross subdirectory. There are two scripts (configure.sh and build.sh) to help with configuring and building. In my case /tmp/raspbian-root was just a regular raspberry which exported its rootfs via nfs like this:

172.27.0.0/21(rw,async,no_root_squash,no_subtree_check)

The rootfs needs one fix though: /usr/lib/arm-linux-gnueabihf/libdl.so has to be recreated as a relative link instead of the absolute one it normally is in that rootfs:

sudo rm /usr/lib/arm-linux-gnueabihf/libdl.so
cd /usr/lib/arm-linux-gnueabihf/
sudo ln -s ../../../lib/arm-linux-gnueabihf/libdl.so.2 libdl.so

I used the linaro toolchain from the tools repository for building - change configure.sh to use your paths...

And finally, in case you don't experience these odd stalls on start/component creation, or find a way to fix them, you can use this python script to run the actual loop test that I use to hit the deadlocks I reported over the last few days:

wget http://jusst.de/files/stream-zap-test.tar.xz
tar -xvf stream-zap-test.tar.xz
cd stream-zap-test
./stream-loop-test.py udp://@239.2.2.2:10101

This will just restart the playback every 5 seconds, which stresses the component create/destroy cycle in addition to the permanent recreation of image_fx deinterlace due to the aspect change.
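
For readers without access to the tarball, the restart pattern described above can be approximated as follows. This is a hedged Python sketch, not the contents of stream-loop-test.py (which, judging by the traceback later in this thread, uses the Python libvlc bindings in-process). It restarts a command-line player as a child process instead, so the teardown path it exercises may differ. The function name restart_loop and its parameters are hypothetical.

```python
import subprocess
import time


def restart_loop(command, period_s=5.0, iterations=None, exit_timeout_s=10.0,
                 popen=subprocess.Popen, sleep=time.sleep):
    """Start `command`, let it run for period_s seconds, stop it, and repeat.

    Returns the 0-based iteration at which the player failed to exit within
    exit_timeout_s after terminate(), or None if every iteration completed.
    With iterations=None the loop runs until such a hang is seen.
    """
    count = 0
    while iterations is None or count < iterations:
        proc = popen(command)
        sleep(period_s)
        proc.terminate()
        try:
            proc.wait(timeout=exit_timeout_s)
        except subprocess.TimeoutExpired:
            return count  # the player did not shut down: a likely stall
        count += 1
    return None
```

A call such as restart_loop(["vlc", "-vvv", "udp://@239.2.2.2:10101"]) would then run until the player stops responding to termination. The popen and sleep parameters exist only so the loop can be exercised without a real player.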

The sources for the mmal modules in vlc reside in modules/hw/mmal/ - just in case you're interested in looking into them.

I hope this is somehow understandable :)

popcornmix commented 9 years ago

If you can identify when the firmware broke things, it would be helpful. There were vdec3 changes in https://github.com/Hexxeh/rpi-firmware/commit/0c9001b568d1ac9593925fb76795bcc25cc59558 which should be harmless, but could have changed behaviour.

Also there were some memory reductions here: https://github.com/Hexxeh/rpi-firmware/commit/a8b36a5848f1c236fc7d9f48d8fc1ea758fa5796 which no longer forces at least 7 buffers. Might be worth checking if setting more "extra buffers" on video_decode helps. (Additionally, setting "extra buffers" to 3 when using deinterlace should no longer be required).

julianscheel commented 9 years ago

@popcornmix Hexxeh/rpi-firmware@0c9001b introduces the regression for me.

popcornmix commented 9 years ago

Do files play with omxplayer with new firmware? If not can you provide a video file/player that works with older firmware and fails with newer firmware?

julianscheel commented 9 years ago

@popcornmix omxplayer seems to work well. You can use vlc to see the issue - it seems to appear with mpeg2 codec only. You can use the vlc bundle I uploaded along with the aspect ratio stream. If you need I can try to generate a more simplistic testcase for this, but unfortunately this will be by the end of next week only as I have to leave now and will be out of office until thursday. So only remote access in that time, no real development work...

deborah-c commented 9 years ago

Thanks for the detailed reports; I'll look at this as soon as possible. Unfortunately the change to the video decoder firmware at Hexxeh/rpi-firmware@0c9001b is quite large, and we don't have the same ability to regression test it at Pi Towers that used to be possible at Broadcom (we're working on it...). However, if this only affects MPEG-2, we should be able to get to the bottom of it quickly.

julianscheel commented 9 years ago

@6by9 Thank you. Regarding mpeg2 only: I wouldn't take this as proven so far - but probable. So it probably makes sense to look for mpeg2 related things first regarding the regression.

popcornmix commented 9 years ago

@julianscheel can you try latest rpi-update firmware? Includes a fix from @deborah-c for MPEG-2 decode.

julianscheel commented 9 years ago

I just did a quick remote login and it seems the decoder starts up properly again now. Thanks. So we can get back to the deadlocks. On the raspbian build I still experience these odd stalls on component startup though. Maybe one of you guys can see something within the GPU which could explain that stall in the first place? I could imagine it's related to the deadlocks I see with my rootfs, just slightly different symptomatics.

deborah-c commented 9 years ago

I found that MPEG-2 decode worked poorly using that build: it complains that the TS demuxer is an invalid module, and playing from a .VOB file had very slow (~10s) startup, and then played jerkily. I'm not sure how VLC works internally -- does it use our A/V sync, or is that done on the ARM side of the world?

The regression wasn't MPEG-2 specific, but applied to all non-H.264 streams, and failed to produce images after a resolution change. VLC appears to create the initial image pool differently from omxplayer -- the latter generated a pool the same dimension that the decoder wanted, whereas VLC asked for larger images, causing an immediate resolution change.

julianscheel commented 9 years ago

@deborah-c Ah sorry, I forgot to post the note about the dependencies. This is what I installed for doing the raspbian build:

sudo apt-get install libavcodec54 libavformat54 libdvbpsi-dev libavcodec-dev libavutil-dev libasound2-dev libtool automake libswscale-dev

libdvbpsi is what you need for the ts demux to work.

These ~10s on startup are what I see on the raspbian build as well. At first glance it did not look like an a/v sync issue - the a/v sync is handled on the ARM side with VLC, and it's using alsa for audio output. I actually overclock the Pi to 950/500/500 usually, but streams with not-so-high bandwidth should work well with lower frequencies. When playing a VOB file it might be that some module in the input chain did not work very well for you. TS files/streams are what I test with usually. So after installing the dependencies you should be able to play that aspect changing stream.

Thanks for the init analysis and fix. I wonder a bit why we create a pool with a different size initially, I will take a look into it. Maybe we can avoid it.

julianscheel commented 9 years ago

@deborah-c, @6by9 I've prepared a new build of vlc for raspbian which fixes that startup stall. The old one was missing an initialisation which was not exhibited when compiling with my buildroot toolchain and caused an arbitrary offset being applied to the timestamps. I have rebased the code against latest vlc git HEAD, which unfortunately dropped support for libdvbpsi 0.7 series, so I had to update to libdvbpsi from jessie:

sudo apt-get install -t jessie libdvbpsi9 libdvbpsi-dev

I also use gdb from jessie, because the wheezy gdb does not properly support the dwarf4 debug symbols generated by the toolchain.

Once you have updated the dependencies and VLC you should be able to use the aspect streaming script on another machine and start the testcase with

./stream-loop-test.py udp://@239.2.2.2:10101

as described in my older post. Just let this one run for a while (sometimes it runs for several hours without failing). It will change aspect ratio frequently and restart the stream every 5 seconds. At some point it will just stop. Now you can attach the debugger and should be able to see that one of the mmal components is either waiting for a mmal_port_* call to finish or is waiting for buffers to be returned. In the last case I had where it failed, I couldn't even check the mmal internal state because mmal_vc_diag mmal-stats stalled immediately as well.

julianscheel commented 9 years ago

Is the testcase working for you?

6by9 commented 9 years ago

Sorry, not had any time to look at anything Pi related. Hopefully will over the weekend.

julianscheel commented 9 years ago

That would be really great, thank you.

julianscheel commented 9 years ago

Don't want to push, but I am wondering whether the testcase allows you to reproduce the issue. Or is there anything more I can provide?

deborah-c commented 9 years ago

I've not had a chance to look at it yet, I'm afraid. We've been quite busy with some new feature work -- see https://github.com/popcornmix/omxplayer/issues/323

Once we've finished up the loose ends on that, I may be able to take a look, if @6by9 doesn't get there first.

julianscheel commented 9 years ago

@deborah-c Thanks for the info - supporting MVC is a nice thing for sure. I'll take a look at that in VLC as well once it's properly done in the firmware. Should be able to support 3D DVB broadcasts then as well. So I hope you get this done soon and can then take a look at the freezes :) - or @6by9 finds some valuable spare time for it. I'm actually out of ideas what I could do on the VLC side anymore to avoid that issue, so I fear I really rely on you looking into what's going on at the GPU side.

6by9 commented 9 years ago

Sorry, nightmare time at both work and home at the moment, and likely to get worse (new baby expected next week!). I did get as far as downloading your image over the weekend, but that was it.

julianscheel commented 9 years ago

@6by9 All the best for you :) To simplify the test setup a little bit I just verified that the issue can be reproduced by playing a file instead of the continuous live stream. I've uploaded a ts sample file which contains an aspect ratio switch: http://jusst.de/files/aspect-loop.ts So the full procedure from downloading the tarball to triggering the testcase can be this:

cd /home/pi
mkdir vlc
cd vlc

wget http://jusst.de/files/vlc-raspbian-nodb.tar.xz
tar -xvf vlc-raspbian-nodb.tar.xz

export VLC_PLUGIN_PATH=/home/pi/vlc/vlc-mmal/build-cross/install/usr/lib/vlc/plugins
export LD_LIBRARY_PATH=/home/pi/vlc/vlc-mmal/build-cross/install/usr/lib
export PATH=/home/pi/vlc/vlc-mmal/build-cross/install/usr/bin:$PATH

wget http://jusst.de/files/stream-zap-test.tar.xz
tar -xvf stream-zap-test.tar.xz
cd stream-zap-test

wget http://jusst.de/files/aspect-loop.ts
./stream-loop-test.py aspect-loop.ts

From there on just wait until the playback freezes.

note (2015-03-19): updated paths to work properly

6by9 commented 9 years ago

Have given it a quick whirl but failing so far. Having extracted your tar file, I get a vlc-mmal directory, not vlc-raspbian. Also no install directory under that, so amended to

export VLC_PLUGIN_PATH=/home/pi/vlc/vlc-mmal/build-rpi/install/usr/lib/vlc/plugins
export LD_LIBRARY_PATH=/home/pi/vlc/vlc-mmal/build-rpi/install/usr/lib
export PATH=/home/pi/vlc/vlc-mmal/build-rpi/install/usr/bin:$PATH

(I've put it all in a subdirectory called vlc just to keep my Pi a little tidy)

Fails with:

Traceback (most recent call last):
  File "./stream-loop-test.py", line 6, in <module>
    import vlc
  File "/home/pi/vlc/stream-zap-test/vlc.py", line 164, in <module>
    dll, plugin_path  = find_lib()
  File "/home/pi/vlc/stream-zap-test/vlc.py", line 109, in find_lib
    dll = ctypes.CDLL('libvlc.so.5')
  File "/usr/lib/python2.7/ctypes/__init__.py", line 365, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: /lib/arm-linux-gnueabihf/libm.so.6: version `GLIBC_2.15' not found (required by /home/pi/vlc/vlc-mmal/build-rpi/install/usr/lib/libvlccore.so.8)

A quick check isn't finding libc 2.15 in the Raspbian tree - top is 2.13. No time to look further now. What's the best way to get glibc 2.15?

julianscheel commented 9 years ago

I fear I packaged the wrong build. Just let me check that quickly.

julianscheel commented 9 years ago

@6by9 I actually packaged my full build tree, which I didn't want to. Nevertheless it also contains a proper build for raspbian, but that one is in the build-cross subfolder. build-rpi subfolder is linked against my buildroot system.

So the correct environment in your case should be

export VLC_PLUGIN_PATH=/home/pi/vlc/vlc-mmal/build-cross/install/usr/lib/vlc/plugins
export LD_LIBRARY_PATH=/home/pi/vlc/vlc-mmal/build-cross/install/usr/lib
export PATH=/home/pi/vlc/vlc-mmal/build-cross/install/usr/bin:$PATH

Does it work with that one for you? (Keep in mind you need libdvbpsi from jessie to be installed)

6by9 commented 9 years ago

Thanks. I will try again if I get a chance tonight. I had looked at the libdvbpsi thing before - something objected but I can't remember what. Will report back if I can't get it running.

julianscheel commented 9 years ago

Alright, thank you. My apt preferences look like this:

Package: *
Pin: release n=wheezy
Pin-Priority: 900

Package: *
Pin: release n=jessie
Pin-Priority: 300

Package: *
Pin: release o=Raspbian
Pin-Priority: -10

And I installed the new libdvbpsi using:

sudo apt-get install -t jessie libdvbpsi9 libdvbpsi-dev

julianscheel commented 9 years ago

And finally one more note: as we currently use quite a lot of images for non-optimal buffering, you might have to increase GPU memory. I use 196MB in my setup.

julianscheel commented 9 years ago

Just in case it could be helpful for you, I just got this hung task timeout backtrace in dmesg on another occurrence of the issue:

[90485.346668] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[90485.364192] stream-loop-tes D c03af8f4     0 19905  18765 0x00000004
[90485.380726] Backtrace:
[90485.398705] [<c03af730>] (__schedule) from [<c03afd08>] (schedule+0x38/0x84)
[90485.419876]  r10:00000000 r9:d3533800 r8:c055cbec r7:d20abd9c r6:7fffffff r5:d35ab120
[90485.439394]  r4:d20aa000
[90485.452373] [<c03afcd0>] (schedule) from [<c03b2648>] (schedule_timeout+0x168/0x1b8)
[90485.470090] [<c03b24e0>] (schedule_timeout) from [<c03b1920>] (__down+0x94/0xcc)
[90485.487857]  r10:00000000 r9:d3533800 r8:c055cbec r7:d20abd9c r6:00000002 r5:d35ab120
[90485.506520]  r4:d20aa000
[90485.518857] [<c03b188c>] (__down) from [<c0044dd0>] (down+0x80/0x84)
[90485.535107]  r6:d3542a00 r5:a0000113 r4:d20aa000
[90485.560596] [<c0044d50>] (down) from [<c0259dd8>] (vchiq_release+0x100/0x31c)
[90485.578164]  r5:d3533000 r4:d20aa000
[90485.592283] [<c0259cd8>] (vchiq_release) from [<c00c1eb8>] (__fput+0x88/0x200)
[90485.609736]  r10:00000000 r9:cdccd7a8 r8:d35f90f0 r7:d305ce58 r6:cdccd7a0 r5:00000008
[90485.629432]  r4:d35f7e68
[90485.641559] [<c00c1e30>] (__fput) from [<c00c2088>] (____fput+0x10/0x14)
[90485.657843]  r10:d214d214 r9:418004fc r8:d20aa000 r7:00000000 r6:c052a7d0 r5:d35ab120
[90485.675969]  r4:d35ab49c
[90485.688476] [<c00c2078>] (____fput) from [<c0035bfc>] (task_work_run+0x90/0xc0)
[90485.705464] [<c0035b6c>] (task_work_run) from [<c0020cf4>] (do_exit+0x270/0x8fc)
[90485.722590]  r7:d214d1e0 r6:d35ab120 r5:c054b798 r4:00000002
[90485.739546] [<c0020a84>] (do_exit) from [<c00214b4>] (do_group_exit+0x44/0xd4)
[90485.757326]  r7:d21d2540
[90485.770149] [<c0021470>] (do_group_exit) from [<c002aab0>] (get_signal+0x178/0x60c)
[90485.788125]  r5:d20abedc r4:d20aa000
[90485.802170] [<c002a938>] (get_signal) from [<c03acaec>] (do_signal+0x8c/0x3cc)
[90485.819712]  r10:00000000 r9:d20aa000 r8:b6cf7118 r7:b6cf711c r6:fffffffc r5:00000000
[90485.848988]  r4:d20abfb0
[90485.863295] [<c03aca60>] (do_signal) from [<c0011144>] (do_work_pending+0xa8/0xec)
[90485.882102]  r10:00000000 r8:c000ea44 r7:d20abfb0 r6:d20aa000 r5:d20aa000 r4:b4d4e1c8
[90485.902982] [<c001109c>] (do_work_pending) from [<c000e8e0>] (work_pending+0xc/0x20)
[90485.921384]  r8:c000ea44 r7:00000036 r6:ab5f0b70 r5:ad2f0d20 r4:b4d4e1c8 r3:b4acd9c4

julianscheel commented 9 years ago

@6by9 I've just uploaded a new tarball: http://jusst.de/files/vlc-raspbian-nodb.tar.xz It contains a build without vlc debug enabled, which reduces the amount of debug printing, etc and usually triggers the error case much faster than the debug build does. This time I properly stripped out all unrelated build trees and even the git history to shrink the tarball to a more pleasant size.

julianscheel commented 9 years ago

@deborah-c How are you proceeding with the MVC feature work? Any chance you can take a look at this issue soon? - I fear @6by9 is busy with a new baby? :)

deborah-c commented 9 years ago

We're making progress, but as you'll probably appreciate, getting from "decoder which can decode correct streams correctly" to "decoder that you can't kill with tortured input" takes a while... Since there's also no support in public domain code for MVC, there's quite a bit of work for us to do at container level, too.

6by9 commented 9 years ago

No baby as yet, just mother-in-law around :-/ I did grab your image, but not done anything with it yet.

julianscheel commented 9 years ago

@deborah-c Oh, yes I can fully understand that :) @6by9 Alright, thanks for the update

Actually I am currently experimenting with moving the points where VLC waits for pictures, which gave me a configuration that seems to work for a simple codec->vout chain at least. codec->deinterlace->vout is more challenging though. I'll experiment a bit more, but it's a lot of guesswork as long as I don't understand what exactly causes the deadlocks.

julianscheel commented 9 years ago

I did a lot more testing in the meantime and it seems a reliable way to kill the firmware (in a way that even calling simple things like vcgencmd version would not work afterwards, but just freeze) is a race between mmal_port_send_buffer and mmal_component_disable (I think mmal_port_disable as well). The component being disabled does not have to be the same as the component whose port the buffer is sent to, but it might be relevant that the buffer which is sent to the port belongs to the component which is to be disabled. The relevant parts of the stack seem to be:

Thread 6 (Thread 0xb22d8460 (LWP 1292)):
#0  0xb6dc204c in do_futex_wait () from /lib/libpthread.so.0
#1  0xb6dc2108 in sem_wait@@GLIBC_2.4 () from /lib/libpthread.so.0
#2  0xb4d3ae78 in vcos_semaphore_wait (sem=0xb4d4d468 <client+36>)
    at /home/julian/dev/nv/buildroot/build/nvr-101/rootfs-debug/build/rpi-userland-6f122099ca3ef1115c740f927a64ccf54ca576ba/build/inc/interface/vcos/vcos_platform.h:254
#3  0xb4d3d184 in mmal_vc_sendwait_message (client=0xb4d4d444 <client>, msg_header=0xb22d7b2c, size=28, msgid=7, 
    dest=0xb22d7b48, destlen=0xb22d7b28, send_dummy_bulk=0)
    at /home/julian/dev/nv/buildroot/build/nvr-101/rootfs-debug/build/rpi-userland-6f122099ca3ef1115c740f927a64ccf54ca576ba/interface/mmal/vc/mmal_vc_client.c:593
#4  0xb4d40584 in mmal_vc_component_disable (component=0xb6100600)
    at /home/julian/dev/nv/buildroot/build/nvr-101/rootfs-debug/build/rpi-userland-6f122099ca3ef1115c740f927a64ccf54ca576ba/interface/mmal/vc/mmal_vc_api.c:714
#5  0xb4d76a34 in mmal_component_disable (component=0xb6100600)
    at /home/julian/dev/nv/buildroot/build/nvr-101/rootfs-debug/build/rpi-userland-6f122099ca3ef1115c740f927a64ccf54ca576ba/interface/mmal/core/mmal_component.c:429
#6  0xb4deb134 in CloseDecoder (dec=<error reading variable: value has been optimized out>)
    at ../../../../modules/hw/mmal/codec.c:271

Which is the decoding thread shutting down and thus calling mmal_component_disable. Along with this:

Thread 4 (Thread 0xb62ed460 (LWP 1296)):
#0  0xb6cbd11c in ioctl () from /lib/libc.so.6
#1  0xb4da065c in vchiq_queue_bulk_transmit (handle=2887, data=0xacbf0e50, size=144, userdata=0xadaf5534)
    at /home/julian/dev/nv/buildroot/build/nvr-101/rootfs-debug/build/rpi-userland-6f122099ca3ef1115c740f927a64ccf54ca576ba/interface/vchiq_arm/vchiq_lib.c:412
#2  0xb4d3d518 in mmal_vc_send_message (client=0xb4d4d444 <client>, msg_header=0xadaf5534, size=296, 
    data=0xacbf0e50 "", data_size=144, msgid=11)
    at /home/julian/dev/nv/buildroot/build/nvr-101/rootfs-debug/build/rpi-userland-6f122099ca3ef1115c740f927a64ccf54ca576ba/interface/mmal/vc/mmal_vc_client.c:660
#3  0xb4d40374 in mmal_vc_port_send (port=0xaccefc70, buffer=0xab248f00)
    at /home/julian/dev/nv/buildroot/build/nvr-101/rootfs-debug/build/rpi-userland-6f122099ca3ef1115c740f927a64ccf54ca576ba/interface/mmal/vc/mmal_vc_api.c:690
#4  0xb4d70ddc in mmal_port_send_buffer (port=0xaccefc70, buffer=0xab248f00)
    at /home/julian/dev/nv/buildroot/build/nvr-101/rootfs-debug/build/rpi-userland-6f122099ca3ef1115c740f927a64ccf54ca576ba/interface/mmal/core/mmal_port.c:764
#5  0xb177450c in deinterlace (filter=0xaafcccd0, picture=0xaaff0798)
    at ../../../../modules/hw/mmal/deinterlace.c:484

Which is the filter thread pushing a picture to the image_fx port. For every picture being used I now hold a reference to the owning component, so that it would not be destroyed before the picture is freed - but as the stall already happens on mmal_component_disable I don't think it is an issue with the component's lifecycle.
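
While investigating, one way a client could rule out the overlap between a buffer send and a component disable is to serialise the two on its own side. The following is a minimal Python model of that idea, offered as an assumption-laden sketch and not as a confirmed fix or a required usage pattern for MMAL. The class GuardedComponent and its two callbacks are hypothetical stand-ins for wrappers around mmal_port_send_buffer and mmal_component_disable.

```python
import threading


class GuardedComponent:
    """Toy model: a send and a disable never run concurrently, and sends
    after a disable are rejected instead of reaching the component."""

    def __init__(self, send_buffer, disable):
        self._lock = threading.Lock()
        self._enabled = True
        self._send_buffer = send_buffer  # stand-in for the real send call
        self._disable = disable          # stand-in for the real disable call

    def send(self, buffer):
        with self._lock:
            if not self._enabled:
                return False  # caller keeps ownership of the buffer
            self._send_buffer(buffer)
            return True

    def disable(self):
        with self._lock:
            if not self._enabled:
                return
            self._enabled = False
            self._disable()
```

Note that the lock is held across the disable call. If disabling waits for buffers to come back through a callback, that callback must not take the same lock, otherwise the client would deadlock by itself.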

@6by9 @deborah-c @luked99 Do you guys have any thoughts on conditions that have to be avoided on the userspace side wrt the multithreaded access of the components? Or any clues on what might go on in the GPU?

luked99 commented 9 years ago

Does vcdbg have support for reporting logging messages and asserts?

i.e.

$ vcdbg log msg
$ vcdbg log assert

One thing I'm puzzled by here is that this is doing a VCHIQ bulk transmit, i.e. copying the pixels. We want to avoid that if we can because (a) copying is bad and (b) bulk transfers have in the past been somewhat problematic. Would it be possible to make the filter->image_fx transfers use VCSM (as per previous discussion on the other thread) by setting the "zero copy" MMAL flag on that port?

julianscheel commented 9 years ago

@luked99 In fact the components are configured to opaque mode, so they should only copy the opaque handles. vcdbg works in general; let me see if it works once the system is stalled.

6by9 commented 9 years ago

Release builds so no assert messages. VCSM was only added to the Pi kernel relatively recently (November time IIRC), hence zero copy isn't widely used. Opaque buffers shouldn't need them anyway.