Open alexsee75 opened 5 years ago
PR #580 should fix this.
I've tested the patch on fully updated Raspbian jessie, stretch and buster on a Compute Module 3 by cloning the git repository and running ./buildme. Usage of raspivid* without -3d works perfectly.
However, when using the -3d sbs option with raspividyuv or raspivid, the program freezes after sending a small number of frames (in some cases none or a single frame; any frame that is received looks good and shows both cameras). Ctrl+C does not stop it: mmal prints "Aborting program", but the program does not abort, and further Ctrl+C presses produce no additional message. All of this works without the -3d option. A reboot is necessary to make the cameras, vcgencmd, etc. usable again, and ps shows raspivid* as [defunct]. However the same also happens with raspivid using H264 output so this bug might predate these changes.
During the freeze no messages appear in dmesg, syslog or messages. However, about two minutes after trying to abort the program with Ctrl+C, I get this output in dmesg. It might show where it hangs.
[ 243.674598] INFO: task kworker/0:2:133 blocked for more than 120 seconds.
[ 243.674611] Tainted: G C 4.19.66-v7+ #1253
[ 243.674616] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 243.674624] kworker/0:2 D 0 133 2 0x00000000
[ 243.674659] Workqueue: events get_values_poll [raspberrypi_hwmon]
[ 243.674697] [<80830934>] (__schedule) from [<80830fa4>] (schedule+0x50/0xa8)
[ 243.674715] [<80830fa4>] (schedule) from [<80834f8c>] (schedule_timeout+0x200/0x428)
[ 243.674734] [<80834f8c>] (schedule_timeout) from [<80831c14>] (wait_for_common+0xd4/0x1b0)
[ 243.674753] [<80831c14>] (wait_for_common) from [<80831d10>] (wait_for_completion+0x20/0x24)
[ 243.674774] [<80831d10>] (wait_for_completion) from [<806ab408>] (rpi_firmware_transaction+0x78/0xd0)
[ 243.674794] [<806ab408>] (rpi_firmware_transaction) from [<806ab5a0>] (rpi_firmware_property_list+0x140/0x2a8)
[ 243.674812] [<806ab5a0>] (rpi_firmware_property_list) from [<806ab784>] (rpi_firmware_property+0x7c/0xfc)
[ 243.674833] [<806ab784>] (rpi_firmware_property) from [<7f2590c0>] (get_values_poll+0x4c/0x15c [raspberrypi_hwmon])
[ 243.674868] [<7f2590c0>] (get_values_poll [raspberrypi_hwmon]) from [<8013bf0c>] (process_one_work+0x170/0x458)
[ 243.674886] [<8013bf0c>] (process_one_work) from [<8013c250>] (worker_thread+0x5c/0x5a4)
[ 243.674904] [<8013c250>] (worker_thread) from [<8014253c>] (kthread+0x138/0x168)
[ 243.674923] [<8014253c>] (kthread) from [<801010ac>] (ret_from_fork+0x14/0x28)
[ 243.674931] Exception stack(0xb34bbfb0 to 0xb34bbff8)
[ 243.674941] bfa0: 00000000 00000000 00000000 00000000
[ 243.674954] bfc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[ 243.674965] bfe0: 00000000 00000000 00000000 0000000
If you need any additional information, please let me know.
The stack trace just says that the GPU has stopped responding to mailbox calls.
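For anyone reading a similar hung-task report, here is a rough Python sketch that pulls the call chain out of the dmesg lines and checks whether the worker is parked inside rpi_firmware_transaction, i.e. waiting on the firmware mailbox. The helper names (call_chain, waits_on_firmware) are made up for illustration, the sample lines are copied from the trace in this thread, and the regex assumes the ARM backtrace format shown there.

```python
import re

# Matches the callee side of an ARM backtrace line such as
#   [<80830934>] (__schedule) from [<80830fa4>] (schedule+0x50/0xa8)
# and tolerates a trailing module tag like "[raspberrypi_hwmon]".
FRAME_RE = re.compile(r"\[<[0-9a-f]+>\] \((\w+)(?: \[\w+\])?\) from")

# A few lines copied from the dmesg output posted in this issue.
SAMPLE = """\
[ 243.674697] [<80830934>] (__schedule) from [<80830fa4>] (schedule+0x50/0xa8)
[ 243.674774] [<80831d10>] (wait_for_completion) from [<806ab408>] (rpi_firmware_transaction+0x78/0xd0)
[ 243.674794] [<806ab408>] (rpi_firmware_transaction) from [<806ab5a0>] (rpi_firmware_property_list+0x140/0x2a8)
[ 243.674868] [<7f2590c0>] (get_values_poll [raspberrypi_hwmon]) from [<8013bf0c>] (process_one_work+0x170/0x458)
"""


def call_chain(dmesg_text):
    """Return function names from a backtrace, innermost frame first."""
    return [match.group(1) for match in FRAME_RE.finditer(dmesg_text)]


def waits_on_firmware(chain):
    """True if the task is blocked inside a firmware mailbox transaction."""
    return "rpi_firmware_transaction" in chain
```

Calling waits_on_firmware(call_chain(SAMPLE)) on the lines above reports that the hwmon worker is stuck in a firmware call, which matches the reading that the GPU is no longer answering.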
> However the same also happens with raspivid using H264 output so this bug might predate these changes.
I suspect so. There is an open issue at https://github.com/raspberrypi/firmware/issues/1253 saying that stereoscopic has a regression, but none of us have had a chance to dig into it recently.
Daft thought on that though - the AWB changes were in about the right timeframe for that regression. Could you try running sudo vcdbg set awb_mode 0 BEFORE running raspivid[yuv] and see if it is happy? That reverts to the older AWB algorithm and would be a useful data point.
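The test order suggested above could be scripted roughly like this. It is only a sketch: the function names, the 5 second duration and the output file name are arbitrary choices for illustration, and it assumes vcdbg and raspividyuv are on the PATH of a freshly rebooted Pi. Nothing runs until run_with_workaround() is called.

```python
import subprocess


def workaround_commands(app="raspividyuv", stereo="sbs", millis=5000, output="test.yuv"):
    """Build the two commands in the order that matters: AWB switch first."""
    return [
        # Revert to the older AWB algorithm before the camera is opened.
        ["sudo", "vcdbg", "set", "awb_mode", "0"],
        # Then start the stereoscopic capture (-3d sbs as used in this issue).
        [app, "-3d", stereo, "-t", str(millis), "-o", output],
    ]


def run_with_workaround(runner=subprocess.run, **kwargs):
    """Run both commands in sequence, stopping if the first one fails."""
    for command in workaround_commands(**kwargs):
        runner(command, check=True)
```

The runner parameter exists only so the sequence can be dry-run (for example with print-like stubs) on a machine without the camera stack.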
I've patched an older RaspiVidYUV ending at commit 359099dc38a0fe41e71dee56878e758912b2220c and 3d mode works perfectly! So the bug must be somewhere between there and HEAD.
So, success! Perhaps we should close this and I'll add the awb_mode test to firmware#1253?
Going back to 359099dc38a0fe4 in userland fixes it, even with the latest firmware? Shame that's gone all the way back to Nov 2017. Can you try on some intermediate commits to try and isolate where things have gone wrong? There are very few changes that look plausible. The moving of stuff into common functions from Nov/Dec 2018 are the ones I'd be looking at first.
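Trying intermediate commits amounts to a binary search over the history, which is what git bisect automates. A minimal Python sketch of the idea follows; the commit labels and the is_good callback are hypothetical stand-ins, where is_good would mean "build that commit and confirm -3d sbs streams without freezing".

```python
def first_bad(commits, is_good):
    """Find the first bad commit by binary search.

    `commits` is ordered oldest to newest; commits[0] is assumed known good
    and commits[-1] known bad, with a single good-to-bad transition between.
    """
    lo, hi = 0, len(commits) - 1
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_good(commits[mid]):
            lo = mid   # regression was introduced after mid
        else:
            hi = mid   # regression is at mid or earlier
    return commits[hi]
```

With N candidate commits this needs about log2(N) builds, which matters when every failed test ends in a reboot.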
I'm not sure. Doesn't ./buildme also install the then-current (old) firmware? So do I need to run cmake manually, skip make install, and run the binaries directly?
Yes, I'll try some more recent commits. This was the one we used in 2017 for a project and was known to work...
./buildme updates the userland libraries only, not the firmware or kernel.
Firmware version can be checked with vcgencmd version.
Kernel version can be checked with uname -a
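When comparing test results it helps to record both versions alongside each run. The sketch below, with made-up helper names, extracts the firmware hash and kernel release from text in the shape reported in this thread; it assumes the hash follows the word "version" and that the kernel release is the third field of uname -a.

```python
import re

# Sample outputs as reported in this thread.
FIRMWARE_SAMPLE = (
    "Aug 15 2019 12:08:48 Copyright (c) 2012 Broadcom version "
    "0e6daa5106dd4164474616408e0dc24f997ffcf3 (clean) (release) (start_x)"
)
KERNEL_SAMPLE = (
    "Linux raspberrypi 4.19.66-v7+ #1253 SMP Thu Aug 15 11:49:46 BST 2019 "
    "armv7l GNU/Linux"
)


def firmware_hash(vcgencmd_output):
    """Return the firmware git hash from `vcgencmd version` text, or None."""
    match = re.search(r"\bversion ([0-9a-f]{7,40})\b", vcgencmd_output)
    return match.group(1) if match else None


def kernel_release(uname_output):
    """Return the kernel release (third field of `uname -a`), or None."""
    fields = uname_output.split()
    return fields[2] if len(fields) > 2 else None
```

On a Pi the two input strings would come from running vcgencmd version and uname -a, for example via subprocess.run(..., capture_output=True, text=True).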
Sorry, my mistake: it was only fixed with old user libraries and old firmware. The old user libraries (from 2017) plus new(er) firmware show the same bug.
Aug 15 2019 12:08:48 Copyright (c) 2012 Broadcom version 0e6daa5106dd4164474616408e0dc24f997ffcf3 (clean) (release) (start_x)
Linux raspberrypi 4.19.66-v7+ #1253 SMP Thu Aug 15 11:49:46 BST 2019 armv7l GNU/Linux
However sudo vcdbg set awb_mode 0 fixes it for the old user libraries. So perhaps that can also be used as a workaround for the new user libraries.
> However sudo vcdbg set awb_mode 0 fixes it for the old user libraries. So perhaps that can also be used as workaround for the new user libraries.
Is that old user libraries plus new firmware? In which case that does seem to have narrowed down the regression to initially the AWB algorithm, and then potentially the userspace apps as well.
So far I can confirm after several tests: user libraries 359099d plus firmware...
Aug 15 2019 12:08:48 Copyright (c) 2012 Broadcom version 0e6daa5106dd4164474616408e0dc24f997ffcf3 (clean) (release) (start_x)
...plus kernel...
Linux raspberrypi 4.19.66-v7+ #1253 SMP Thu Aug 15 11:49:46 BST 2019 armv7l GNU/Linux
...works when running vcdbg set awb_mode 0 after reboot before raspividyuv/raspivid, and shows the same bug when not running vcdbg before raspividyuv.
Comment just left on #1253 that selecting the old AWB algorithm makes things work for them too. Now to dig into the AWB algorithm again....
@6by9 Is there any update on this AWB algo fix?
Sorry, it's not made it to the top of the priorities list for any of us as yet.
It seems that RaspiVidYUV.cpp and RaspiStillYUV.cpp silently ignore requests for stereoscopic output (-3d sbs, -3d tb) and instead only output image data from a single camera. This is probably because stereo_mode is not set up in the code (as it is in RaspiVid and RaspiStill).
Please add stereoscopic support to RaspiVidYUV and RaspiStillYUV!