Motion-Project / motion

Motion, a software motion detector. Home page: https://motion-project.github.io/
GNU General Public License v2.0

calling avcodec_send_frame locks up with ffmpeg h264_omx encoder #433

Closed jasaw closed 7 years ago

jasaw commented 7 years ago
  1. motion version 280141f4178f2d656a2af4c5b3d3c430f41646a1, ffmpeg release 3.3.2.
  2. compiled from source.
  3. ran as part of MotionEye initially, then used config generated by MotionEye to run motion as standalone.
  4. tested with mmal and v4l2.
  5. ARM (only reproducible on multiple models of Raspberry Pi).
  6. tested both Raspbian & MotionEyeOS.
  7. GPU firmware version: 3202f1b16896029f9da1b074b0912177e8960b52

How to trigger avcodec_send_frame lock up:

Is it locking up because of how motion interfaces with libav? Does anyone have any clue as to what might be happening?

I've checked the GPU status, and everything was OK.

vcgencmd get_mem <type>
Where type is:
arm: total memory assigned to arm
gpu: total memory assigned to gpu
malloc_total: total memory assigned to gpu malloc heap
malloc: free gpu memory in malloc heap
reloc_total: total memory assigned to gpu relocatable heap
reloc: free gpu memory in relocatable heap

vcgencmd get_throttled
0: under-voltage
1: arm frequency capped
2: currently throttled
16: under-voltage has occurred
17: arm frequency capped has occurred
18: throttling has occurred
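
The bit list above can be turned into a small check. This is a minimal sketch, assuming vcgencmd prints a line of the form "throttled=0x50005"; the function and macro names are hypothetical and not part of motion or the firmware tools.

```c
#include <stdint.h>
#include <stdio.h>

/* Bit positions as listed for `vcgencmd get_throttled`. */
#define THROTTLED_UNDER_VOLTAGE_NOW  (UINT32_C(1) << 0)
#define THROTTLED_FREQ_CAPPED_NOW    (UINT32_C(1) << 1)
#define THROTTLED_THROTTLED_NOW      (UINT32_C(1) << 2)
#define THROTTLED_UNDER_VOLTAGE_SEEN (UINT32_C(1) << 16)
#define THROTTLED_FREQ_CAPPED_SEEN   (UINT32_C(1) << 17)
#define THROTTLED_THROTTLED_SEEN     (UINT32_C(1) << 18)

/* Parse a line such as "throttled=0x50005" (assumed output format).
 * Returns 0 on success and stores the flags, -1 if the line does not match. */
int throttled_parse(const char *line, uint32_t *flags)
{
    unsigned int value;

    if (sscanf(line, "throttled=%x", &value) != 1)
        return -1;
    *flags = (uint32_t)value;
    return 0;
}

/* Nonzero if under-voltage is active now or has occurred. */
int throttled_power_problem(uint32_t flags)
{
    return (flags & (THROTTLED_UNDER_VOLTAGE_NOW |
                     THROTTLED_UNDER_VOLTAGE_SEEN)) != 0;
}
```

A value of 0 means none of the six conditions is set, which is the basis for ruling out the power supply when the value stays at zero for a whole test run.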

Config file from my Raspbian system below.

thread-1.conf

ffmpeg_output_movies on
height 720
stream_quality 5
threshold 28800
quality 85
noise_level 31
ffmpeg_output_debug_movies off
pre_capture 1
noise_tune on
smart_mask_speed 0
stream_maxrate 1
output_pictures off
hue 0
saturation 0
stream_localhost off
ffmpeg_variable_bitrate 75
ffmpeg_video_codec mp4
text_changes off
movie_filename %Y-%m-%d/%H-%M-%S
auto_brightness off
stream_port 8081
rotate 0
brightness 0
lightswitch 0
framerate 20
emulate_motion off
snapshot_filename
despeckle_filter
snapshot_interval 0
stream_auth_method 0
stream_motion off
target_dir /var/lib/motioneye/Camera1
text_double on
post_capture 100
stream_authentication user:
output_debug_pictures off
on_picture_save /usr/local/lib/python2.7/dist-packages/motioneye/scripts/relayevent.sh "/etc/motioneye/motioneye.conf" picture_save %t %f
on_movie_end /usr/local/lib/python2.7/dist-packages/motioneye/scripts/relayevent.sh "/etc/motioneye/motioneye.conf" movie_end %t %f
text_left Camera1
picture_filename
locate_motion_style redbox
locate_motion_mode off
contrast 0
videodevice /dev/video0
max_movie_time 0
on_event_end /usr/local/lib/python2.7/dist-packages/motioneye/scripts/relayevent.sh "/etc/motioneye/motioneye.conf" stop %t
text_right %Y-%m-%d\n%T
on_event_start /usr/local/lib/python2.7/dist-packages/motioneye/scripts/relayevent.sh "/etc/motioneye/motioneye.conf" start %t
event_gap 30
minimum_motion_frames 15
mask_file
width 1280
Mr-Dave commented 7 years ago

You are the first person I have heard of using the omx encoder with ffmpeg. It really isn't part of our support / testing cycle, so I am not sure of the resolution. To break the deadlock, an ffmpeg callback/interrupt function would need to be added, like the one that exists in netcam_rtsp.

jasaw commented 7 years ago

I've been trying to debug this for a few days and got nowhere. What I don't understand is that it works flawlessly at 800 x 600 resolution. I've been running it for a few weeks. It works at 1920 x 1080 resolution too, but that didn't get as much testing. At 1280 x 720 resolution, it locks up. I ran it with strace -f, but did not see anything obvious. vcdbg log didn't show any error either. What else can I try?

jogu commented 7 years ago

If the problem is that a thread is getting stuck, wait for this to happen, then run gdb -p <pid>, then 'thread apply all backtrace' (I may have misremembered the syntax slightly), paste the output into a gist/pastebin and share the URL here.

jasaw commented 7 years ago

The gdb 'thread apply all backtrace' output is here, but gdb hit an internal-error.

I did more debugging and found that it's actually stuck at OMX_EmptyThisBuffer function in ffmpeg libavcodec/omx.c. sudo vcdbg log assert did not show anything useful.

jogu commented 7 years ago

It may well be that you're suffering from an ffmpeg or kernel/GPU driver issue that you will need help from the Raspberry Pi people with. You should also make sure your PSU is sufficient (i.e. it is an official PSU for that model of Pi, with at least the required current capacity).

You could possibly try a debug build of motion to see if gdb is happier with that; if it's not then I suspect your system has a broken gdb or compiler and it's going to be difficult to get more info.

jasaw commented 7 years ago

I can confirm that it's definitely not a power supply issue, because I'm seeing it on different models of Raspberry Pi with different power supplies, including the official one. Also, vcgencmd get_throttled stayed at zero during the entire test, which rules out a power supply issue.

I'm running latest Raspbian Jessie, and I'm disappointed that it came with a broken gdb.

I'm not sure if I'm hitting an ffmpeg or GPU firmware bug, or motion is not using the ffmpeg API "correctly". I'll debug further, see what else I can find.

jasaw commented 7 years ago

Running motion with extpipe to ffmpeg h264_omx is stable at various resolutions.
motion --> extpipe --> ffmpeg (h264_omx encoder)

extpipe config:

use_extpipe on
extpipe ffmpeg -y -f rawvideo -pix_fmt yuv420p -video_size %wx%h -framerate %fps -i pipe:0 -c:v h264_omx -profile:v high -b:v 3000000 -f mp4 %f.mp4

Running motion encoding via the ffmpeg C API hangs at 1280 x 720 resolution, with the exact same encoder configuration (bitrate, profile, etc.) as the extpipe version.
motion --> ffmpeg API (h264_omx encoder)

Edit: Another interesting observation is that the extpipe version achieves a higher frame rate than the API version.

jasaw commented 7 years ago

Taking out the input_zerocopy setting from the omx_encode_init function in ffmpeg's libavcodec/omx.c seems to stop the lockup.

#if CONFIG_OMX_RPI
    s->input_zerocopy = 1;
#endif
tosiara commented 7 years ago

Did you report that to ffmpeg?

jasaw commented 7 years ago

Not yet. I haven't figured out exactly whose fault it is, motion or ffmpeg. I need to find out who's supposed to manage which buffer, when it can be freed, that sort of thing.

If someone has ffmpeg knowledge, please chime in.

jasaw commented 7 years ago

It appears that whenever the zerocopy condition in omx.c is satisfied (contiguous planes and stride alignment), the call to OMX_EmptyThisBuffer hangs. 1280 x 720 resolution images happen to meet that condition.
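
As an illustration of that condition, here is a simplified sketch of a contiguity and stride check for YUV420P data. It is not the code from omx.c: the struct and function names are hypothetical stand-ins, and it assumes even dimensions with chroma planes at half the luma stride and half the slice height.

```c
#include <stdint.h>

/* Hypothetical, simplified stand-in for a frame holding YUV420P planes. */
struct yuv420_frame {
    const uint8_t *data[3]; /* Y, U, V plane start addresses */
    int linesize[3];        /* bytes per row for each plane */
};

/* Returns 1 if the frame already has the layout the encoder expects:
 * rows stored with the expected stride, and the U and V planes placed
 * directly after a Y plane of stride * slice_height bytes.
 * Returns 0 otherwise, in which case the data would need to be copied. */
int can_zerocopy(const struct yuv420_frame *f, int stride, int slice_height)
{
    int chroma_stride = stride / 2;
    uintptr_t base   = (uintptr_t)f->data[0];
    uintptr_t want_u = base + (uintptr_t)stride * (uintptr_t)slice_height;
    uintptr_t want_v = want_u +
        (uintptr_t)chroma_stride * (uintptr_t)(slice_height / 2);

    return f->linesize[0] == stride &&
           f->linesize[1] == chroma_stride &&
           f->linesize[2] == chroma_stride &&
           (uintptr_t)f->data[1] == want_u &&
           (uintptr_t)f->data[2] == want_v;
}
```

With a tightly packed 1280 x 720 frame and a slice height of 720 the check passes, so the buffer would be handed over without a copy. That is the path on which the hang shows up.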

Still don't know why it hangs.

tosiara commented 7 years ago

Can you share the exact patch you use to force the h264_omx encoder, so I can try to reproduce it?

jasaw commented 7 years ago

You'll need to compile ffmpeg with omx-rpi enabled. I added these options to the ffmpeg configure command: --enable-omx --enable-omx-rpi --enable-mmal.

motion patch to choose h264_omx encoder.

diff --git a/ffmpeg.c b/ffmpeg.c
index 71685a1..07ce41c 100644
--- a/ffmpeg.c
+++ b/ffmpeg.c
@@ -485,7 +485,13 @@ static int ffmpeg_set_codec(struct ffmpeg *ffmpeg){
     char errstr[128];
     int chkrate;

-    ffmpeg->codec = avcodec_find_encoder(ffmpeg->oc->oformat->video_codec);
+    ffmpeg->codec = NULL;
+    if (ffmpeg->oc->oformat->video_codec == AV_CODEC_ID_H264)
+        ffmpeg->codec = avcodec_find_encoder_by_name("h264_omx");
+    else if (ffmpeg->oc->oformat->video_codec == AV_CODEC_ID_MPEG4)
+        ffmpeg->codec = avcodec_find_encoder_by_name("mpeg4_omx");
+    if (!ffmpeg->codec)
+        ffmpeg->codec = avcodec_find_encoder(ffmpeg->oc->oformat->video_codec);
     if (!ffmpeg->codec) {
         MOTION_LOG(ERR, TYPE_ENCODER, NO_ERRNO, "Codec %s not found", ffmpeg->codec_name);
         ffmpeg_free_context(ffmpeg);
jasaw commented 7 years ago

I got some info from 6by9 (one of the Raspberry Pi guys). It explains why ffmpeg needs to copy the frame at 800 x 600 and 1920 x 1080 resolutions, which is why those resolutions avoid the lockup.

800x600: 600 is not a multiple of 16 for nSliceHeight (608 is), so it needs a copy.
1920x1080: 1080 is again not a multiple of 16 for nSliceHeight (1088 is), so it needs a copy.

I haven't had time to investigate exactly why it locks up. Maybe ffmpeg gave it a bad pointer in the buffer header.
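
The arithmetic behind those numbers can be sketched as follows. The helper names are hypothetical; the only assumption taken from the explanation above is that the slice height is the frame height rounded up to a multiple of 16.

```c
/* Round v up to the next multiple of 16. */
int align16(int v)
{
    return (v + 15) & ~15;
}

/* Returns 1 if the frame height differs from the 16-aligned slice height,
 * meaning the frame has to be copied into a padded buffer. */
int height_needs_copy(int height)
{
    return align16(height) != height;
}
```

For the three resolutions in this thread, align16 gives 608 for 600, 1088 for 1080, and 720 for 720, so only 1280 x 720 skips the copy.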

tosiara commented 7 years ago

Does it only happen on the RPi? Is it reproducible on x86 with a USB webcam?

tosiara commented 7 years ago

Unfortunately, I can't test your patch on an OrangePi Zero with ffmpeg compiled with --enable-omx:

[1:ml1] [NTC] [EVT] event_new_video: Source FPS 9
[1:ml1] [ERR] [ENC] ffmpeg_set_codec: Could not open codec Encoder not found
[1:ml1] [ERR] [NET] ffmpeg_open: Failed to allocate codec!
[1:ml1] [ERR] [EVT] event_ffmpeg_newfile: ffopen_open error creating (new) file [/home/motion/01-20170817090721.mp4]
tosiara commented 7 years ago

Ok, I also had to add --enable-libx264, which was somehow missing. Now motion has been running fine for 10 minutes and recorded a valid mp4, with no lockup or any other issue. That was 1280x720, 10 fps, YUV, if that matters.

jasaw commented 7 years ago

I only see the problem on the Raspberry Pi.

tosiara commented 7 years ago

Ok, I have updated your initial report to say that the issue is only reproducible on Raspberry Pi.

jasaw commented 7 years ago

Lockup issue reported as https://github.com/raspberrypi/firmware/issues/851.

Mr-Dave commented 7 years ago

Closing, as the problem is upstream.