bluenviron / mediamtx

Ready-to-use SRT / WebRTC / RTSP / RTMP / LL-HLS media server and media proxy that allows to read, publish, proxy, record and playback video and audio streams.
MIT License

rpiCamera source: encoder_encode(): ioctl() failed #1133

Closed giano574 closed 1 year ago

giano574 commented 1 year ago

Which version are you using?

v0.20.0

Which operating system are you using?

Describe the issue

After a while, the message "encoder_encode(): ioctl() failed" appears and the stream stops working. rtsp-simple-server is still running, but the stream never recovers. Restarting rtsp-simple-server makes the stream work again until the next "ioctl() failed".

Describe how to replicate the issue

I don't think it is relevant that I am using HLS, but this is the configuration that I have used.

  1. Start with default config. Enable HLS and disable other protocols.
  2. Add the following path to the configuration file.
    stream:
      source: rpiCamera
      rpiCameraWidth: 1920
      rpiCameraHeight: 1080
      rpiCameraFPS: 30
      rpiCameraIDRPeriod: 10
      rpiCameraProfile: main
      rpiCameraLevel: '4.1'
      rpiCameraBitrate: 8000000
  3. Play the stream in a browser at http://<ip>:8888/stream.

Did you attach the server logs?

no

Did you attach a network dump?

no

giano574 commented 1 year ago

Looking at the code, I can see that it happens in encoder.c:310 when trying to enqueue a buffer. It would be nice if rtsp-simple-server tried to recover instead of just exiting, which results in the entire process needing to be restarted to start the stream again.

aler9 commented 1 year ago

Hello, I'm willing to add a recovery mechanism to the server. In my experience, though, ioctl errors appear only when the Raspberry Pi has power supply issues or a faulty link between the board and the camera. In that case a software recovery is useless and you have to reboot the device.

Are you sure that the error happens normally and is not related to faulty hardware?

giano574 commented 1 year ago

I can't rule out that it's hardware related, but I can say for sure that when it happens, restarting rtsp-simple-server works as a method of recovery. My best guess is that it is related to noise, since it seems to happen more often when multiple Raspberry Pis with cameras attached are placed next to each other.

In libcamera, ioctl() calls are wrapped in xioctl that retries 10 times. Maybe that's a possible (partial) solution? But I think rtsp-simple-server should also try to restart the camera process.
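
A minimal sketch of the kind of retry wrapper described above. The name xioctl_retry and the max_tries parameter are hypothetical, and retrying only when errno is EINTR is an assumption about which failures are worth repeating; the thread only says that the wrapper "retries 10 times". This is not the server's or libcamera's actual code.

```c
#include <assert.h>
#include <errno.h>
#include <sys/ioctl.h>

/*
 * Hypothetical helper: call ioctl() and repeat the call while it is
 * interrupted by a signal (EINTR), up to max_tries attempts in total.
 * Returns the result of the last ioctl() call; errno is left as that
 * call set it, so the caller can still inspect the failure reason.
 */
static int xioctl_retry(int fd, unsigned long request, void *arg, int max_tries)
{
    assert(max_tries > 0);

    int ret;
    do {
        ret = ioctl(fd, request, arg);
    } while (ret == -1 && errno == EINTR && --max_tries > 0);

    return ret;
}
```

A call site would then use xioctl_retry(fd, VIDIOC_QBUF, &buf, 10) where it previously called ioctl(fd, VIDIOC_QBUF, &buf). A wrapper like this can only help with transient failures; any other error is returned to the caller on the first attempt.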

giano574 commented 1 year ago

Example of restart of rtsp-simple-server working that happened right now:

pi@picam:~/rtsp-release $ ./rtsp-simple-server
[22:02:19.837038036] [3271]  INFO Camera camera_manager.cpp:293 libcamera v0.0.0+3730-67300b62
[22:02:19.877092398] [3272]  WARN CameraSensorProperties camera_sensor_properties.cpp:174 No static properties available for 'imx519'
[22:02:19.877138656] [3272]  WARN CameraSensorProperties camera_sensor_properties.cpp:176 Please consider updating the camera sensor properties database
[22:02:19.901427028] [3272]  WARN RPI raspberrypi.cpp:1274 Mismatch between Unicam and CamHelper for embedded data usage!
[22:02:19.901913845] [3272] ERROR DelayedControls delayed_controls.cpp:87 Delay request for control id 0x009a090a but control is not exposed by device /dev/v4l-subdev0
[22:02:19.902157170] [3272]  INFO RPI raspberrypi.cpp:1398 Registered camera /base/soc/i2c0mux/i2c@1/imx519@1a to Unicam device /dev/media3 and ISP device /dev/media0
[22:02:19.903009012] [3271]  INFO Camera camera.cpp:1029 configuring streams: (0) 1920x1080-YUV420 (1) 3840x2160-SGBRG10_CSI2P
[22:02:19.903360649] [3272]  INFO RPI raspberrypi.cpp:763 Sensor: /base/soc/i2c0mux/i2c@1/imx519@1a - Selected sensor format: 3840x2160-SGBRG10_1X10 - Selected unicam format: 3840x2160-pGAA
encoder_encode(): ioctl(VIDIOC_QBUF) failed
^Cpi@picam:~/rtsp-release $ ./rtsp-simple-server
[22:55:01.630182692] [3387]  INFO Camera camera_manager.cpp:293 libcamera v0.0.0+3730-67300b62
[22:55:01.673640553] [3388]  WARN CameraSensorProperties camera_sensor_properties.cpp:174 No static properties available for 'imx519'
[22:55:01.673687534] [3388]  WARN CameraSensorProperties camera_sensor_properties.cpp:176 Please consider updating the camera sensor properties database
[22:55:01.702473714] [3388]  WARN RPI raspberrypi.cpp:1274 Mismatch between Unicam and CamHelper for embedded data usage!
[22:55:01.702981453] [3388] ERROR DelayedControls delayed_controls.cpp:87 Delay request for control id 0x009a090a but control is not exposed by device /dev/v4l-subdev0
[22:55:01.703245489] [3388]  INFO RPI raspberrypi.cpp:1398 Registered camera /base/soc/i2c0mux/i2c@1/imx519@1a to Unicam device /dev/media3 and ISP device /dev/media0
[22:55:01.704083911] [3387]  INFO Camera camera.cpp:1029 configuring streams: (0) 1920x1080-YUV420 (1) 3840x2160-SGBRG10_CSI2P
[22:55:01.704494076] [3388]  INFO RPI raspberrypi.cpp:763 Sensor: /base/soc/i2c0mux/i2c@1/imx519@1a - Selected sensor format: 3840x2160-SGBRG10_1X10 - Selected unicam format: 3840x2160-pGAA

After "encoder_encode(): ioctl(VIDIOC_QBUF) failed" appeared, no more segments were added to stream.m3u8 (of course). After stopping and starting rtsp-simple-server, segments are produced as normal.

aler9 commented 1 year ago

I have also run into this issue at random; the problem is that it's really difficult to replicate. We can either increase buffer_count or add a wrapper around ioctl(). I'm trying to test both solutions.

aler9 commented 1 year ago

Since v0.20.2, the camera reader doesn't freeze anymore in case of ioctl() errors. If this doesn't solve the problem, feel free to reopen the issue.

giano574 commented 1 year ago

I think this creates a memory leak. I tried running it without exit() and when a lot of ioctl errors happened, memory usage exploded.

aler9 commented 1 year ago

@giano574 that ioctl() call tries to put a frame buffer into the encoder queue. It usually fails when the encoder queue is full or there's not enough memory. It doesn't allocate anything, and generally, there's nothing allocated dynamically in the camera module, everything is allocated once, during startup.

Memory usage probably explodes because the video is corrupt after the ioctl() failure, so HLS segments grow in size until they reach hlsSegmentMaxSize.

To debug memory leaks, I usually set pprof: yes in the configuration file and then generate a report of the memory usage with:

docker run --rm -it --network=host golang:1.15 go tool pprof -text http://localhost:9999/debug/pprof/heap

giano574 commented 1 year ago

I haven't been able to reproduce it lately, and I also don't know what to look for in the pprof reports. Do you think anything can be done to keep the memory usage from increasing when this happens?

aler9 commented 1 year ago

yes, decrease hlsSegmentMaxSize

giano574 commented 1 year ago

Maybe I don't understand, but why does the segment grow in size when queueing fails? Should the frames that couldn't be encoded not just be ignored?

And what will happen if hlsSegmentMaxSize is exceeded? Will the segment be completed? Discarded? Something else?

aler9 commented 1 year ago

Segments grow in size because they are split at IDR frames, since a segment must contain at least one IDR frame. If the stream gets corrupted, IDR frames are lost and segments grow until they reach hlsSegmentMaxSize, at which point they are forcefully completed.
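
The splitting rule described in this comment can be sketched as a small decision function. This is an illustration of the description only, not the server's implementation: the function name and parameters are hypothetical, max_size stands in for hlsSegmentMaxSize, and the sketch ignores any duration-based rules a real muxer may apply.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical decision function, for illustration only.
 * Returns true if the current segment should be completed before the
 * incoming frame is appended.
 *   cur_size      bytes already in the current segment
 *   frame_size    size of the incoming frame in bytes
 *   frame_is_idr  whether the incoming frame is an IDR frame
 *   max_size      stand-in for hlsSegmentMaxSize, in bytes
 */
static bool should_complete_segment(size_t cur_size, size_t frame_size,
                                    bool frame_is_idr, size_t max_size)
{
    assert(max_size > 0);

    /* Normal case: an IDR frame starts a new segment, so the non-empty
       current segment is completed first. */
    if (frame_is_idr && cur_size > 0)
        return true;

    /* Fallback: no IDR frame arrived and the segment kept growing, so it
       is completed forcefully once the size limit would be exceeded. */
    if (cur_size > 0 && cur_size + frame_size > max_size)
        return true;

    return false;
}
```

With a healthy stream the first branch fires at every IDR frame and segments stay small. When IDR frames are lost, only the second branch remains, which is why segments can grow up to the size limit.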

giano574 commented 1 year ago

Ah, I see. So the actual frames that aren't encoded are just lost, but then no new IDRs are added to the segment? So non-IDR frames are dequeued and put into the segment until the maximum size is reached? Does this mean that no new IDR frames will ever be generated again, or can/does it recover?

giano574 commented 1 year ago

If I set hlsSegmentMaxSize and hlsSegmentCount so that hlsSegmentMaxSize * (hlsSegmentCount + 1) < available RAM - some margin then it shouldn't be possible to exhaust the RAM, correct?
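
The inequality proposed above can be checked with a few lines of arithmetic. The function below only restates that inequality; the name hls_ram_bound_holds, the MIB macro and all numbers in the usage note are hypothetical, and nothing in this thread confirms that the inequality actually caps the server's memory usage.

```c
#include <stdbool.h>
#include <stdint.h>

#define MIB (1024ULL * 1024ULL)

/*
 * Hypothetical check of the proposed bound:
 *   segment_max_size * (segment_count + 1) < available_ram - margin
 * All sizes are in bytes. Returns true if the inequality holds.
 */
static bool hls_ram_bound_holds(uint64_t segment_max_size, uint64_t segment_count,
                                uint64_t available_ram, uint64_t margin)
{
    /* Nothing is left for segments once the margin is subtracted. */
    if (margin >= available_ram)
        return false;
    uint64_t budget = available_ram - margin;

    /* Zero-sized segments trivially fit into a positive budget. */
    if (segment_max_size == 0)
        return true;

    /* Guard against wrap-around in segment_count + 1 and in the product. */
    if (segment_count == UINT64_MAX)
        return false;
    uint64_t slots = segment_count + 1;
    if (slots > UINT64_MAX / segment_max_size)
        return false;

    return segment_max_size * slots < budget;
}
```

As an illustration with made-up values: a 50 MiB segment limit and 7 segments give 50 * 8 = 400 MiB, so hls_ram_bound_holds(50 * MIB, 7, 512 * MIB, 64 * MIB) holds (400 < 448), while the same settings with 256 MiB of RAM do not (400 is not less than 192).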

yllekz commented 1 year ago

This is happening to me with a Pi camera as well. I'm using almost entirely default settings for the Raspberry Pi Camera other than setting rpiCameraVFlip to true (which doesn't currently work)

alexfornuto commented 1 year ago

This happens to me on my Pi Zero 1 without HLS, but I get that it's a low-memory device. In order to try to adjust, I'd like to ask: is it system or GPU RAM that affects this issue? Based on the answer, I'll reduce/expand the memory allocated to GPU to try to resolve.

aler9 commented 1 year ago

@alexfornuto in my experience it's the system RAM. The camera also needs GPU RAM, but in a fixed quantity; a quick search will turn up the minimum GPU memory needed to start the camera. Use that and give the remaining RAM to the system.

github-actions[bot] commented 9 months ago

This issue is being locked automatically because it has been closed for more than 6 months. Please open a new issue in case you encounter a similar problem.