raspberrypi / picamera2

New libcamera based python library
BSD 2-Clause "Simplified" License
878 stars 184 forks source link

[BUG] Unable to re-open/start camera once stop or close is called #130

Closed apage224 closed 2 years ago

apage224 commented 2 years ago

Describe the bug I am unable to cleanly stop or close the camera connection. Once I do any combination of stop or close, I get an error on subsequent opens. I have to reboot in order to recover.

To Reproduce

Below is a slimmed down version based on the capture_headless.py example:

from picamera2 import Picamera2

picam2 = Picamera2()
config = picam2.still_configuration()
picam2.configure(config)

picam2.start() 
picam2.capture_file("demo1.jpg")
picam2.stop()

picam2.start() # <- This will raise an exception
picam2.capture_file("demo2.jpg")
picam2.stop()

Expected behaviour To cleanly stop and close camera connection.

Console Output, Screenshots

Below I've included the stack trace. If I try to restart Python and create new Picamera instance I get a similar error when trying to configure ( picam2.configure(config) ) getting error ERROR RPI raspberrypi.cpp:873 Failed to set format on ISP Output1: -22. It seems like comm. to the camera isn't working?


[0:06:31.111655590] [74]  INFO Camera camera_manager.cpp:293 libcamera v0.0.0+3544-22656360
[0:06:31.145453871] [92]  WARN CameraSensorProperties camera_sensor_properties.cpp:163 No static properties available for 'imx477'
[0:06:31.145495092] [92]  WARN CameraSensorProperties camera_sensor_properties.cpp:165 Please consider updating the camera sensor properties database
[0:06:31.145516166] [92] ERROR CameraSensor camera_sensor.cpp:591 'imx477 10-001a': Camera sensor does not support test pattern modes.
[0:06:31.172111369] [92]  INFO RPI raspberrypi.cpp:1356 Registered camera /base/soc/i2c0mux/i2c@1/imx477@1a to Unicam device /dev/media0 and ISP device /dev/media1
[0:06:31.176780565] [74]  INFO Camera camera.cpp:1029 configuring streams: (0) 4032x3040-BGR888
[0:06:31.177196113] [92]  INFO RPI raspberrypi.cpp:760 Sensor: /base/soc/i2c0mux/i2c@1/imx477@1a - Selected sensor format: 4056x3040-SBGGR12_1X12 - Selected unicam format: 4056x3040-pBCC
[0:06:33.696041744] [92] ERROR V4L2 v4l2_videodevice.cpp:1852 /dev/video15[70:cap]: Failed to start streaming: Invalid argument
2022-05-20T19:46:55.308Z | ERROR    | Camera did not start properly.
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
Input In [2], in <cell line: 9>()
      6 picam2.capture_file("demo1.jpg")
      7 picam2.stop()
----> 9 picam2.start()
     10 picam2.capture_file("demo2.jpg")
     11 picam2.stop()

File /usr/local/lib/python3.9/dist-packages/picamera2/picamera2.py:570, in Picamera2.start(self, event_loop)
    568 if event_loop and not self.have_event_loop:
    569     self.start_preview(Preview.NULL)
--> 570 self.start_()

File /usr/local/lib/python3.9/dist-packages/picamera2/picamera2.py:551, in Picamera2.start_(self)
    549 else:
    550     self.log.error("Camera did not start properly.")
--> 551     raise RuntimeError("Camera did not start properly.")

RuntimeError: Camera did not start properly.

Hardware : Raspberry Pi 4 B w/ 8GB connected to Raspberry HQ Camera (IMX477).

davidplowman commented 2 years ago

Hi, thanks for reporting this. I've tried to reproduce the problem on a clean Bullseye install (both 32 and 64 bit) but so far without success. Perhaps we can investigate a few more things:

I've certainly seen cameras fail to start when it runs out of CMA memory. This heap is prone to fragmentation under some circumstances, though normally I believe the error message does suggest this. It might be worth trying to increase the amount of CMA that you have. To do this go to your /boot/config.txt, find the line that says dtoverlay=vc4-kms-v3d and replace it by (exactly as given) dtoverlay=vc4-kms-v3d,cma-384 (and you can check it worked after rebooting with grep Cma /proc/meminfo). But as I've said, I'm not quite convinced that's the story here.

Thanks!

apage224 commented 2 years ago

Thanks for detailed follow up. It's most likely some driver / OS issue on my end. I wanted a sanity check before digging further.

I am running Balena OS (64-bit yocto based) and running everything inside of a Docker container. Updating everything to the latest version(s), if I do both a stop() and close() followed by re-instantiating (Picamera2() ) everything works fine. I ran entire process 1,000 times w/o hiccup. However, if I simply do a stop() then start() I get the same issue as above (immediately after boot or an hour after).

Follow up answers

Are you using an up-to-date Bullseye? (Maybe report the output of uname -a and vcgencmd version.)

The base docker image is pointing to Bullseye 11.2. I'll see if I can find an 11.3 version.

uname -a

Linux f6110a14c5d3 5.10.95-v8 #1 SMP PREEMPT Thu Feb 17 11:43:01 UTC 2022 aarch64 GNU/Linux

vcgencmd version

Jan 20 2022 13:57:40 
Copyright (c) 2012 Broadcom
version bd88f66f8952d34e4e0613a85c7a6d3da49e13e2 (clean) (release) (start_x)

Are there any changes in your /boot/config.txt from the default settings?

cat /boot/config.txt

start_x=1
dtoverlay=vc4-kms-v3d,cma-384
dtoverlay=imx477
dtparam=i2c_arm=on
dtparam=spi=on
dtparam=audio=on
gpu_mem=128
arm_64bit=1
avoid_warnings=1
disable_splash=1

Is there anything else significant running when the failures occur?

I do have chromium running directly to frame buffer as the goal is to run this as a standalone kiosk. There was no difference when repeated with this chromium Docker container stopped and removed. Otherwise not much is running other than Jupyter environment for testing out Picamera2.

If possible, it would be interesting to see the output of cat /proc/meminfo before you run the failing script.

Before running the script I get (grep Cma /proc/meminfo):

CmaTotal:         393216 kB
CmaFree:          382196 kB

After running the script I get the following (seems like adequate memory is left for camera):

CmaTotal:         393216 kB
CmaFree:          231260 kB

When the problem occurs, could you look at the bottom of the output from dmesg...

Here is contents of dmesg cleared right before running script.

[ 1010.765820] bcm2835_mmal_vchiq: buffer_from_host: msg_context not allocated, buf 00000000ca05209d
[ 1010.765844] bcm2835-isp bcm2835-isp: bcm2835_isp_node_start_streaming: Failed enabling port, ret -22
[ 1010.765919] ------------[ cut here ]------------
[ 1010.765945] WARNING: CPU: 1 PID: 4556 at drivers/media/common/videobuf2/videobuf2-core.c:1548 vb2_start_streaming+0xd4/0x130 [videobuf2_common]
[ 1010.765950] Modules linked in: ip6t_REJECT nf_reject_ipv6 ip6table_filter ip6_tables sg ipt_REJECT nf_reject_ipv4 xt_state nf_nat_h323 nf_conntrack_h323 nf_nat_pptp nf_conntrack_pptp nf_nat_tftp nf_conntrack_tftp nf_nat_sip nf_conntrack_sip nf_nat_irc nf_conntrack_irc cmac nf_nat_ftp nf_conntrack_ftp algif_hash aes_arm64 algif_skcipher af_alg bnep hci_uart btbcm bluetooth ecdh_generic ecc xt_MASQUERADE nf_conntrack_netlink nfnetlink br_netfilter xt_owner i2c_dev imx477 spidev brcmfmac brcmutil v3d cfg80211 gpu_sched i2c_mux_pinctrl raspberrypi_hwmon i2c_mux i2c_brcmstb rfkill bcm2835_unicam v4l2_dv_timings v4l2_fwnode spi_bcm2835 bcm2835_v4l2(C) bcm2835_codec(C) bcm2835_isp(C) bcm2835_mmal_vchiq(C) v4l2_mem2mem videobuf2_vmalloc videobuf2_dma_contig videobuf2_memops videobuf2_v4l2 videobuf2_common videodev mc vc_sm_cma(C) snd_bcm2835(C) rpivid_mem uio_pdrv_genirq uio nvmem_rmem sch_fq_codel fuse
[ 1010.766212] CPU: 1 PID: 4556 Comm: python3 Tainted: G         C        5.10.95-v8 #1
[ 1010.766217] Hardware name: Raspberry Pi 4 Model B Rev 1.4 (DT)
[ 1010.766226] pstate: 60000005 (nZCv daif -PAN -UAO -TCO BTYPE=--)
[ 1010.766239] pc : vb2_start_streaming+0xd4/0x130 [videobuf2_common]
[ 1010.766251] lr : vb2_start_streaming+0x8c/0x130 [videobuf2_common]
[ 1010.766255] sp : ffffffc015da3b30
[ 1010.766260] x29: ffffffc015da3b30 x28: ffffffc008c61858 
[ 1010.766272] x27: ffffff8121456700 x26: 0000000000000000 
[ 1010.766283] x25: ffffff8101869680 x24: 0000000000000001 
[ 1010.766293] x23: ffffffc015da3d18 x22: ffffffc008bc2f00 
[ 1010.766304] x21: ffffff8101869e48 x20: 00000000ffffffea 
[ 1010.766315] x19: ffffff8101869c70 x18: 0000000000000000 
[ 1010.766325] x17: 0000000000000000 x16: 0000000000000000 
[ 1010.766335] x15: 000000000000000a x14: 3338326d6362203a 
[ 1010.766346] x13: 7073692d35333832 x12: 74726174735f6564 
[ 1010.766356] x11: 6f6e5f7073695f35 x10: 6c696146203a676e 
[ 1010.766367] x9 : ffffffc0100d65d0 x8 : 74726f7020676e69 
[ 1010.766378] x7 : 6c62616e65206465 x6 : ffffffc011f84610 
[ 1010.766388] x5 : c0000000ffffefff x4 : 0000000000000000 
[ 1010.766399] x3 : 0000000000000000 x2 : 0000000000000000 
[ 1010.766409] x1 : ffffff8102db8000 x0 : 0000000000000001 
[ 1010.766420] Call trace:
[ 1010.766433]  vb2_start_streaming+0xd4/0x130 [videobuf2_common]
[ 1010.766445]  vb2_core_streamon+0x180/0x18c [videobuf2_common]
[ 1010.766457]  vb2_streamon+0x70/0x78 [videobuf2_v4l2]
[ 1010.766467]  vb2_ioctl_streamon+0x50/0x6c [videobuf2_v4l2]
[ 1010.766504]  v4l_streamon+0x30/0x3c [videodev]
[ 1010.766533]  __video_do_ioctl+0x258/0x384 [videodev]
[ 1010.766562]  video_usercopy+0x218/0x4a4 [videodev]
[ 1010.766590]  video_ioctl2+0x24/0x3c [videodev]
[ 1010.766618]  v4l2_ioctl+0x54/0x7c [videodev]
[ 1010.766631]  vfs_ioctl+0x34/0x54
[ 1010.766638]  __arm64_sys_ioctl+0x84/0xdc
[ 1010.766646]  el0_svc_common.constprop.0+0x134/0x1f0
[ 1010.766652]  do_el0_svc+0x58/0x88
[ 1010.766659]  el0_svc+0x20/0x30
[ 1010.766665]  el0_sync_handler+0xbc/0x158
[ 1010.766672]  el0_sync+0x180/0x1c0
[ 1010.766678] ---[ end trace 72eecb4205199119 ]---

In your script, if you replace still_configuration by preview_configuration, does that still fail? That would rule out (or not) big memory allocations.

I tested this and still have same issue. If I stop then start I see similar issue. Only fully closing works.

If you are able to (I realise this takes a bit of time and relies on having some spare SD cards around), could you try a fresh Bullseye image too?

I will look to do this as a next step. I will run everything straight up on a fresh Bullseye image. If it works, then I'll run my same Docker image. That should hopefully narrow it down.

davidplowman commented 2 years ago

Hi, is there still anything we can help you with here? If there is please let us know, otherwise I'll consider closing this issue shortly.

davidplowman commented 2 years ago

I'm going to close this issue for now. If something else comes up, or there's something more to look at, please feel free to open a new issue. Thanks!

optimuspaul commented 2 years ago

@apage224 can you share you DockerFIle? I've been trying to get it installed in a container with no luck.

tmihajloski commented 1 year ago

Hi,

I am experiencing this issue as well. I have a python script that streams to an RTSP server from the camera for 10s then closes the camera. When I try to run the script again I get the error: ERROR V4L2 v4l2_videodevice.cpp:1047 /dev/video13[18:out]: Unable to set format: Invalid argument

I can resolve this error by rebooting the Pi.

Here is my script:

#!/usr/bin/python3

import socket
import time
from pprint import *
from picamera2 import Picamera2
from picamera2.encoders import H264Encoder
from picamera2.encoders import MJPEGEncoder
from picamera2.outputs import FfmpegOutput
from picamera2.outputs import FileOutput
from libcamera import controls

picam2 = Picamera2()

try:
    video_config = picam2.create_video_configuration({"size": (1920, 1080)})

    picam2.configure(video_config)
    picam2.set_controls({"AfMode": controls.AfModeEnum.Continuous})
    encoder = H264Encoder(bitrate=1000000, repeat=True, iperiod=15)
    #encoder = MJPEGEncoder()

    output = FfmpegOutput("-f rtsp -rtsp_transport tcp rtsp://localhost:8554/live.stream")

    print("start streaming")
    picam2.start_encoder(encoder=encoder, output=output)
    picam2.start()
    time.sleep(10)

finally:
    print("closing camera")
    picam2.stop()
    picam2.stop_encoder()
    picam2.close()
davidplowman commented 1 year ago

Hi, I'm afraid I don't have any kind of RTSP server installed so I don't seem to be able to try your example. However, I've tried outputting to a simple udp stream instead, and I am able to restart the script without problems. Perhaps you could try some investigations:

Thanks!

tmihajloski commented 1 year ago

Thanks @davidplowman

I used the MediaMTX server, which works out of the box (https://github.com/bluenviron/mediamtx/releases).

Here is the normal output

[0:01:50.661178358] [2230]  INFO Camera camera_manager.cpp:297 libcamera v0.0.5+83-bde9b04f
[0:01:50.783340397] [2231]  INFO RPI vc4.cpp:437 Registered camera /base/soc/i2c0mux/i2c@1/imx708@1a to Unicam device /dev/media1 and ISP device /dev/media2
[0:01:50.783436559] [2231]  INFO RPI pipeline_base.cpp:1101 Using configuration file '/usr/share/libcamera/pipeline/rpi/vc4/rpi_apps.yaml'
[0:01:50.794932359] [2230]  INFO Camera camera.cpp:1033 configuring streams: (0) 1920x1080-XBGR8888 (1) 2304x1296-SBGGR10_CSI2P
[0:01:50.795580257] [2231]  INFO RPI vc4.cpp:565 Sensor: /base/soc/i2c0mux/i2c@1/imx708@1a - Selected sensor format: 2304x1296-SBGGR10_1X10 - Selected unicam format: 2304x1296-pBAA
start streaming
closing camera

Here is the failed output

[0:02:27.324638307] [2379]  INFO Camera camera_manager.cpp:297 libcamera v0.0.5+83-bde9b04f
[0:02:27.477156489] [2380]  INFO RPI vc4.cpp:437 Registered camera /base/soc/i2c0mux/i2c@1/imx708@1a to Unicam device /dev/media1 and ISP device /dev/media2
[0:02:27.477726746] [2380]  INFO RPI pipeline_base.cpp:1101 Using configuration file '/usr/share/libcamera/pipeline/rpi/vc4/rpi_apps.yaml'
[0:02:27.489750942] [2379]  INFO Camera camera.cpp:1033 configuring streams: (0) 1920x1080-XBGR8888 (1) 2304x1296-SBGGR10_CSI2P
[0:02:30.492098441] [2380] ERROR V4L2 v4l2_videodevice.cpp:1047 /dev/video13[18:out]: Unable to set format: Invalid argument

dmesg.log

davidplowman commented 1 year ago

I don't actually see any difference between the good/bad outputs, is that right? Though there's clearly trouble in the kernel log.

Can you say what is actually happening, is the FFmpeg process failing with this leaving things in a bad state? Does catching and ignoring the error make the problems go away?

tmihajloski commented 1 year ago

Sorry about that I updated the post with the correct logs. Catching and ignoring the error does not resolve problem. I added some more print statements in the script and it seems that stopping the encoder takes longer after which I get the error. I dumped the FFMPEG logs but I dont see any issues there.

davidplowman commented 1 year ago

Thanks for updating the log. It seems like it must be the V4L2 kernel crash that stops the camera from re-opening - it's not that anyone is still holding on to the camera.

I'm struggling at the moment to see how that could happen. If FFmpeg dies and that thread crashes, I can see that buffers don't get recycled properly and maybe V4L2 could get unhappy. Though a kernel crash seems worse than I would have expected. But if you catch and ignore the error - so the thread keeps running and the only difference is that it doesn't get sent out - how can it have a bad effect like that? Have you also fixed the code so that the stop() function won't try to do anything bad?

Out of interest, what is the error? Is it a broken pipe trying to write to ffmpeg.stdin? Are you able to tell if the kernel crash happens immediately, or only when you try to stop the system?

tmihajloski commented 1 year ago

I printed the script events in the syslog as well. Attached is the syslog snapshot where the crash occurs syslog.log.

It seems to happen after the encoder_stop() call.

I added the finally block originally to work around this issue, thinking that if something fails it would get the chance to stop and close everything gracefully.

Below is the code that I used for testing.

#!/usr/bin/python3

import socket
import time
from pprint import *
from picamera2 import Picamera2
from picamera2.encoders import H264Encoder
from picamera2.encoders import MJPEGEncoder
from picamera2.outputs import FfmpegOutput
from picamera2.outputs import FileOutput
from libcamera import controls
import logging
import logging.handlers

logger = logging.getLogger("CameraLogger")
logger.setLevel(logging.DEBUG)

log_handler = logging.handlers.SysLogHandler(address= '/dev/log')

logger.addHandler(log_handler)

logger.debug('Starting rtsp_camera.py')
picam2 = Picamera2()

try:
    video_config = picam2.create_video_configuration({"size": (1920, 1080)})

    logger.debug('Configuring the camera')
    picam2.configure(video_config)
    picam2.set_controls({"AfMode": controls.AfModeEnum.Continuous})
    encoder = H264Encoder(bitrate=800000, repeat=True, iperiod=15)

    #encoder = MJPEGEncoder()

    output = FfmpegOutput("-report -f rtsp -rtsp_transport tcp rtsp://localhost:8554/live.stream")

    #output = FfmpegOutput("-f mpegts udp://127.0.0.1:1234/")

    print ("start streaming")
    logger.debug('Starting the encoder')
    picam2.start_encoder(encoder=encoder, output=output)
    logger.debug('Starting the camera')
    picam2.start()
    logger.debug('Waiting to stream')
    time.sleep(120)
    logger.debug('Waiting done')

finally:
    print("stopping camera")
    logger.debug('stopping the camera')    
    picam2.stop()

    print("stopping encoder")
    logger.debug('stopping the encoder')
    picam2.stop_encoder()    

    print("closing camera")
    logger.debug('closing the camera')
    picam2.close()

    print("camera closed")
    logger.debug('rtsp_camera.py done!')
davidplowman commented 1 year ago

I'm still struggling a bit to understand how FFmpeg disappearing spontaneously can affect the rest of the code if you're catching and ignoring the error. When you do this, are you ensuring that the call to stop doesn't subsequently do anything, given that there's nothing to stop?

tvoverbeek commented 1 year ago

@davidplowman Did you look at the ayalog extract? There are messages about a crash in the video-driver:

Oct  4 14:31:22 raspi-zero kernel: [  207.816523] videobuf2_common: driver bug: stop_streaming operation is leaving buf 0000000005ae5823 in active state
Oct  4 14:31:22 raspi-zero kernel: [  207.816534] videobuf2_common: driver bug: stop_streaming operation is leaving buf 00000000eb3446cd in active state
Oct  4 14:31:22 raspi-zero kernel: [  207.816542] videobuf2_common: driver bug: stop_streaming operation is leaving buf 0000000099f8a9f8 in active state
Oct  4 14:31:22 raspi-zero kernel: [  207.816549] videobuf2_common: driver bug: stop_streaming operation is leaving buf 00000000fbbe929a in active state

The first run already leavea some buffers in the active state. The stack trace might also give some clues

davidplowman commented 1 year ago

Hi @tvoverbeek, thanks for replying!

Yes, I'm altogether a bit confused by the whole thing. It's the encoder which is being left in an unhappy state when that thread that dequeues its buffers crashes. It complains that its V4L2 queue hasn't been drained when it is stopped, which is fair enough, but that this should lead to a kernel crash seems a bit beyond the pail.

Even so, wrapping the failure to write to FFmpeg (because it seems to have died spontaneously?) really ought to make the error go away. The buffers will all get dequeued in the normal way by that thread, and this is all known to work because it's what we do all the time.

So why is it crashing in this case? Is it something else that is now misbehaving? I don't know. Maybe inserting some code to make the FfmpegOutput crash after (for example) 100 frames would make the error reproducible so that it could be investigated.

Someone else recently had a very similar problem, but for them catching the exception did seem to work, so I was intending to add that change. Yet I don't really see why this case would be different. Though things are also undergoing a bit of a Pi 5 / Bookworm upheaval at the moment, which just compounds the fun...

tmihajloski commented 1 year ago

Here is the ffmpeg log when It crashes. I have a 10s wait to stream, and I can see that ffmpeg stops streaming within a few seconds of starting. ffmpeg-20231005-091753.log