raspberrypi / rpicam-apps

Images from libcamera-still occasionally show up with purple color added and then wash out #202

Closed knutegit closed 2 years ago

knutegit commented 2 years ago

I have a Pi3 taking photos with libcamera-still and sending them continuously to STDOUT. Once in a while I get a purple image followed by a washed out image and then it goes back to normal exposure. The other day it took a bunch of photos and got more and more washed out until the photo was just white and stayed that way until I restarted libcamera-still. I have recompiled libcamera, libepoxy and libcamera-apps on the Pi3 that is having the problems. Examples of the photos here: http://knutejohnson.com/libcamera.html

I just stopped the libcamera-still to get the version and the camera model and now it gives me this:

pi@camerapi:~/libcamera $ /usr/local/bin/libcamera-still
ERROR: Invalid metering mode:
Segmentation fault
pi@camerapi:~/libcamera $ libcamera-still --version
ERROR: unrecognised option '--version'
Segmentation fault

I give up. I'm just going to go back to the legacy camera stack.

davidplowman commented 2 years ago

Hi, sorry to hear about these difficulties. As you know, we've not had any luck reproducing these problems. We discussed previously that if you were able to post an image online somewhere, I'd be happy to give that a try. Perhaps I should try a Pi 3B version, as there might be less variation in terms of hardware. If you had time to do that we'd certainly pick that up and have another go, though we understand if it's not possible at the moment. Thanks!

knutegit commented 2 years ago

I can reproduce the problem pretty consistently. It only happens on my Pi3B+; I've tried two and both show it. I posted a video of it at http://knutejohnson.com/libcamera.mp4. There is also a page with some stills, captured with another larger program, that shows a progression of deteriorating images at http://knutejohnson.com/libcamera.html. The video was made today with freshly compiled libcamera, libepoxy and libcamera-apps.

davidplowman commented 2 years ago

Thanks for the update. I'd be interested to see the exif information for the images that go wrong and then perhaps the "ok" ones either side. If you're capturing DNGs, the exif for those would be good too.

I'll see if I have time today to fish out a Pi 3B and try it for myself on that. Can you just remind me of the exact command line you were using? Also, I recall it was the ov5647 (v1) camera, is that right? Can you also repeat your estimate of the "mean number of captures to bad image"? Thanks.
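For anyone wanting to report that "mean number of captures to bad image" figure consistently, here is a minimal sketch of one way to compute it from a log of which capture numbers came out bad. The function names and the example numbers are made up for illustration; nothing here comes from libcamera itself.

```python
def gaps_between_bad(bad_indices):
    """Return the number of captures between consecutive bad frames."""
    ordered = sorted(bad_indices)
    return [later - earlier for earlier, later in zip(ordered, ordered[1:])]


def mean_captures_per_bad(total_captures, bad_indices):
    """Return the average number of captures per bad frame, or None if there were none."""
    if not bad_indices:
        return None
    return total_captures / len(bad_indices)


# Hypothetical log: 900 captures, with frames 100, 400 and 900 judged bad by eye.
print(gaps_between_bad([100, 400, 900]))            # [300, 500]
print(mean_captures_per_bad(900, [100, 400, 900]))  # 300.0
```

Reporting the individual gaps alongside the mean helps show whether failures are evenly spread or come in bursts.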

naushir commented 2 years ago

There is a possibility this is due to the CSI2 bus dropping some packets during transmission. Do you have any custom clock frequencies set in /boot/config.txt? Also, there had been a fix for a clocking bug that had gone in a month or two ago. Is your software and VC firmware up-to-date?

knutegit commented 2 years ago

To answer naushir's question: No changes to the clocks. libcamera, libepoxy and libcamera-apps were downloaded and compiled yesterday just before I posted. I had to do that because every time you do an apt full-upgrade, libcamera-still segfaults and the only way I've been able to fix that is to recompile it. Software is up to date. I don't know how to check firmware on a Pi3 but I assume it is up to date too.

For davidplowman:

libcamera-still --nopreview --thumb none --timeout 0 --timelapse 1 --width 640 --height 480 --quality 80 --output -
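Since this command writes one JPEG after another to stdout, whatever reads the pipe has to find the image boundaries itself. The attached Java code is not reproduced in this thread, so the following is only a rough Python sketch of one common approach: scan for the JPEG start-of-image (FF D8) and end-of-image (FF D9) markers. It assumes no embedded thumbnails (the command uses --thumb none) and that no metadata segment happens to contain those byte pairs; the function name is hypothetical.

```python
SOI = b"\xff\xd8"  # JPEG start-of-image marker
EOI = b"\xff\xd9"  # JPEG end-of-image marker


def split_jpeg_stream(chunks):
    """Yield complete JPEG images from an iterable of byte chunks."""
    buffer = bytearray()
    for chunk in chunks:
        buffer.extend(chunk)
        while True:
            start = buffer.find(SOI)
            if start < 0:
                # No image start yet; keep the last byte in case the
                # marker is split across two chunks.
                del buffer[:-1]
                break
            end = buffer.find(EOI, start + 2)
            if end < 0:
                # Image is incomplete; drop anything before its start and wait.
                del buffer[:start]
                break
            yield bytes(buffer[start:end + 2])
            del buffer[:end + 2]
```

In practice the chunks could come from reading the stdout pipe of a subprocess running the command above in fixed-size blocks, with each yielded image then decoded or saved by the caller.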

This is a rev 2.1 camera.

pi@raspberrypi:~/bin $ libcamera-still --list-cameras
[0:08:14.785400192] [3654] INFO Camera camera_manager.cpp:293 libcamera v0.0.0+3381-1db1e31e
[0:08:14.831654547] [3655] WARN RPI raspberrypi.cpp:1145 Mismatch between Unicam and CamHelper for embedded data usage!
[0:08:14.832789994] [3655] INFO RPI raspberrypi.cpp:1256 Registered camera /base/soc/i2c0mux/i2c@1/imx219@10 to Unicam device /dev/media3 and ISP device /dev/media0
Available cameras
0 : imx219 [3280x2464] (/base/soc/i2c0mux/i2c@1/imx219@10)
    Modes: 'SBGGR10_CSI2P' : 640x480 1640x1232 1920x1080 3280x2464
           'SBGGR8' : 640x480 1640x1232 1920x1080 3280x2464

image77 image78 image79

Image 78 must have some issues because it shows up on the desktop with an icon instead of an image. I'm attaching the code I used to create these images. I've been running it with Java 11 from the repository.

Camera.zip

Just so you know, this is only with libcamera, the legacy camera stack works just fine and I've been running a Java motion detection program with basically this same code for over a year with rare problems.

naushir commented 2 years ago

> I don't know how to check firmware on a Pi3 but I assume it is up to date too.

You can get the firmware version by running vcgencmd version

naushir commented 2 years ago

It would be useful to also know if you can see the problem if you add the following:

core_freq_min=350

to /boot/config.txt and rerun your program.
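When comparing setups it can help to list exactly which clock-related settings a given config.txt carries. The sketch below is a hedged illustration only: the set of keys is a hand-picked selection of config.txt options, the function name is made up, and it ignores conditional sections such as [pi3], so treat its output as a starting point rather than the effective configuration.

```python
# Hand-picked clock-related config.txt keys; extend as needed.
CLOCK_KEYS = {
    "core_freq", "core_freq_min", "arm_freq", "arm_freq_min",
    "gpu_freq", "sdram_freq", "force_turbo", "over_voltage",
}


def clock_overrides(config_text):
    """Return {key: value} for clock-related settings found in config.txt text."""
    overrides = {}
    for raw_line in config_text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments
        if "=" not in line:
            continue
        key, value = (part.strip() for part in line.split("=", 1))
        if key in CLOCK_KEYS:
            overrides[key] = value  # a later line overrides an earlier one
    return overrides


sample = "# test settings\ncore_freq_min=350\ndtparam=audio=on\n"
print(clock_overrides(sample))  # {'core_freq_min': '350'}
```

To check a real system, pass in the contents of /boot/config.txt read as text.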

knutegit commented 2 years ago

pi@raspberrypi:~/bin $ vcgencmd version
Dec 22 2021 14:25:43
Copyright (c) 2012 Broadcom
version 720889ee7c970afe516868d20515a73892f9c127 (clean) (release) (start)

I just did an apt update, apt full-upgrade and now get:

pi@raspberrypi:~/bin $ libcamera-still --list-cameras
ERROR: unrecognised option '--list-cameras'
Segmentation fault

So I'll report on the core_freq_min=350 setting when I get it running again.

naushir commented 2 years ago

> pi@raspberrypi:~/bin $ vcgencmd version
> Dec 22 2021 14:25:43
> Copyright (c) 2012 Broadcom
> version 720889ee7c970afe516868d20515a73892f9c127 (clean) (release) (start)

This firmware version should have the clock fixes, so it is likely that the core_freq_min=350 would not do anything to fix your problem. Still it would be worth trying...

davidplowman commented 2 years ago

I've tried running this on a Pi 3B and have discovered a couple of things. For the record, here are the version numbers:

pi@raspberrypi:~ $ uname -a
Linux raspberrypi 5.10.89-v7+ #1508 SMP Tue Jan 4 19:51:16 GMT 2022 armv7l GNU/Linux
pi@raspberrypi:~ $ vcgencmd version
Jan  4 2022 18:10:52 
Copyright (c) 2012 Broadcom
version 89012f19c33b91cbe9df3933e2a21375bb4f0274 (clean) (release) (start)
pi@raspberrypi:~ $ libcamera-still --version
libcamera-apps build: bb59fb4e0e1c-intree 05-01-2022 (10:23:41)
libcamera build: v0.0.0+3381-1db1e31e

Firstly, the segfaults are to do with the runtime linker. Once you've built your own executables and libraries, the command which libcamera-still should report /usr/local/bin/libcamera-still, and the command ldd /usr/local/bin/libcamera-still should give you a long list of libraries. On the 3rd line I have:

    libcamera_app.so => /usr/local/lib/libcamera_app.so (0x76f27000)

That's all good. But what I discovered is that after I installed another package (sudo apt install exiftool in my case) the linked library changed away from /usr/local/lib to /usr/lib/arm-linux-gnueabihf. That is, the runtime linker was now loading the originally installed library, not the locally built one. They're not binary compatible so it crashes. Moreover, this seems to happen with any package installation.

This behaviour completely surprises me, I have no idea why it does that. In the short term it can be fixed by doing sudo ldconfig /usr/local/lib again, but I've never run into this before, or perhaps it's just the way I work and build stuff. But it certainly seems like there's something to investigate or at least understand there.
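A quick way to tell which copy the runtime linker picked is to look at the ldd line for libcamera_app.so. Here is a small sketch that parses ldd output and reports whether the library resolves under /usr/local/lib. The function names are made up, and the second sample line (path and load address) is an assumed illustration of the broken state, not output captured from a real system.

```python
def resolved_library_path(ldd_output, soname):
    """Return the path ldd resolved for soname, or None if it is not listed."""
    for line in ldd_output.splitlines():
        parts = line.split()
        # Expected shape: "libfoo.so => /path/to/libfoo.so (0x...)"
        if len(parts) >= 3 and parts[0] == soname and parts[1] == "=>":
            return parts[2]
    return None


def uses_local_build(ldd_output, soname="libcamera_app.so", prefix="/usr/local/lib/"):
    """True if soname resolves to a library under the local install prefix."""
    path = resolved_library_path(ldd_output, soname)
    return path is not None and path.startswith(prefix)


good = "    libcamera_app.so => /usr/local/lib/libcamera_app.so (0x76f27000)"
# Assumed example of the state after a package install; the address is invented.
bad = "    libcamera_app.so => /usr/lib/arm-linux-gnueabihf/libcamera_app.so (0x76f00000)"
print(uses_local_build(good))  # True
print(uses_local_build(bad))   # False
```

The text to feed in would be the output of ldd /usr/local/bin/libcamera-still; a False result after a package install would match the behaviour described above.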

Secondly, I have managed to see similar image corruption. In my case I set it going and all the images were fine until the 793rd, which was all blue/magenta. Thereafter I seemed to get one every 30 to 100 images. I can't see that any different processing has been done to the image. They look entirely like problems of some sort on the bus, as though the first row of pixels is getting lost, which flips the Bayer pattern and gives rise to just this sort of effect.

I've managed to get a "good" and "bad" raw file too, and they're obviously very different, which certainly adds weight to that theory.
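To illustrate why a lost first row would turn the picture magenta, here is a toy model, not the actual ISP pipeline. It builds a BGGR mosaic (the order reported by the imx219 modes listed earlier) of a uniformly coloured scene, drops the first row, and then averages each colour plane as if the data still started with a B G row. The scene values and function names are invented for the example.

```python
def mosaic_bggr(r, g, b, rows, cols):
    """Build a raw BGGR mosaic of a uniform (r, g, b) scene."""
    pattern = ((b, g), (g, r))  # row 0: B G ..., row 1: G R ...
    return [[pattern[y % 2][x % 2] for x in range(cols)] for y in range(rows)]


def decode_as_bggr(raw):
    """Average each colour plane, assuming raw starts with a B G row."""
    names = (("b", "g"), ("g", "r"))
    totals = {"r": [0, 0], "g": [0, 0], "b": [0, 0]}  # [sum, count]
    for y, row in enumerate(raw):
        for x, value in enumerate(row):
            acc = totals[names[y % 2][x % 2]]
            acc[0] += value
            acc[1] += 1
    return tuple(totals[c][0] / totals[c][1] for c in ("r", "g", "b"))


good = mosaic_bggr(r=60, g=200, b=40, rows=5, cols=4)
bad = good[1:]  # first row lost on the bus: the data now starts with G R

print(decode_as_bggr(good))  # (60.0, 200.0, 40.0)
print(decode_as_bggr(bad))   # (200.0, 50.0, 200.0)
```

With the row missing, the strong green samples are read as red and blue while the green plane receives the weaker red and blue samples, which is the magenta cast seen in the bad frames.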

naushir commented 2 years ago

@davidplowman since you can reproduce this, would you be able to try core_freq_min=350 (or even higher) in your config.txt file?

Edit: Perhaps force_turbo=1 might also be worth trying instead.

davidplowman commented 2 years ago

My Pi 3B fails to boot with force_turbo=1. I tried core_freq_min=500 but I still get some bad pictures.

knutegit commented 2 years ago

davidplowman:

Sorry I replied directly to naushir when he asked me to try the core_freq_min=350. It still generates bad images. I'll try some other options today and report back.

davidplowman commented 2 years ago

Going back to the subject of the segfaults for a moment, I think we have a proper fix for it, explained here: https://forums.raspberrypi.com/viewtopic.php?p=1958181#p1958181

knutegit commented 2 years ago

I tried force_turbo=1 and mine won't boot either. I tried a general overclock that I found here:

https://bennysbitsandbytes.wordpress.com/2020/08/28/overclocking-the-raspberry-pi-3-b/

arm_freq=1500
gpu_freq=600
over_voltage=6
temp_limit=80
sdram_freq=600
sdram_schmoo=0x02000020
over_voltage_sdram_p=6
over_voltage_sdram_i=4
over_voltage_sdram_c=4
total_mem=1023
hdmi_drive=2

That made no difference.

knutegit commented 2 years ago

I tried the same SD card on a Pi Zero 2 W with mixed results, because it just doesn't have enough oomph to run Chromium and anything else, but I did get one purple image. I recompiled libcamera-apps on a Pi2B on the same SD card and it has the same purple image problem. None of the Pi4s ever have purple images.

But I discovered something interesting, I only have the purple images with the version 2.1 camera. The version 1.3 camera does not show purple images.

On another more confusing note, the Pi4 1.4 that would hang on the continuous output to stdout doesn't hang with the 2.1 camera. This makes no sense.

Thanks for the note on the cmake. That solves another confusing problem.

naushir commented 2 years ago

> But I discovered something interesting, I only have the purple images with the version 2.1 camera. The version 1.3 camera does not show purple images.

This sort of backs the theory of dropped CSI2 packets causing the glitch. The v1 camera needs less throughput, so it does not drop packets on the bus. Now to understand why this is happening...

knutegit commented 2 years ago

I keep thinking that it has to have something to do with the processor changing speed, at least that's what it looks like when I can make it go purple. So I tried playing again with the force_turbo=1 option and found that I cannot boot the Pi3 with a monitor hooked up to the HDMI port with force_turbo=1. I can, however, get it to boot if there is no monitor plugged in. I tried that and, watching it with VNC, couldn't get it to give me a purple image with turbo on, but I can without the turbo. So I've got it running now with my motion detection program, which usually shows a purple image a couple of times a day. I'll report back on this in a day or two.

knutegit commented 2 years ago

Unlike the pictures from the small program above, my motion detection program would throw a purple image and then several more, lighter and lighter, until they were completely white. I would then restart my program and the images would be good again for a while. After setting force_turbo=1 on my Pi yesterday it took good images until a few minutes ago. When I looked, the image was severely washed out. I stopped the program and restarted it and the image is good again. Screenshots below. The fact that I didn't get a rapid change that caused the images to be captured leads me to believe that there wasn't a purple image, and that the purple image and the fading image may be two different manifestations of a problem or problems.

2022-01-08-145305_1920x1080_scrot 2022-01-08-145359_1920x1080_scrot

This webpage shows sample images from a while back of the purple and then fade:

http://knutejohnson.com/libcamera.html

knutegit commented 2 years ago

I got another purple image this morning but it went back to normal right after so I've left the program running. There was no fade this time. That is an improvement over running without force_turbo.

20220109080623834 20220109080624747

On another note, it appears that there isn't the same range of auto exposure that there was in raspistill. In bright sun the photos look a little lighter and in lower light they look a little darker.

naushir commented 2 years ago

Thanks for all the analysis!

I've got a setup that reproduces this fairly easily now as well. Unfortunately, I still have not been able to come up with a solution yet, but I do have a few ideas that I will be trying out this week.

naushir commented 2 years ago

It seems that force_turbo=1 gets rid of the pink frames for me as well. I do not see the "washed out" image, but that may be a separate AGC problem.

This sort of implies that clock changes are occurring while our camera is operating, and causing glitches on the CSI2 bus.

knutegit commented 2 years ago

I left my motion detection program running over the weekend until today and I'm still getting purple images, purple images followed by washed-out images and sometimes just bad exposures. I did have trouble causing it to make a purple image with force_turbo=1 but it hasn't eliminated the problem. Not sure if there is anything left for me to experiment on, so I'm going to put my motion detection program back on the legacy camera stack for now.

naushir commented 2 years ago

@knutegit thanks for providing all the debug info for this issue. I am becoming more convinced that it is a clock change in the system that causes the pink frame, but which clock and why it only occurs with a Pi3B + imx219 is still a mystery.

I have a small delay (100ms) added in the unicam kernel driver when we switch on the clock, before streaming on the sensor. This seems to fix my issue, but I need to do much more testing before conclusively saying this is a fix.

naushir commented 2 years ago

After a few weeks of head-scratching, I think I may have a solution for this in https://github.com/raspberrypi/linux/pull/4834. Still unclear of the exact mechanics of what is going wrong, but with that change, I cannot seem to reproduce the pink frame over many thousands of images.

knutegit commented 2 years ago

So this is in the kernel? When will it be in rpi-update?

naushir commented 2 years ago

Yes, it's in the kernel driver. @popcornmix can you merge to rpi-update when you get a chance?

knutegit commented 2 years ago

Has this made it to rpi-update yet?

naushir commented 2 years ago

Not yet unfortunately.

@popcornmix can you merge https://github.com/raspberrypi/linux/commit/59aeb16c7f1254f1383476956dda0766d10c918a into rpi-update when you get a chance please?

naushir commented 2 years ago

The fix is now available in rpi-update. @knutegit can you update and let me know if this issue is resolved for you please?

knutegit commented 2 years ago

I ran an rpi-update this morning and haven't seen a pink image yet. The program is running, I'll post again in the morning if I see any.

naushir commented 2 years ago

Great! I'll resolve this issue for now. Please re-open if you see any of these problems again.

knutegit commented 2 years ago

I left it running all yesterday afternoon and last night. I captured no pink images but the image was faded to all white this morning. This goes along with the darkened images that I think now are related to the white images. I'll post another bug when I can make it fail.

knutegit commented 2 years ago

There was a kernel update a couple of days ago and the purple images are back.

Linux camerapi 5.10.103-v7+ #1530 SMP Tue Mar 8 13:02:44 GMT 2022 armv7l

The fixes didn't get in this new kernel? Do I have to go back to the older kernel? Still running the code I compiled in late January. Should I compile again?

naushir commented 2 years ago

Our latest kernel obtained with an rpi-update is on version 5.15. I've checked and that does have the fix. Is this a custom kernel obtained from elsewhere? If so, would you be able to try the 5.15 kernel through rpi-update? Please do back up your SD card before updating, just in case!

knutegit commented 2 years ago

Stock kernel from apt upgrade. It was working fine until the new kernel, and I thought there had been other kernel updates in the last month, but I could be wrong about that. I did an rpi-update and so far there are no purple images. I'll keep you posted on that.

Is the timelapse fix in the repository now? If it is I should be able to use everything from there instead of compiling my own?

naushir commented 2 years ago

> Is the timelapse fix in the repository now? If it is I should be able to use everything from there instead of compiling my own?

Yes, I think all fixes should be in the apt repo now.

knutegit commented 2 years ago

Running rpi-update to get the new kernel solved this problem again. When is this fix going to get into the regular release kernel?

naushir commented 2 years ago

I'm surprised it's not in already. Will make sure it gets in as soon as the next release takes place.