xamarin / XamarinCommunityToolkit

The Xamarin Community Toolkit is a collection of Animations, Behaviors, Converters, and Effects for mobile development with Xamarin.Forms. It simplifies and demonstrates common developer tasks building iOS, Android, and UWP apps with Xamarin.Forms.
MIT License

[Enhancement] [CameraView] Allow access to the live camera feed in CameraView #1096

Closed mavispuford closed 3 years ago

mavispuford commented 3 years ago

Summary

With on-device machine learning libraries like Google's ML Kit becoming popular, it's important that we have a way to access the live camera feed so that image processing can happen in real time.

API Changes

Shared Code Changes:

This would require that a new CameraCaptureMode value be added:

public enum CameraCaptureMode
{
    Default,
    Photo,
    Video,
    Preview // New capture mode for just showing the camera preview
}

Android Changes:

NOTE: For Android, this is no longer the proposed solution as it is dependent on the screen resolution of the device, which doesn't really make sense (especially for low-res devices). See the discussion in the comments. I'll update this issue if the new solution works out in testing.

CameraFragment.android.cs

When the CameraCaptureMode is set to Photo, CameraTemplate.Preview is used in CameraFragment. That would likely have to change to CameraTemplate.StillCapture. This seems more appropriate anyway, according to Google's documentation (it prioritizes image quality over frame rate). Then we could use CameraTemplate.Preview when CameraCaptureMode is set to Preview.

If the CameraCaptureMode is set to Preview, this could be checked in TextureView.ISurfaceTextureListener.OnSurfaceTextureUpdated(SurfaceTexture surface). Then a call could be made to an implementation of the following interface (which would be registered in DependencyService by a consumer of XCT):

public interface ICameraPreviewProcessor
{
    Task Process(Bitmap bitmap, int rotationDegrees);
}

What OnSurfaceTextureUpdated() might look like:

async void TextureView.ISurfaceTextureListener.OnSurfaceTextureUpdated(SurfaceTexture surface)
{
    if (cameraTemplate == CameraTemplate.Preview && cameraPreviewProcessor != null)
    {
        // This using ensures the Bitmap is disposed promptly rather than waiting for garbage collection
        using var bitmap = texture?.Bitmap;
        if (bitmap == null)
        {
            return;
        }

        await cameraPreviewProcessor.Process(bitmap, GetDisplayRotationDegrees());
    }
}

Also, in CameraFragment.PrepareSession(), we would make sure that only the previewSurface is added to the capture session.

iOS Changes:

FormsCameraView.ios.cs

In InitializeCamera(), we can add an AVCaptureVideoDataOutput output to captureSession with a custom sample buffer delegate whose sole purpose is to take the sample buffer and pass it along to the registered iOS implementation of ICameraPreviewProcessor (same name as the Android variant, but different interface):

public interface ICameraPreviewProcessor
{
    Task Process(CMSampleBuffer sampleBuffer, AVCaptureVideoOrientation orientation);
}

Windows Changes:

I admittedly haven't looked into Windows yet. I'm hoping it's not much different from the other two platforms.

Intended Use Case

The consumer of Xamarin Community Toolkit would implement the ICameraPreviewProcessor interfaces for each platform to be able to hook into the live camera feed. Each frame could then be passed on to other libraries for processing.

Who Will Do The Work?

MattiaDurli commented 3 years ago

I tested your fork in my Android app instead of the main XCT, and got it working using the CameraPreviewProcessor.

It works as expected with CaptureMode="Preview", but with the standard CaptureMode="Photo" the view just shows black. You can reproduce it in the standard XCT Sample app.

The preview bitmap is 960*720. Is there a setting for a higher resolution?

Thanks

mavispuford commented 3 years ago

Hi @MattiaDurli, I can try to take a look at the black photo thing when I get some time in the next week (I just tested it real quick in the XCT Sample app and it works fine on my Pixel 4 XL).

Regarding the resolution, this is just pulling from the live camera preview surface that is rendered on screen. The main purpose of this mode would be for on-device live camera processing with ML Kit etc. For text recognition, Google's input image guidelines recommend a 640x480 image for business card type documents and 720x1280 for a letter-sized piece of paper. In the test app I created for this, I was able to get input images of 1570*2023 on my Pixel 4 XL which was more than enough for my use case.

I suppose we could look into pulling the preview camera feed from something else to get a higher resolution image, but I'd have to look at what the other options are. I'm definitely not an expert in this area, so if anybody has some suggestions, I'd love to have them. I just couldn't find any out of the box options out there so I figured I'd try modifying the XCT CameraView to suit my needs.

One thing to consider for these on-device ML libraries is that performance is important and if your input image is too big, your performance will drop.

mavispuford commented 3 years ago

Looking into it more, it seems like the more appropriate solution for Android might be to add another Surface/ImageReader to read in the frames that way. Then we could set our own width/height etc.

MattiaDurli commented 3 years ago

> Hi @MattiaDurli, I can try to take a look at the black photo thing when I get some time in the next week (I just tested it real quick in the XCT Sample app and it works fine on my Pixel 4 XL).

Hello @mavispuford, the issue with CaptureMode="Photo" is that the on-screen preview is black; I haven't tried to take a photo. I'll do some more tests tomorrow with my Pixel 3a, using the XCT Sample for reference.

> Regarding the resolution, this is just pulling from the live camera preview surface that is rendered on screen. The main purpose of this mode would be for on-device live camera processing with ML Kit etc. For text recognition, Google's input image guidelines recommend a 640x480 image for business card type documents and 720x1280 for a letter-sized piece of paper. In the test app I created for this, I was able to get input images of 1570*2023 on my Pixel 4 XL which was more than enough for my use case.

For now, what I'm trying to do is evaluate the preview frame bitmap to make sure the text/barcode is in focus. The user will be allowed to take a photo only when it is. So a low-res preview frame is OK for my evaluation, but I was looking for something a little higher than the 720x960 I'm getting. I've tried the sample on a Zebra TC21, an industrial device with a screen resolution of 720x1280 (so the same number of rows as the frame I get). I'm wondering whether it is possible to get a higher-resolution preview frame on such a device, or whether the preview frame always matches the screen resolution.

> I suppose we could look into pulling the preview camera feed from something else to get a higher resolution image, but I'd have to look at what the other options are. I'm definitely not an expert in this area, so if anybody has some suggestions, I'd love to have them. I just couldn't find any out of the box options out there so I figured I'd try modifying the XCT CameraView to suit my needs.

Same here, I'm trying to find a solution by studying the XCT source and your fork.

> One thing to consider for these on-device ML libraries is that performance is important and if your input image is too big, your performance will drop.

Of course, I'm trying to get from 720 to 1080, then it would be enough.

I'll also run some tests to see whether it is possible to switch from preview to photo mode, or to process the frames and still take the high-res photo when the shutter event fires.

mavispuford commented 3 years ago

@MattiaDurli - I've created a new branch that approaches this differently for Android. Here's the commit so you can see what changed.

Instead of piggybacking off of the on-screen preview texture in OnSurfaceTextureUpdated(), I'm using a separate ImageReader with its own IOnImageAvailableListener implementation. This seems to work in my initial testing, but I haven't had the chance to try consuming it on the other side of the ICameraPreviewProcessor. This method allows us to set our own dimensions for the image reader, though, so that should help you with your resolution problem.

Let me know what you think.

Ps. It's still using the same sizing as the preview for now, so if you want to tweak that, just change the ImageReader width/height in SetupPreviewImageReader().

MattiaDurli commented 3 years ago

@mavispuford - First of all, thanks: with some mods to your previous branch I've been able to accomplish what I needed. I can process the live feed frames (Android.Graphics.Bitmap) in real time, and when the processing result is OK I can take the high-res picture.

I have now updated everything to your latest branch, and just had to move the initializer of _cameraPreviewProcessor to the second constructor, because the first wasn't called.

public OnPreviewImageAvailableListener()
{
    _cameraPreviewProcessor = DependencyService.Get<ICameraPreviewProcessor>();
}

public OnPreviewImageAvailableListener(Context context)
    : base()
{
    _context = context;
    _cameraPreviewProcessor = DependencyService.Get<ICameraPreviewProcessor>();
}

The Android.Media.Image that is captured is now at a higher resolution, but I'm stuck at the conversion of the Android.Media.Image to Android.Graphics.Bitmap (in the previous branch ProcessCamera produced a Bitmap). Any hint on how to do it?

Also, the Photo mode of the CameraView produces a byte[] that is a JPG. Before sending it to the server I want to crop it, so I have to decode, crop and re-encode. I see that I can set the ImageFormat to ImageFormatType.Yuv420888 for the StillPicture instead of ImageFormatType.Jpeg, like you did for the preview, but then what kind of data would I get? An Image? A Bitmap?

So far, for me it would be great to be able to get an Android.Graphics.Bitmap for both frame and photo.

Also, there are a lot of Camera2 API features not exposed by the XCT CameraView, which is a pity.

I'll experiment more and let you know.

mavispuford commented 3 years ago

@MattiaDurli - It's really good to hear that you were able to get that piece working. And good call on the initializer thing, I wasn't able to test an implementation of ICameraPreviewProcessor yet. I think I meant to put : this() in that second constructor instead of : base(). Oops.
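To illustrate the difference between : this() and : base() mentioned above, here is a minimal sketch. The Listener class and its string properties are hypothetical stand-ins (not the actual OnPreviewImageAvailableListener), chosen so the example compiles without Xamarin.

```csharp
using System;

// Hypothetical stand-in for the listener; plain strings replace the
// processor and context so the sketch has no Xamarin dependency.
class Listener
{
    public string Processor { get; } = "";
    public string Context { get; } = "";

    public Listener()
    {
        // Shared initialization lives in the parameterless constructor.
        Processor = "resolved";
    }

    // ": this()" runs the parameterless constructor of this class first.
    // ": base()" would only run the base class constructor, so the
    // initialization above would be skipped.
    public Listener(string context) : this()
    {
        Context = context;
    }
}
```

With this chaining, new Listener("ctx") ends up with both Processor and Context set, so the initializer does not need to be duplicated in each constructor.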

> The Android.Media.Image that is captured is now at a higher resolution, but I'm stuck at the conversion of the Android.Media.Image to Android.Graphics.Bitmap (in the previous branch ProcessCamera produced a Bitmap). Any hint on how to do it?

I'm not quite sure, actually. For my use case, I've been able to pass either a Bitmap OR an Image off to ML Kit for processing without having to do any conversions. Searching around, it seems like you have to do a YUV to RGB conversion. This Stack Overflow answer might help. They link to a tensorflow project that seems to do what you need (it's Java code, though).
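As a rough idea of what such a YUV to RGB conversion involves, here is a minimal sketch in plain C#. It is not the code from the linked answer or from the tensorflow project. YuvConverter and its parameters are hypothetical names, the three planes are assumed to have already been copied into byte arrays, and full-range BT.601 coefficients are an assumption about the camera output. The strides are explicit parameters because YUV_420_888 planes can have padded rows and interleaved chroma samples, and ignoring them is one common way to end up with a scrambled image.

```csharp
using System;

static class YuvConverter
{
    // Converts three YUV 4:2:0 planes to packed ARGB, one int per pixel.
    // The Y plane is assumed to have a pixel stride of 1; the chroma planes
    // are subsampled by 2 in both directions.
    public static int[] ToArgb(
        byte[] y, byte[] u, byte[] v,
        int width, int height,
        int yRowStride, int uvRowStride, int uvPixelStride)
    {
        var argb = new int[width * height];
        for (int row = 0; row < height; row++)
        {
            for (int col = 0; col < width; col++)
            {
                int yValue = y[row * yRowStride + col];
                // One chroma sample covers a 2x2 block of luma samples.
                int uvIndex = (row / 2) * uvRowStride + (col / 2) * uvPixelStride;
                int uValue = u[uvIndex] - 128;
                int vValue = v[uvIndex] - 128;

                // Full-range BT.601 coefficients (an assumption, see above).
                int r = Clamp((int)(yValue + 1.402 * vValue));
                int g = Clamp((int)(yValue - 0.344136 * uValue - 0.714136 * vValue));
                int b = Clamp((int)(yValue + 1.772 * uValue));

                argb[row * width + col] =
                    unchecked((int)0xFF000000) | (r << 16) | (g << 8) | b;
            }
        }
        return argb;
    }

    static int Clamp(int value) => value < 0 ? 0 : (value > 255 ? 255 : value);
}
```

On Android, the resulting array should be usable with Bitmap.CreateBitmap(colors, width, height, Bitmap.Config.Argb8888). A per-pixel managed loop like this is likely too slow for large frames, so it is better suited to checking correctness than to production use.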

The reason I changed the Process() signature in ICameraPreviewProcessor is because that's how the frames come out of the ImageReader and I figured I'd let the consumer do any necessary conversions.

mavispuford commented 3 years ago

I updated the LiveCameraFeed-Surface branch. CameraView now has a PreviewFrameQuality bindable property. This allows you to choose between Quality and Performance from the XAML side. I'm not quite sure what the different resolutions for those should be, but for now, Performance sets the minimum resolution to 600x800 and Quality mode sets the minimum to 720x1280. Also for now, it's Android-only but I plan to use that in iOS etc. too.

MattiaDurli commented 3 years ago

Thanks @mavispuford, but I'm still stuck with Android.Media.Image instead of Android.Graphics.Bitmap as the result of CameraProcess. It can't be that hard; there must be something I'm obviously missing. With Bitmap, I was able to process the frame with Emgu/OpenCV, save it to disk, and send it to the server as a byte array/JPG/PNG. I can't find a way to process the Image. I tried to convert Image to Bitmap following various Stack Overflow suggestions, but the final result is garbage. Still, Image and Bitmap are different representations of a picture on the same system, so there must be an easy way to convert between them.

mavispuford commented 3 years ago

@MattiaDurli Sorry to hear you're still having issues with the format conversion. Have you come across this article in your research? It looks pretty well put together (including benchmarks for each solution). It's crazy that Google doesn't have any convenient APIs for doing this conversion.

MattiaDurli commented 3 years ago

Hi @mavispuford, yes, I've already been through that article and converted everything to C#, but the resulting image is garbage. I've posted here https://stackoverflow.com/questions/66968240/convert-from-android-media-image-to-android-graphics-bitmap-possibily-with-xama to ask for help, because none of the other answers on Stack Overflow seem to work. I tried them all.

I see that you set Yuv420888 for the new reader, would it be possible to set it to another format to be able to retrieve a Bitmap from CameraProcess?

previewReader = ImageReader.NewInstance(previewSize.Width, previewSize.Height, ImageFormatType.Yuv420888, 1);

jfversluis commented 3 years ago

Yes! I would love to have this and looks like you already have a great piece of the work done. Love that :)

I'll let some other team members chime in a little on this. Since you already have something working, you might put it in a PR targeting develop so we can test it for a bit too. Only if that is not too much trouble

pictos commented 3 years ago

@mavispuford do you think using CameraX would make this easier to implement/handle/maintain?

mavispuford commented 3 years ago

@MattiaDurli - The main reason I chose Yuv420888 is because Google says that's the most efficient format (see the image parameter under fromMediaImage()) to use for ML Kit. I suppose we could make it configurable somehow. We definitely don't want to make this feature only work for a single use case.

@jfversluis Good to hear! I'll try to get a PR out for review, though it still needs a bit of refinement (I noticed a couple things in my testing that I want to confirm).

@pictos It'd probably be better in the long run to use CameraX. Performance-wise, Google says Camera (not Camera2 or CameraX) is the best API to use. Since we're already using Camera2 here, that performance ship has already sailed.

MattiaDurli commented 3 years ago

@mavispuford I searched through that same doc, and all other sources say that Yuv420888 is the way to go for both performance and reliability (other formats are not guaranteed to be supported on all cameras). So that's OK. ML Kit probably takes care of all the conversion work, but there definitely should be a helper to convert the YUV image to an easier-to-manage format for other processing libraries.

mavispuford commented 3 years ago

@jfversluis, @pictos, @MattiaDurli

Alrighty, I'm kind of in a dilemma here. As mentioned above, I went down the path of using an ImageReader/IOnImageAvailableListener combo which is much better than my first approach. However, in my testing with ML Kit, it was having a ton of performance issues (lots of UI jank as soon as ML Kit would process anything). I tried several things (different threading approaches, adjustments to input image size, etc.) and they all had pretty terrible performance. We're talking frame rate drops that would freeze the camera feed in the UI for 800-1400 ms depending on the size of the image. One weird thing to point out, though, is that the jank only happened in the CameraView and not on any Xamarin Forms UI in my experience.

After hitting that wall, I decided to try moving over to CameraX which actually solved all my performance issues. No frame drops at all during processing, the camera feed is buttery smooth. However, as you might know, the CameraX libraries are still in beta ("They are ready for production use but may contain bugs."). On top of that, they only seem to officially support photo capture at the moment. There are APIs in CameraX for video (see blog post here) but they are restricted and require you to silence the compiler with a @SuppressLint("RestrictedApi") annotation.

The interesting thing is CameraX still leverages Camera2 behind the scenes. It's basically a layer of abstraction on top of Camera2 that handles lifecycle events and other things for you. But they must be doing something different (Java threading magic?) that makes the performance so much better when using their IAnalyzer pattern.

So here are my options:

pictos commented 3 years ago

@mavispuford is your CameraX implementation on par with our Camera2 implementation? If so, you can send a PR using CameraX. If not, we need to first migrate our code to CameraX and accept your PR right after. Even in beta, we think CameraX could bring more value to the control.

MattiaDurli commented 3 years ago

@mavispuford some updates on my side: my primary issue was how to deal with the YUV420888 image produced by Camera2. It can be passed to ML Kit, but if you want to use it with other libraries like OpenCV, or just save it as a Bitmap, you have to convert it, and all the examples I found, even the Emgu conversion, were producing garbage. Then I changed the camera preview resolution, and it worked. The answer is that while the Camera2 API on different devices reports dozens of supported preview sizes to choose from, not all of them work (this is widely reported). I started with 1440x1080 and it was not working on my device. Then I tried another device and it worked. I went on by trial and error and found that some sizes work and some don't. 640x480 and 800x600 work on all 4 devices I tested. It doesn't depend on aspect ratio. On the one I need to develop on, 1600x1200 works, but not 1440x1080.
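One way to work with that finding is to keep a per-device list of sizes found to be broken and pick the closest usable size to the one you want. The following is only a sketch of that idea in plain C#. PreviewSizePicker and its parameters are hypothetical names, and "closest" is assumed to mean closest in pixel count.

```csharp
using System;
using System.Collections.Generic;

static class PreviewSizePicker
{
    // Returns the reported size whose pixel count is closest to the target,
    // skipping sizes that were found by testing not to work on this device.
    public static (int Width, int Height) PickClosest(
        IEnumerable<(int Width, int Height)> reported,
        ISet<(int Width, int Height)> knownBad,
        (int Width, int Height) target)
    {
        (int Width, int Height)? best = null;
        long bestDistance = long.MaxValue;
        long targetArea = (long)target.Width * target.Height;

        foreach (var size in reported)
        {
            if (knownBad.Contains(size))
            {
                continue;
            }

            long distance = Math.Abs((long)size.Width * size.Height - targetArea);
            if (distance < bestDistance)
            {
                best = size;
                bestDistance = distance;
            }
        }

        return best ?? throw new InvalidOperationException("No usable preview size.");
    }
}
```

For example, with reported sizes 640x480, 1440x1080 and 1600x1200, 1440x1080 marked as bad, and a target of 1440x1080, this returns 1600x1200.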

So I continued to develop my solution with your Camera2 implementation, using Emgu as the processing library (it processes every frame by cropping it and applying a convolution to detect whether it is blurry), and I consistently get 28 fps at 1200x960 and 23 fps at 1600x2000, which is great for me. I haven't experienced any of your camera feed problems; for me your implementation of the Camera2 API works very well.
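For readers unfamiliar with convolution-based blur checks, a common variant is the variance of the Laplacian: sharp images have strong edges, so the filter response varies a lot, while blurry images give a flat response. The sketch below shows that technique in plain C# without Emgu. It is not necessarily the kernel used above, FocusMeasure is a hypothetical name, and the threshold would have to be tuned per device and use case. The input is a row-major grayscale buffer, for which the Y plane of a YUV frame could serve once any row padding is removed.

```csharp
using System;

static class FocusMeasure
{
    // Variance of the 4-neighbour Laplacian over a grayscale image
    // (row-major, one byte per pixel). Border pixels are skipped.
    public static double LaplacianVariance(byte[] gray, int width, int height)
    {
        if (width < 3 || height < 3)
        {
            throw new ArgumentException("Image must be at least 3x3.");
        }

        double sum = 0, sumOfSquares = 0;
        int count = 0;
        for (int y = 1; y < height - 1; y++)
        {
            for (int x = 1; x < width - 1; x++)
            {
                int i = y * width + x;
                int laplacian = gray[i - width] + gray[i + width]
                              + gray[i - 1] + gray[i + 1]
                              - 4 * gray[i];
                sum += laplacian;
                sumOfSquares += (double)laplacian * laplacian;
                count++;
            }
        }

        double mean = sum / count;
        return sumOfSquares / count - mean * mean;
    }

    // The threshold is application-specific and must be found empirically.
    public static bool IsSharp(byte[] gray, int width, int height, double threshold)
        => LaplacianVariance(gray, width, height) >= threshold;
}
```

A uniformly gray frame yields a variance of 0, and the value grows as the frame contains more high-contrast detail.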

While I was trying to solve my YUV conversion problem, I looked through CameraX documentation, but it seemed "very" beta to me, and missing some features.

What I'm still trying to do with your implementation is change the focus type. I need to read at a distance of a few cm (just like barcodes) and it takes some time to get focus. Using CONTROL_AF_MODE_MACRO should help, but it's quite tricky to implement. Unlike CONTROL_AF_MODE_CONTINUOUS_PICTURE, which you set once at the beginning, with macro you have to send AF_TRIGGER requests continuously and check the status. I wonder if CameraX helps with that.

pictos commented 3 years ago

@qz2rg4 do you have something to add to this discussion?

mavispuford commented 3 years ago

@pictos My CameraX implementation doesn't have feature parity with the current Camera2 code since I only set up image analysis, but image capture could easily be added. The biggest problem would be video capture, as that's not an official feature of CameraX at the moment. I'm not even sure if those classes are fully exposed through the bindings.

@MattiaDurli Interesting find. It's possible that CameraX would help with the resolution problem. You can specify the target resolution and it finds the closest match that works with the device. I'm not sure about the focus issue, though. I see that you can call StartFocusAndMetering() but it looks like it only accepts a point and doesn't let you control the focus distance:

var point = _previewView.MeteringPointFactory.CreatePoint(xPos, yPos);
var action = new FocusMeteringAction.Builder(point)
                        .DisableAutoCancel()
                        .Build();
camera.CameraControl.StartFocusAndMetering(action);

brminnick commented 3 years ago

Thanks! However, we are no longer adding new features to Xamarin Community Toolkit and are focusing on the .NET MAUI Community Toolkit instead.

Please open a New Feature Discussion to implement this feature in the .NET MAUI Community Toolkit.

I've posted more information about the Future Of Xamarin Community Toolkit here: https://devblogs.microsoft.com/xamarin/the-future-of-xamarin-community-toolkit/?WT.mc_id=mobile-0000-bramin