lemariva / micropython-camera-driver

add camera support to MicroPython
https://lemariva.com
Apache License 2.0
462 stars · 99 forks

Want to join forces for new camera API? #99

Open cnadler86 opened 1 month ago

cnadler86 commented 1 month ago

Hi there! First, thank you for providing a good starting point for the MicroPython camera driver; it helped me start building an improved version, with the goal of also supporting other ports (e.g. RP) and maybe integrating the API into MicroPython's source code in the future.

I believe that open-source work backed by a broad community can achieve incredible things, because of the wide expertise a community can provide. So if you are interested in contributing to the new API, let me know. Or, if you prefer to just reference the new API in your repo, I would very much appreciate it.

kdschlosser commented 6 days ago

Take a look at what I have done with making a common API for LCD displays.

Cameras are much like LCD displays in that you have a bus and then you have the camera; the bus is the type of connection to the camera. I have written the bus drivers in C code

https://github.com/lvgl-micropython/lvgl_micropython/tree/main/ext_mod/lcd_bus

and the display drivers are written in Python.

https://github.com/lvgl-micropython/lvgl_micropython/tree/main/api_drivers/common_api_drivers/display

I believe that the same kind of a design would work well for cameras.
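For cameras that might look something like this; a rough sketch only, with all class and method names made up to illustrate the layering, not an existing API:

class DVPBus:
    # In the real design this would be a C module; it is shown in Python
    # here only to illustrate the interface a camera driver would talk to.
    def __init__(self, data_pins, pclk, vsync, href, xclk):
        self.data_pins = data_pins
        self.pclk = pclk
        self.vsync = vsync
        self.href = href
        self.xclk = xclk

    def read_frame(self, buf):
        raise NotImplementedError   # would DMA one frame from the sensor into buf


class OV2640:
    # Camera-specific driver, written in Python and attachable to any bus.
    def __init__(self, bus, i2c, addr=0x30):
        self.bus = bus
        self.i2c = i2c     # SCCB/I2C control channel for register setup
        self.addr = addr   # 0x30 is the OV2640's usual 7-bit SCCB address

    def capture(self, buf):
        self.bus.read_frame(buf)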

cnadler86 commented 6 days ago

I quickly read the documentation and, indeed, it is an interesting approach.

I will wait until another port comes into play and then see how to integrate it. At the moment I use the already-available esp32-camera driver (they did something there similar to what you did, just in C). And the problem is, I don't have the capacity at the moment to write or integrate a generic driver. 🙃

kdschlosser commented 6 days ago

I just started thinking about writing a driver much like this one, but one that exposes more of the features available in the esp32-camera component and has a class that a reference can be held to. This way it can be properly cleaned up and release any resources it uses in the event it goes out of scope and gets GC'd.
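Roughly like this, as a sketch (in a real C module this would be a MicroPython finaliser rather than __del__, and all of the names here are made up):

class Camera:
    def __init__(self, bus):
        self.bus = bus
        self._buf = bytearray(320 * 240 * 2)  # e.g. one RGB565 QVGA frame

    def deinit(self):
        # release frame buffers, DMA channels, the bus, etc.
        # safe to call more than once
        self._buf = None

    def __del__(self):
        # runs when the object goes out of scope and gets GC'd
        self.deinit()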

cnadler86 commented 5 days ago

You mean something like this? This would be at least a subset.

I am sure the right way to build a general camera API would be to abstract the components into reusable ones (as far as possible) across the different ports, in order to have as little redundancy as possible. I assume this is more or less what you are thinking about?

My current approach is the low-effort one, let's say the quick and dirty one: taking pre-built blocks and integrating them into MicroPython with a wrapper. For this I only need a reusable API at the highest level and a defined interface for the wrappers. But I rely on the availability of the lower-level drivers. That's why I only have ESP32 support at the moment.

But assuming a general, target-agnostic driver were out there, life would be much easier 🙃

And on the GC: I don't see a practical use case where you really need GC on the camera besides playing around with it. Normally you only have one camera object, and you don't create another one because the last one was thrown away. You can have one and configure it with its properties or methods, and if really needed, you can manually free the resources by calling a method or by binding the free-resources method to the delete event (that's how I handle the resources).
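For illustration, the usage pattern I mean is simply something like this, reusing the hypothetical Camera sketched above and assuming it also has capture() and set_saturation() methods:

cam = Camera(bus)        # one camera object for the whole application
cam.set_saturation(1)    # configure via properties/methods
try:
    frame = cam.capture()
finally:
    cam.deinit()         # free the resources deterministically, no GC needed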

Let me know if your plans get more concrete; then perhaps we can move the discussion to my repo (or another channel) and not spam our friend here with our thoughts 🤪

kdschlosser commented 5 days ago

I have been looking at the code for the esp32-camera component for the ESP32, and it should have been written in a similar fashion to the displays. However, it wasn't, and that is really limiting. The good news is that I can write a new driver for MicroPython that separates the code that is specific to a camera from the code that handles the connection between the camera and the ESP32. The benefit of doing this is that the camera-specific code becomes a runtime choice instead of a compile-time one like it is currently. That doesn't mean all of the camera code will always be loaded onto the ESP32; that is not the case with what I plan on doing. The camera code will be located in Python, so you only upload the drivers you plan on using. The bus drivers will be in C code, much like the display binding I wrote for MicroPython.

I am going to do some more digging into the drivers for the cameras. If they use an I8080 parallel interface, the camera might be able to share the same I8080 bus with a display. I do know that the I8080 spec has a CS line to select the device in use; I just don't know whether cameras support it or not.

On ESP32s that have 2 cores, I am going to set up the drivers so that most of the work is performed on core 0. Typically core 1 is used for application code and core 0 for IO-related code; MicroPython (the application) runs on core 1 on a dual-core ESP32. This will help speed things up.

MicroPython is written to run as a single task (thread), so if I want code to run I need to add it to the MicroPython scheduler, which runs when the main task has free time. When a frame is ready, I will schedule a callback to be called. Should a new frame be received before the main task gets time to collect the previous frame buffer, the driver will keep track of those missed frames. This will help the user dial in their code and camera settings for optimum performance.
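A rough sketch of that frame-ready flow, assuming a hypothetical dispatcher object fed by the C bus driver (micropython.schedule() is the real scheduler hook; everything else here is made up):

import micropython

class FrameDispatcher:
    def __init__(self, callback):
        self.callback = callback
        self.missed = 0       # frames that arrived before the last one was collected
        self._pending = False

    def on_frame(self, buf):
        # would be invoked by the C bus driver when a frame completes
        if self._pending:
            self.missed += 1  # main task hasn't collected the previous frame yet
            return
        self._pending = True
        micropython.schedule(self._run, buf)

    def _run(self, buf):
        self._pending = False
        self.callback(buf)    # user code gets the frame on the main task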

This will be a more involved effort because it needs a completely new driver that handles only the bus aspects of things. The Python side will be pretty easy, because all it does is pass memoryviews of data to the bus driver to be transmitted and decode the setting data received back. The decoding of frame buffer data will need to be done in C code for best performance.

The frame buffers will be memoryview objects in Python. The buffers themselves will be allocated in C code to allow for the greatest flexibility, so the user will be able to pick what kind of memory the frame buffers are stored in. This, once again, allows for the best possible performance.
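The memoryview side is easy to picture in plain Python; only the allocation itself (and the choice of internal SRAM vs. SPIRAM) needs the C code:

buf = bytearray(320 * 240 * 2)   # stand-in for a C-allocated frame buffer
frame = memoryview(buf)          # zero-copy view handed to Python code

line0 = frame[0:320 * 2]         # slicing a memoryview copies nothing
frame[0] = 0xFF                  # writes go straight through to the buffer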

There are all kinds of things that will be able to be tweaked.

Once that is done, writing new drivers will be a snap, because there will be a framework in place to handle all of the common tasks. This offers the greatest flexibility when written in Python: with subclassing you can override the default behavior of the framework, which allows for customization by the user, but it also lets the drivers control how things are done and make adjustments depending on the needs of the camera.

kdschlosser commented 5 days ago

> You mean something like this?

No, I don't, because that is a compile-time driver. It's not well suited to MicroPython, which is a runtime programming language. A user should not have to pick what hardware they are going to run before compiling the firmware; they should be able to pick the driver when the ESP32 starts up and runs MicroPython. Hardware-related connections would be written in C code, but the driver for a display/camera would be in Python, and that Python code interacts with the hardware connection driver to pass and receive data. The trick is organizing that data in a way that makes it work in Python code.

kdschlosser commented 5 days ago

And I also want to mention that the component you linked to only supports the P4, which I have not been able to find anywhere to buy.

cnadler86 commented 4 days ago

> I have been looking at the code for the esp32-camera component for the ESP32, and it should have been written in a similar fashion to the displays. […]

This is great!!! The plan seems straightforward. About the camera bus, I don't know either.

The only idea that came to mind here would be to consider using a class factory instead of only subclassing, because most of the sensors are similar but have small differences. For example, the OV2640 doesn't have denoise (and another setting I don't remember), but the OV5640 does; the saturation levels are a little different on the two sensors, meaning the min and max differ; each sensor only supports specific frame sizes and pixel formats; and so on. Having a class factory would give the ability to customize the driver for the specific sensor within defined boundaries and functions, so that every driver has the same look and feel while still accounting for every specific detail. Subclassing may end up at the minimal common denominator, while a factory creates the class based on the "DNA".

Also something to consider would be putting the interface parameters (e.g. board-specific pin assignments, which memory to use, how to handle the frame queue, etc.) in a file, for example a TOML.
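To sketch what I mean (the capability values below are illustrative, not exact sensor data, and all names are hypothetical):

class _SensorBase:
    def _write_reg(self, reg, value):
        pass   # would go over SCCB/I2C in a real driver

def make_sensor_class(name, caps):
    # Build a driver class from a per-sensor capability table ("DNA"),
    # which could just as well be loaded from a TOML file.
    def set_saturation(self, value):
        lo, hi = caps["saturation"]
        if not lo <= value <= hi:
            raise ValueError("saturation must be in [%d, %d]" % (lo, hi))
        self._write_reg("saturation", value)

    members = {
        "set_saturation": set_saturation,
        "FRAME_SIZES": caps["frame_sizes"],
    }
    if caps.get("denoise"):
        def set_denoise(self, level):
            self._write_reg("denoise", level)
        members["set_denoise"] = set_denoise   # only sensors that have it get it

    return type(name, (_SensorBase,), members)

OV2640 = make_sensor_class("OV2640",
    {"saturation": (-2, 2), "frame_sizes": ("QVGA", "VGA", "UXGA")})
OV5640 = make_sensor_class("OV5640",
    {"saturation": (-4, 4), "frame_sizes": ("QVGA", "VGA", "QSXGA"),
     "denoise": True})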

Let me know what you think about this. I could support/contribute the factory part of the driver in MicroPython; I haven't done this before, but I don't see why it shouldn't work out. Contributing to the driver in C... well, my comfort zone is definitely not C, although I learned a lot while writing my API. I could at least take a look at the code and be a second pair of eyes.

cnadler86 commented 4 days ago

> And I also want to mention that the component you linked to only supports the P4, which I have not been able to find anywhere to buy.

Yes, this is probably a work in progress. But they are at least starting to separate the camera bus from the rest. It was your last comment that gave me the idea. Thanks. And it totally makes sense 😀

kdschlosser commented 4 days ago

They do need to be pulled apart, so that the drivers can be extended without modifying the bus portion when you want to add a camera...

With respect to the camera drivers: if they are in Python code, there would be a base class called CameraBase or something of the like. The base class would implement finalizing the settings for the bus driver. Image settings would not be added to the base driver; those would be added in the camera driver, because not all cameras support the same image settings. The functions for the image settings would exist in the base class but simply raise NotImplementedError. In the code for a specific camera, the driver would override the base-class methods for the image settings it does support.
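As a minimal sketch of that pattern (names hypothetical):

class CameraBase:
    def __init__(self, bus):
        self.bus = bus

    def _write_reg(self, reg, value):
        pass   # would go over SCCB/I2C in a real driver

    # image-setting stubs: a camera driver overrides the ones its
    # sensor actually supports
    def set_contrast(self, value):
        raise NotImplementedError

    def set_denoise(self, level):
        raise NotImplementedError


class OV5640(CameraBase):
    def set_denoise(self, level):
        # the OV5640 does support denoise, so its driver overrides the stub
        self._write_reg(0x5308, level)   # hypothetical register write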

I have to read up on the camera ICs to see what kinds of things can be done. I don't think that everything has been made available in the esp32-camera component. One of the things I would like to get is a list of all of the supported resolutions.

From what I have read so far, the ESP32-S2 and ESP32-S3 support DVP, SPI, and USB cameras, and the P4 adds CSI to that list.

The S2 and S3 support 8- or 16-lane DVP, but the esp32-camera component only supports 8 lanes, and it only supports DVP cameras as well. The bus needs to be separated from the camera portion of the driver: a lot of cameras support more than one type of bus, so there needs to be a way to attach the camera driver to a different bus driver.

kdschlosser commented 4 days ago

If you have Linux running, either in a VM, in WSL, or directly, you can give that display library I wrote a test drive.

clone this repo

https://github.com/lvgl-micropython/lvgl_micropython

then run the following commands (this is for Ubuntu):

cd lvgl_micropython

sudo apt-get install build-essential libffi-dev pkg-config cmake ninja-build gnome-desktop-testing libasound2-dev libpulse-dev libaudio-dev libjack-dev libsndio-dev libx11-dev libxext-dev libxrandr-dev libxcursor-dev libxfixes-dev libxi-dev libxss-dev libxkbcommon-dev libdrm-dev libgbm-dev libgl1-mesa-dev libgles2-mesa-dev libegl1-mesa-dev libdbus-1-dev libibus-1.0-dev libudev-dev fcitx-libs-dev libpipewire-0.3-dev libwayland-dev libdecor-0-dev

python3 make.py unix DISPLAY=sdl_display INDEV=sdl_pointer

That will take care of building the library.

Once compiling is finished, make the compiled binary executable using the command shown on the last line of the build output.

Place the attached file into the same folder as the binary, remove the .txt extension from the filename, and run the following command:

sdl_test.py.txt

./lvgl_micropy_unix sdl_test.py

You can look at the code in the attached file so you can get an idea of how it work. It supports SPI, I8080 and RGB bus connections and any display driver is able to be used with any bus. I wrote in touch drivers that work in a similar fashion. I reworked the MicroPython SPI driver so the SDCard is able to be on the same bus as a display and or a touchscreen. You no longer have to manage the CS lines for SPI devices and there is no need to reinitialize the bus if you want to change to sending/receiving from a different device. It's all handled in C code. The display drivers don't really care what the bus driver is because the bus drivers almost have an identical API so all of the function names are the same and require the passing of the same parameters to them.