luxonis / depthai

DepthAI Python API utilities, examples, and tutorials.
https://docs.luxonis.com
MIT License

UVC Output as a Pipeline Node #283

Open · Luxonis-Brandon opened this issue 3 years ago

Luxonis-Brandon commented 3 years ago

Start with the why:

In some cases it may be desirable to expose the camera output over UVC (USB Video Class, i.e. as a standard webcam) for compatibility with existing software stacks. This provides instant, drop-in integration, particularly when working with closed-source (or hard-to-modify) tools that already accept UVC-compatible inputs.

This becomes even more powerful when the UVC output is a node in the pipeline: instead of only the raw video feed being output as a UVC stream, any video stream in the pipeline can be output. Take, for example, using an object detector to guide digital PTZ (https://github.com/luxonis/depthai/issues/135) and then outputting the panned/tilted/zoomed stream directly over UVC.

This way, the UVC output would appear to a computer as if someone is actually just moving a camera around.

Move to the how:

An initial way to support this is to keep XLink as-is (i.e. used for control and high-speed data transfer back and forth), and use one of the three available high-speed endpoints as a UVC endpoint. This would allow one UVC output.

A longer-term solution may be to develop XLink support over the USB control endpoint (working at the message level: host request -> device reply), which would then not consume a high-bandwidth pipe (i.e. one of the dedicated endpoints) for control. Internal notes here.

In this case, instantiating 3x UVC outputs may be possible.

To start with, though, it would be easiest to support a single UVC output, so this will likely be the initial approach.

Move to the what:

Implement a UVC output node in the Gen2 Pipeline Builder (https://github.com/luxonis/depthai/issues/136). This will allow the output of any node that produces video to be streamed over USB Video Class to a USB host.
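For illustration, a rough sketch of how such a pipeline could look with the Python API is below. This is a hypothetical example, not the final API: the createUVC() factory call and the way the stream is kept alive are assumptions based on the usual gen2 pipeline pattern.

    import time
    import depthai as dai

    # Hypothetical sketch: expose a color camera's video output over UVC.
    pipeline = dai.Pipeline()

    cam = pipeline.createColorCamera()
    cam.setResolution(dai.ColorCameraProperties.SensorResolution.THE_1080_P)

    uvc = pipeline.createUVC()      # assumed UVC node factory (not yet in mainline)
    cam.video.link(uvc.input)       # any video-producing output could be linked here instead

    with dai.Device(pipeline):
        # The device now enumerates as a UVC webcam on the host; keep the pipeline alive.
        while True:
            time.sleep(1)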

Luxonis-Brandon commented 3 years ago

We have this initially working now. We need to see if the code is usable externally yet.

dhruvmsheth commented 3 years ago

Eager to test it out

Luxonis-Brandon commented 3 years ago

This should work on Linux, but may have issues on other OSes: https://github.com/luxonis/depthai-python/blob/gen2_uvc/examples/19_uvc_video.py

python3 -m pip install depthai==0.0.2.1+e70becfe1c1908c6148f842006b9986ebab39cab --extra-index-url https://artifacts.luxonis.com/artifactory/luxonis-python-snapshot-local

dhruvmsheth commented 3 years ago

Thanks! Will report how it goes

ORippler commented 3 years ago

Works for me on an x64 machine under Ubuntu 18.04 with the above-linked snapshot + Python file. Tested it out with Zoom.

dhruvmsheth commented 3 years ago

edit: (Glitches). Works now

cafemoloko commented 3 years ago

@Luxonis-Brandon can we close this issue as it seems resolved?

Luxonis-Brandon commented 3 years ago

So let's leave it open for now as it's not formally integrated with DepthAI so far (I think). Actually, let's see what @alex-luxonis has to say on this. He knows better than me.

mx1up commented 3 years ago

I can confirm it works as expected with OBS Studio. I will try it out for streaming. I commented out https://github.com/luxonis/depthai-python/blob/e70becfe1c1908c6148f842006b9986ebab39cab/examples/19_uvc_video.py#L15 in order to have autofocus. This sample/code does not seem to be integrated into the main distribution yet.

mx1up commented 3 years ago

@Luxonis-Brandon which would be the most efficient video format: YUV420, BGR3 or YV12? (or maybe it does not really matter for the device, but rather which format my video card can render fastest?)

Luxonis-Brandon commented 3 years ago

Thanks for the test and information here @mx1up !

As for which is the most efficient, I'm actually not sure; I'm curious to know as well. I also don't know whether it would be host/video-card dependent (but great question/point). @alex-luxonis is probably the most likely among us to know on both counts.

alex-luxonis commented 3 years ago

At the moment only the YUV420 (YU12) format is exposed. Some programs like guvcview show those additional formats; I'm not sure if that's an issue with those programs or somehow caused by the USB descriptor. It's listed properly by this command: v4l2-ctl -d /dev/video0 --list-formats

ioctl: VIDIOC_ENUM_FMT
    Type: Video Capture

    [0]: 'YU12' (Planar YUV 4:2:0)

It is not merged into mainline yet, as we don't have a clean way to expose the UVC interface only when the UVC node is added to the pipeline. With the depthai version linked here, the UVC device always shows up on the host, but it is operational only when a UVC node is linked in the pipeline. (It's otherwise not harmful to the normal operation of the XLink VSC interface for pipelines not using UVC, but it might be confusing.)

For a clean integration, we need these (WIP, I think @themarpe is looking into it):

mx1up commented 3 years ago

Thanks for the explanation. FYI: OBS Studio does mention "(emulated)" after BGR3 and YV12. I did some simplistic performance measurements (both windowed and fullscreen, on an Intel chipset) using vmstat, and it seems YUV420 and YV12 perform more or less the same, while BGR3 is definitely slower.

for example: BGR3 (emulated)

 us  sy  id  wa  st
  5   3  92   0   0
  5   3  92   0   0
  6   4  90   0   0
  6   2  92   0   0
  6   4  90   0   0
  6   3  91   0   0
  6   4  90   0   0

YUV420

 us  sy  id  wa  st
  2   3  94   0   0
  3   4  92   0   0
  2   5  93   0   0
  2   4  94   0   0
  3   4  93   0   0
  3   5  92   0   0
  3   4  93   0   0

YV12 (emulated)

 us  sy  id  wa  st
  2   3  94   0   0
  3   4  92   0   0
  2   5  93   0   0
  2   4  94   0   0
  3   4  93   0   0
  3   5  92   0   0
  3   4  93   0   0

Luxonis-Brandon commented 3 years ago

Oh that's neat! Thanks for the data-points here.

TannerGilbert commented 3 years ago

@Luxonis-Brandon is https://github.com/luxonis/depthai-python/blob/gen2_uvc/examples/19_uvc_video.py still the newest version available and is there already an implementation for the depth output?

alex-luxonis commented 3 years ago

Yes, that is still the latest for now. But soon we'll start working on it again and integrate it into mainline.

About depth: it may be possible for now to link other streams to uvc.input, but the UVC descriptor is hardcoded to 1920x1080 NV12 (12 bits per pixel), so the host could reject the frames (uvcvideo in newer Linux kernels may do that; configuring it with nodrop=1 as here could be a workaround, but that would require custom frame decoding). We'll also properly configure the UVC descriptor to match the frame format.
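For reference, a hypothetical (untested) sketch of linking a disparity stream into uvc.input is below; the createUVC() factory is assumed as in the earlier sketch, and the frame-format caveat above still applies, so the host may reject these frames:

    import depthai as dai

    pipeline = dai.Pipeline()

    mono_left = pipeline.createMonoCamera()
    mono_right = pipeline.createMonoCamera()
    mono_left.setBoardSocket(dai.CameraBoardSocket.LEFT)
    mono_right.setBoardSocket(dai.CameraBoardSocket.RIGHT)

    stereo = pipeline.createStereoDepth()
    mono_left.out.link(stereo.left)
    mono_right.out.link(stereo.right)

    uvc = pipeline.createUVC()        # assumed UVC node factory
    # Disparity frames won't match the hardcoded 1080p NV12 descriptor,
    # so the host's uvcvideo driver may drop or reject them.
    stereo.disparity.link(uvc.input)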

TannerGilbert commented 3 years ago

@alex-luxonis Thanks for the quick reply. I'm looking forward to the integration.

AChangXD commented 3 years ago

@alex-luxonis @Luxonis-Brandon What is the timeline for this working on Windows?

Luxonis-Brandon commented 3 years ago

So we haven't done anything more on it directly, but we've done a ton of bootloader work, which is the crux of getting it to work properly. So I don't know when the Windows setup will work better. That said, if you wanted, you could get an OAK-SoM-IoT, install it on an OAK-D, and have it pre-flashed with a UVC pipeline so that it works with Windows now, @AChangXD: https://shop.luxonis.com/collections/iot

Or, simpler, you could get one of these and flash it with a UVC pipeline; it will boot up right away as a UVC device and so will work fine with Windows as well.

Once the underpinnings are there (the total rework of the bootloader), I'm just not sure yet how long the refactoring of the UVC node will take to get Windows to accept it. The flashed-device route will definitely take care of it though, and works now.
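For the flashed-device route, a rough sketch using the DeviceBootloader API could look like the following; this assumes a pipeline variable built with a UVC node (as sketched earlier), and the exact calls may differ between depthai versions:

    import depthai as dai

    # 'pipeline' is assumed to be a dai.Pipeline containing a UVC node, as sketched earlier.
    found, dev_info = dai.DeviceBootloader.getFirstAvailableDevice()
    if not found:
        raise RuntimeError("No device found in bootloader-accessible state")

    with dai.DeviceBootloader(dev_info) as bl:
        # Flash the application so the device boots straight into the UVC pipeline.
        progress = lambda p: print(f"Flashing progress: {p * 100:.1f}%")
        bl.flash(progress, pipeline)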

Thoughts?

Thanks, Brandon

aghobrial commented 3 years ago

Is there any chance the UVC branches could be updated to work on PoE OAKs?

themarpe commented 3 years ago

@afakhry01 PoE OAKs aren't a priority for UVC integration, but it could still be done. Right now we are in the process of making this work seamlessly for USB models. Afterwards this could be looked into as well.

That said, the use case isn't quite there for current PoE models, which don't have USB exposed, in the way it is for other models. If you do access the internals of the device to get to the USB connector, you can use it as a USB device by setting the boot switches to 0x16. But this is a bit out of scope for regular usage, and I only advise doing it with caution. Also, flashing the UVC pipeline in this case won't work as expected.

ES-Alexander commented 2 years ago

Will/does the UVC node/pipeline support common camera settings through v4l2-ctl? And even better, any chance it'll support changing model parameters once it's in its final form (e.g. setting disparity confidence for a depth node)? This would make configurable deployment to users much simpler, since it could present as a normal camera with the processing pipeline pre-loaded and relevant settings tuneable, with no required programming/running code on the users' part.

themarpe commented 2 years ago

CC: @alex-luxonis on UVC capabilities in that regard.

@ES-Alexander this is a good suggestion - it should be doable to create this with our platform, and it is what we strive towards. That said, we might also make this kind of model ourselves - with flash onboard, pre-flashed with full-blown UVC capabilities, so it can be used directly in other software, along with maybe modifying some options using the before-mentioned v4l2-ctl.

Let's keep this debate going - what is the usual case for disparity/depth map consumption in other software that takes UVC?

ES-Alexander commented 2 years ago

@themarpe

what is the usual case of disparity/depth map consumption in other software that take UVC?

I'm not sure of anything that meaningfully would take depth through UVC - I was more thinking that a plug-and-play product with a depth-aware filtering pipeline could be useful, but as with all filters it would be nice to be able to tweak some parameters while using it (in the same way that it's often useful to tweak normal camera brightness + contrast and other settings, to set up profiles for a set of operating conditions). It's of course possible to create the plug-and-play nature with a custom setup that allows setting both standard UVC controls and separate model controls, and handles the details under the hood, but that requires custom software running on the user's computer, whereas there are already existing tools for controlling UVC parameters that would be able to auto-detect new ones if we were able to add them.

The main benefit is easier integration for products that don't need much modification or a high-speed response from the host device, but which would benefit from the user being able to tweak some settings.

It may also be possible to do some kind of hybrid setup, where a program sets up the pipeline (and optionally provides any required high-speed responses to the camera), but user settings are exposed as UVC settings, in which case custom code is being run but at least an interface doesn't need to be developed. That also solves the 'no flash memory on OAK-D-Lite' issue.

Another alternative is making a standardised depthai settings API, and some software to interface with it, which is probably a good idea in itself (could be completely unrelated to UVC).

ES-Alexander commented 2 years ago

products that don't need much modification or a high-speed response from the host device, but which would benefit from the user being able to tweak some settings.

An easy example of this would be for PTZ amounts (see #135). Ideally that would be supported out of the box if doing a UVC boot, since it's functionality that anybody using a depthai camera as just a camera would benefit from.

themarpe commented 2 years ago

@ES-Alexander good points.

Regarding

I was more thinking that a plug-and-play product with a depth-aware filtering pipeline could be useful

So a webcam with background blurring, in a sense?

The main benefit is easier integration for products that don't need much modification or a high-speed response from the host device, but which would benefit from the user being able to tweak some settings.

An easy example of this would be for PTZ amounts

With the above-mentioned examples, the main issue I see is that they might be too specific. As our platform is very versatile, this kind of default behavior might suit very few people. And regarding configurability through UVC: as soon as a third-party application is necessary to configure those at runtime, it could just as well be DepthAI, where all configurability is possible. But I do agree that it's sometimes easier to just use existing software to control things.

The way I see it, if we do go forward with this, is to expose the "base" of the device - all camera streams, and maybe depth as well (if it can be consumed by certain apps, or maybe even by OpenCV, ...) - with very basic configurability through UVC options that aren't too specific.

For more specific use cases, one could flash (on models with flash onboard) a custom DepthAI application, which could expose any kind of data, control any kind of mode/settings via the UVC settings, and manage the pipeline nodes by sending configuration messages from a Script node.

Thoughts?

ES-Alexander commented 2 years ago

I was more thinking that a plug-and-play product with a depth-aware filtering pipeline could be useful

So a webcam with background bluring in a sense?

Yeah sure, that's a good example :-)

With above mentioned examples the main issue I see is that they might be too specific. As our platform is very versatile, this kind of default behavior might suit very few people.

I'm not recommending that all depthai devices get shipped with a set model on them; I'm saying it could be nice if it were possible to configure a running model through UVC controls - especially for deployed use cases where someone makes a model, loads it onto a suitable device, and then provides that device to someone else who only needs it for that purpose. In that case, accessing it as a normal camera, but with extra settings for the programmed-in extra functionality, could be handy :-)

maybe add depth as well (if possible to consume by certain apps, or maybe even by opencv, ...)

Not what I was going for, but definitely an interesting idea :-)

In case of having more specific examples running, one could flash (on models with flash onboard) a custom DepthAI application, where they could expose any kind of data and control any kind of mode / settings using the UVC settings and then managing the pipeline nodes by sending configuration messages using a Script node.

Yeah, this sounds like the kind of thing I was thinking of. It makes sense that it would require a Script node to process and handle the UVC settings requests, but more generally I just wasn't sure how feasible it would be to actually get information from incoming UVC controls to somewhere useful outside the UVC node, so I thought I'd ask whether it would be a possibility (hence my original comment) :-)

themarpe commented 2 years ago

@ES-Alexander I see - you are basically suggesting having the capability to flash the device with a UVC-enabled DepthAI application, rather than necessarily one supplied with the device already.

In that case, that overlaps nicely - with flash added onboard, both cases could actually be covered.

We are discussing internally adding a small amount of flash to non-IoT devices as well, to provide a seamless UVC use case as well as the capability of flashing your own application. TBD.

so thought I'd ask about whether it would be a possibility

Yeah, I presume many things could already be done currently (not counting the general UVC node support): if we send a generic UVC control message from the UVC node, you could link that into a Script node, parse it as you see fit, and issue config messages from there - reconfiguring ImageManip, for instance, or changing some StereoDepth parameters, etc.
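As a purely illustrative sketch of that Script-to-config pattern: since the UVC control output doesn't exist yet, the control messages below come from the host via an XLinkIn node instead, and the message layout is made up for illustration; the crop-reconfiguration idea follows the digital-PTZ example above.

    import depthai as dai

    pipeline = dai.Pipeline()

    # Host-side control input standing in for a (hypothetical) UVC control output.
    ctrl_in = pipeline.createXLinkIn()
    ctrl_in.setStreamName("ctrl")

    script = pipeline.createScript()
    ctrl_in.out.link(script.inputs['ctrl'])

    manip = pipeline.createImageManip()
    script.outputs['cfg'].link(manip.inputConfig)
    # manip's image input/output would be linked into the video path (e.g. towards UVC).

    # On-device script: parse a small control buffer and re-issue it as an ImageManipConfig,
    # e.g. a normalized crop rectangle for the digital-PTZ idea.
    script.setScript("""
        while True:
            buf = node.io['ctrl'].get()
            x1, y1, x2, y2 = [b / 255.0 for b in buf.getData()[:4]]
            cfg = ImageManipConfig()
            cfg.setCropRect(x1, y1, x2, y2)
            node.io['cfg'].send(cfg)
    """)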