l1npengtul / nokhwa

Cross Platform Rust Library for Powerful Webcam/Camera Capture
Apache License 2.0
521 stars 132 forks

0.11 Roadmap #86

Open l1npengtul opened 1 year ago

l1npengtul commented 1 year ago

FF Rework in 500ba629

RReverser commented 1 year ago

Would love to see the async camera. I see there is some progress on the main branch, but will wait for the release.

OlivierLDff commented 1 year ago

I would love to see 0.11 released so I can get in sync with it in my projects. My recommendations after tinkering a bit with your library:

l1npengtul commented 1 year ago

I agree with your changes, but the only issue with completely dropping async is that I want nokhwa to be usable in a WASM environment, so keeping async around would be useful for that purpose.

OlivierLDff commented 1 year ago

I didn't say to drop async, just to keep it out of the bindings* crates. I'm not familiar with WASM environments, so I don't know what the limitations are there.

RReverser commented 1 year ago
  • Drop the decode part.

I strongly disagree with this; it would make the crate much less useful. Most users don't care about which decoder to use, and besides, there are more formats than JPEG. We just want to get RGB no matter how it's achieved.

If someone really needs flexibility, decoding could be an optional feature, but IMO it's one of those things that should be included.

l1npengtul commented 1 year ago

I'm trying to move to a more flexible decoder interface. I still say that JPEG should stay as it is: a default feature.

OlivierLDff commented 1 year ago

I see that the senpai branch wants to support VP8, VP9, H.264, ... That seems like a lot of stuff to maintain in the future.

l1npengtul commented 1 year ago

Only capture - decoding will be on the user. This is why I am redoing how decoding is done.

OlivierLDff commented 1 year ago

https://github.com/l1npengtul/nokhwa/blob/cab61c5750085df9ebb812269caf05f48d14b51d/nokhwa-core/src/types.rs#L1423-L1425 I can see that each newly supported format seems harder and harder ^^. On my side I'm not using the decoding part of the library, since I might get images from GenICam cameras too.

Decoding is done on my side with a few lines:

In the end, putting that in user code is just a match, not that long. The fact that you want RGB only matters if you want to modify the image or display it to the user immediately without sending it over the network. In those cases you already have some kind of library that can do color conversion. And with Rust magic, JPEG decoding is just picking a library and two lines of code.
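For readers following along, the "just a match" approach could look roughly like the sketch below. This is not the code from the comment above (which was not preserved) and not nokhwa's API: `PixelFormat`, `DecodeError`, and `decode_to_rgb` are hypothetical names, the YUYV branch assumes BT.601 limited-range data, and MJPEG is left unsupported because a real decoder would come from an external JPEG crate.

```rust
/// Hypothetical pixel formats a capture backend might report.
/// Illustrative only; these are not nokhwa's `FrameFormat` variants.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum PixelFormat {
    Rgb24,
    Yuyv,
    Mjpeg,
}

/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq, Eq)]
pub enum DecodeError {
    Unsupported(PixelFormat),
    BadLength,
}

/// Convert one YUV sample to RGB with integer math
/// (assumes BT.601 limited range: Y in 16..=235, U/V centered on 128).
fn yuv_to_rgb(y: u8, u: u8, v: u8) -> [u8; 3] {
    let c = i32::from(y) - 16;
    let d = i32::from(u) - 128;
    let e = i32::from(v) - 128;
    let clamp = |x: i32| x.clamp(0, 255) as u8;
    [
        clamp((298 * c + 409 * e + 128) >> 8),
        clamp((298 * c - 100 * d - 208 * e + 128) >> 8),
        clamp((298 * c + 516 * d + 128) >> 8),
    ]
}

/// Decode a raw frame to packed RGB24 by matching on its format.
pub fn decode_to_rgb(format: PixelFormat, data: &[u8]) -> Result<Vec<u8>, DecodeError> {
    match format {
        // Already RGB: only validate the length and copy.
        PixelFormat::Rgb24 => {
            if data.len() % 3 != 0 {
                return Err(DecodeError::BadLength);
            }
            Ok(data.to_vec())
        }
        // YUYV packs two pixels into four bytes: Y0 U Y1 V.
        PixelFormat::Yuyv => {
            if data.len() % 4 != 0 {
                return Err(DecodeError::BadLength);
            }
            let mut out = Vec::with_capacity(data.len() / 4 * 6);
            for chunk in data.chunks_exact(4) {
                let (y0, u, y1, v) = (chunk[0], chunk[1], chunk[2], chunk[3]);
                out.extend_from_slice(&yuv_to_rgb(y0, u, v));
                out.extend_from_slice(&yuv_to_rgb(y1, u, v));
            }
            Ok(out)
        }
        // Compressed formats would be handed to an external decoder crate.
        other => Err(DecodeError::Unsupported(other)),
    }
}
```

Each new format adds one arm to the match, which is roughly the trade-off being debated in this thread: short in user code, but every application has to carry it.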

RReverser commented 1 year ago

The fact that you want RGB only matters if you want to modify the image or display it to the user immediately without sending it over the network.

Well, yes. Or re-encode into a more efficient format.

In the end, putting that in user code is just a match, not that long.

You could say the same about the library as a whole. "Just do match on different webcam backends."

The main value of this library is that it abstracts all platforms and formats for you, so you just get an image you can analyse and manipulate easily. If you want lower-level access, IMO you should use the relevant OS APIs directly, but for high-level use-cases being able to abstract all of this away and not worry about cross-platform or format compatibility is extremely valuable.

OlivierLDff commented 1 year ago

I would argue that abstracting the UVC camera backend and abstracting the image format should be two different crates, splitting the maintenance cost in two.

If you take a library like https://github.com/cameleon-rs/cameleon that supports 266 formats, it would only hurt the maintainer to try to support every format.

What I see is that by trying to support too many feature combinations, it becomes really hard to maintain and iterate fast without breaking things. I would love to help you, @l1npengtul, maintain this crate and stabilize the API, since I want to use it in production. But we need a clear roadmap if you want any help.

Again, I haven't looked at how the senpai branch deals with decoders; the issue might already be solved.

The main value of this library is that it abstracts all platforms and formats for you

To finish, I will say this is the difference between a framework and a library. A library should do one thing well, with a small API. A framework tries to unify it all. Building a framework is hard and takes a lot of engineering effort. This is definitely not the way to go when a project only has a single maintainer ;)

RReverser commented 1 year ago

Again, I haven't looked at how the senpai branch deals with decoders; the issue might already be solved.

I mean, it kind of is, at least for popular formats, and it works well enough on the 0.10 branch. That's why I'm against breaking that functionality.

And you keep saying I could just use a different crate, but it's not like there is an easy alternative, because crates like image or image2 don't cover formats specific to webcams, and monstrosities like ffmpeg could but they are too large and, well, they already support webcam capture on their own, so if someone wants to use them, they could already.

UVC separately was already solved in libuvc (which has Rust bindings by the same author), but, and I know I'm repeating myself, the main value of this crate is precisely in the high-level API that allows app developers to get images from any webcam without worrying about testing different platforms or decoding vendor-specific raw data.

If this is just a proposal to split crates and make decoding optional, that's one thing, but breaking existing functionality and making the library return raw data, leaving the hard parts (dealing with a multitude of vendors and formats) to app developers, would significantly cripple the usefulness of the library.

l1npengtul commented 1 year ago

I'm going to step in here. While I do agree with @OlivierLDff that adding more decoder support would increase the maintenance burden, we should not break existing functionality. Nokhwa exists to let Rust users simply forget about their OS/environment and simply get frames, as @RReverser says. The ideal solution is to keep MJPEG and YUV/NV12, since those are the most common formats that most people care about, and then add a flexible decoder interface to let people create their own decoders (e.g. VP9). (JPEG should be kept as-is: behind a feature flag, but on by default on native platforms.)

As for the 266 formats, there is a difference between FrameFormat support and decoder support. Adding a FrameFormat is pretty simple, as it's usually a platform-specific FourCC code away, while the actual complexity comes from the decoder implementation.
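To illustrate why adding a format is cheap compared to adding a decoder, here is a minimal sketch of a FourCC-to-format mapping. It assumes the V4L2 convention of packing the four ASCII characters into a little-endian `u32`; `Format`, `fourcc`, and `format_from_fourcc` are hypothetical names and not nokhwa's actual types.

```rust
/// Hypothetical format enum for this sketch (not nokhwa's `FrameFormat`).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Format {
    Mjpeg,
    Yuyv,
    Nv12,
}

/// Pack a four-character code into a u32, first character in the lowest byte
/// (the packing V4L2 uses for its pixel format codes).
pub const fn fourcc(code: &[u8; 4]) -> u32 {
    u32::from_le_bytes(*code)
}

/// Map a packed FourCC to a known format, or `None` if it is not recognized.
/// Supporting another format here is a single extra match arm.
pub fn format_from_fourcc(code: u32) -> Option<Format> {
    match &code.to_le_bytes() {
        b"MJPG" => Some(Format::Mjpeg),
        b"YUYV" => Some(Format::Yuyv),
        b"NV12" => Some(Format::Nv12),
        _ => None,
    }
}
```

Recognizing a format this way says nothing about turning its bytes into pixels, which is where the decoder work mentioned above comes in.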

OlivierLDff commented 1 year ago

I get the value of having decoders. But I think a stricter separation between capture and decoding would be nice. It is kind of disturbing that you need to think about the decoder format if you just want to open a camera. And maybe the YUV/NV12 code can be refactored to rely on https://github.com/aws/dcv-color-primitives to delegate complexity and maintenance to an external crate.

RReverser commented 1 year ago

And maybe the YUV/NV12 code can be refactored to rely on aws/dcv-color-primitives to delegate complexity and maintenance to an external crate.

That looks promising.

l1npengtul commented 1 year ago

I pushed a commit that updates how decoding is done. In short: there is a new Decoder trait that lets decoders specify their output and compatible inputs, allowing for easy decoding and implementation of custom decoders.
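As a rough idea of what such an interface can look like, the sketch below defines a trait with an associated output type and a list of accepted input formats. This is an assumption-laden illustration and not the Decoder trait from the commit: `SourceFormat`, `FrameDecoder`, `INPUTS`, `accepts`, and `GrayToRgb` are all hypothetical names.

```rust
/// Hypothetical input formats for this sketch.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SourceFormat {
    Gray8,
    Rgb24,
    Mjpeg,
}

/// Hypothetical decoder interface: each decoder declares its output type
/// and the input formats it is compatible with.
pub trait FrameDecoder {
    /// What the decoder produces (e.g. a byte buffer or an image type).
    type Output;
    /// Input formats this decoder can handle.
    const INPUTS: &'static [SourceFormat];

    /// Decode `data`, or return `None` if the format is not supported.
    fn decode(&self, format: SourceFormat, data: &[u8]) -> Option<Self::Output>;

    /// Check compatibility before attempting to decode.
    fn accepts(&self, format: SourceFormat) -> bool {
        Self::INPUTS.contains(&format)
    }
}

/// Example custom decoder: expands grayscale to RGB24, passes RGB24 through.
pub struct GrayToRgb;

impl FrameDecoder for GrayToRgb {
    type Output = Vec<u8>;
    const INPUTS: &'static [SourceFormat] = &[SourceFormat::Gray8, SourceFormat::Rgb24];

    fn decode(&self, format: SourceFormat, data: &[u8]) -> Option<Vec<u8>> {
        match format {
            // Repeat each gray value for the R, G, and B channels.
            SourceFormat::Gray8 => Some(data.iter().flat_map(|&g| [g, g, g]).collect()),
            SourceFormat::Rgb24 => Some(data.to_vec()),
            _ => None,
        }
    }
}
```

With a shape like this, a user who needs an unusual format (VP9, say) can implement the trait in their own crate instead of waiting for the library to grow another built-in decoder.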

Tremoneck commented 8 months ago

Have you considered returning an ImageBuffer from the image crate from your decoding functions? This would make working with the decoded image clearer while also making the types more expressive than a Vec.

l1npengtul commented 1 week ago

I will try my best to release at least a beta for 0.11 by the end of this year. Sorry everyone for the long wait, but I finally feel motivated to work on this project again.

zabackary commented 5 days ago

Thank you so much! Good luck on working on it! This is really the best webcam library available for Rust right now and the API is quite easy to use. Don't worry about the long wait; it's great that you're willing to maintain this again (and, since this is open source, there's not really a firm obligation for you to do so anyways, though it's of course nice for the library users).

If it helps with motivation, I suppose I can share my use case: I'm using/used this library to create a custom-designed photo booth (much cheaper than other solutions!) for my school's festival. It really helps to be able to have something like this where I can write it on Linux and if I have to use Windows to run the final product then it will simply work.

Just one question: Is there any reason why the main branch is called senpai? It's much more playful (in a good way) than simply main, but it is a little confusing. Also, would it help to send in PRs to try to, for example, get some simple compiler errors fixed in senpai?

l1npengtul commented 5 days ago

Just one question: Is there any reason why the main branch is called senpai? It's much more playful (in a good way) than simply main, but it is a little confusing.

The "too preoccupied with whether I could that I didn't stop to consider whether I should" classic.

Also, would it help to send in PRs to try to, for example, get some simple compiler errors fixed in senpai?

Not "simple" anymore: the internal APIs are getting refactored and senpai won't build for a while. For example, Capture isn't a thing anymore, in service of making it easier to implement variations of the main Camera struct, e.g. AsyncCamera.