torinmb / mediapipe-touchdesigner

GPU Accelerated MediaPipe Plugin for TouchDesigner
MIT License
787 stars 37 forks

Image Resolution and Flop Parameter #77

Closed: joansc closed this issue 7 months ago

joansc commented 7 months ago

Hello all,

First of all, thanks for this amazing plugin!

It would be nice to have a parameter on the component that lets you choose the resolution. Even better would be separate settings for the detection resolution and the resolution of the output camera image. As an example, I'm currently trying an AR filter idea with a Full HD webcam (I also want to try a 2K webcam). It would be nice if, for instance, detection ran on a lower-resolution image such as 1280x720 while the original Full HD image from the webcam was still available as a TOP (right now you are constrained to 1280x720, with the same resolution for both detection and the output TOP image).

Another handy parameter would be a flop parameter. Coming back to the AR filter idea: to achieve a similar "selfie" ratio, I am manually flipping the webcam so the resolution would be 1080x1920, although the camera is still detected as 1920x1080. I have observed that detection (face landmarks) is less accurate when it is done on a flipped image. It would be cool if, with this Flop toggle, the detection were done on the 1080x1920 image.

Many thanks,

Joan

domisjustanumber commented 7 months ago

Hi @joansc thanks for the kind words 😄

The answer to your questions is - yes you can! Although a little work is needed.

Firstly - MediaPipe was trained on webcam type images at a pretty low resolution (most are 256x256, some are 512x512), so in order for it to work well and get the best tracking, the input images need to be approximately the same shape and size. I fixed it at 1280x720 as that seems like a good compromise between useable output and not too much bigger than the source images.

If you want to use 2K or 4K or whatever size camera images you can also do that using the "Virtual webcam" chain in the example project, although when I was looking at it yesterday I realised it needed a bit of modification to do what you are talking about. I pushed some changes to the example project that will be in the next release to make it easier to use, but the general idea is:

You then run SpoutCam (or OBS if you're on Mac, sorry, it's more complicated) to receive the 720p Spout stream and present it to MediaPipe as a virtual webcam.

MediaPipe can then process a 720p video that it is expecting, and you can use the output of the Cache TOP to re-sync the MediaPipe data with your original video feed.
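The re-sync idea above amounts to holding the original frames back by as many frames as the tracking data lags behind. As a rough illustration only (this is plain Python, not the Cache TOP or any TouchDesigner API; the `FrameDelay` class and the 4-frame delay are hypothetical placeholders), a fixed frame delay could be sketched like this:

```python
from collections import deque


class FrameDelay:
    """Hold frames back by a fixed number of frames (a stand-in for a cache/delay stage)."""

    def __init__(self, delay_frames):
        if delay_frames < 0:
            raise ValueError("delay_frames must be >= 0")
        self.delay_frames = delay_frames
        self._buffer = deque()

    def push(self, frame):
        """Store the newest frame; return the frame from delay_frames ago, or None while filling."""
        self._buffer.append(frame)
        if len(self._buffer) > self.delay_frames:
            return self._buffer.popleft()
        return None


# Assumption: the tracking data arrives 4 frames late, so delay the video by 4 frames.
delay = FrameDelay(delay_frames=4)
delayed = [delay.push(frame_index) for frame_index in range(8)]
print(delayed)
```

Here integers stand in for video frames. The first four calls return `None` while the buffer fills; after that each call returns the frame that was pushed four calls earlier, which is the frame the late-arriving tracking data belongs to.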

Once your original camera feed and the MediaPipe data is synchronised back up, you can then crop/rotate/flip/flop the image and data however you want.
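To make the last step concrete, here is a minimal sketch of flopping tracking data and mapping it onto a larger image. It assumes the landmarks are available as normalized (x, y) pairs in the 0..1 range, as MediaPipe reports them, and that the detection stream and the original feed share the same aspect ratio (1280x720 and 1920x1080 are both 16:9). The helper names `flop_x` and `to_pixels` and the sample landmark values are made up for illustration and are not part of the plugin.

```python
def flop_x(landmarks):
    """Mirror normalized (x, y) landmarks horizontally."""
    return [(1.0 - x, y) for x, y in landmarks]


def to_pixels(landmarks, width, height):
    """Scale normalized (x, y) landmarks to pixel coordinates of a width x height image."""
    return [(x * width, y * height) for x, y in landmarks]


# Hypothetical landmarks detected on the 1280x720 stream, in normalized coordinates.
landmarks = [(0.25, 0.5), (0.75, 0.125)]

# Because the values are normalized, the same points map directly onto the 1920x1080 original.
full_hd = to_pixels(flop_x(landmarks), 1920, 1080)
print(full_hd)
```

The point of the sketch is that normalized coordinates do not care which resolution detection ran at, so the flop and the upscale to the full-resolution image can both happen after tracking.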

joansc commented 7 months ago

Thanks for the detailed response @domisjustanumber! Unfortunately, I am not considering the "virtual webcam" chain, because in the tests I have done so far on different computers (NVIDIA RTX and so on), SpoutCam introduces a delay of about 250 ms between the detection and the camera image, which makes AR filters unusable. This delay does not happen when accessing the camera directly from JavaScript. That's why I was wondering if all these changes could be done in the JS code...

Thanks for your time,

Joan

domisjustanumber commented 7 months ago

Hmm, that's interesting. On my Windows machine I'm only getting a 1 frame delay with SpoutCam (MediaPipe adds another 3 frames minimum, plus processing time). On something like an RTX 3070 the processing time is usually less than 1 frame, so a total of <4 frames latency on the whole setup.

If you're seeing 250ms of delay with SpoutCam, I'd make sure you have the latest versions (or maybe it gets slowed down by CPU load?)
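For a sense of scale, the frame counts and the 250 ms figure can be put in the same units. The frame rates below (30 and 60 fps) are assumptions, since the thread does not say what rate either setup runs at, and the helper names are made up for this sketch.

```python
def frames_to_ms(frames, fps):
    """Convert a latency given in frames to milliseconds at the given frame rate."""
    return frames * 1000.0 / fps


def ms_to_frames(ms, fps):
    """Convert a latency given in milliseconds to frames at the given frame rate."""
    return ms * fps / 1000.0


for fps in (30, 60):  # assumed frame rates, not stated in the thread
    print(f"{fps} fps: 4 frames = {frames_to_ms(4, fps):.1f} ms, "
          f"250 ms = {ms_to_frames(250, fps):.1f} frames")
```

At either assumed rate, a 250 ms delay is well above the roughly 4-frame budget described above, which is consistent with something in the reporter's setup being out of date or overloaded.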

joansc commented 7 months ago

Hmm, interesting. I'll get back to you in the next few days after some further testing on my side. Thanks!

joansc commented 7 months ago

Coming back to this: you are indeed right. I downloaded the latest versions of this plugin and SpoutCam, updated the NVIDIA drivers, and all seems good now :)