google-ai-edge / mediapipe

Cross-platform, customizable ML solutions for live and streaming media.
https://ai.google.dev/edge/mediapipe
Apache License 2.0

Performance over grayscale/Infrared input #2008

Open lghasemzadeh opened 3 years ago

lghasemzadeh commented 3 years ago

Hello,

Do the TensorFlow face landmarks detection and iris landmarks models work on infrared/grayscale images and video? I have a camera that captures infrared streams. Can I run this model over infrared videos that I previously recorded with that camera? Does the model work the same way as it does with RGB videos or real-time RGB streams? @tensorflow @tensorflow-models/face-landmarks-detection @tensorflow-models/facemesh Here is the link to the library: https://blog.tensorflow.org/2020/11/iris-landmark-tracking-in-browser-with-MediaPipe-and-TensorFlowJS.html

Ultimately, I want to apply the model/algorithm to an infrared/grayscale stream in real time.

Thanks

sgowroji commented 3 years ago

Hi @lghasemzadeh, we have not tried the setup you describe. We would like to hear how it goes for you, so please share your observations.

google-ml-butler[bot] commented 3 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you.

sgowroji commented 3 years ago

Closing this ticket as we haven't heard any update from the reporter.

lghasemzadeh commented 1 year ago

Hello,

I have tested MediaPipe FaceMesh on infrared streams, and it works. The performance is not as good as with RGB, but it functions without major issues. Infrared image is a type of grayscale image and it has only one channel.

1) I want to understand how FaceMesh can work on a 1-channel image when it was trained on RGB (3-channel) input.
2) During pre-processing or training, is there any step where RGB is converted to grayscale? I did some research on this topic a while ago, but unfortunately I cannot find those resources anymore.
3) Would you please share links to the papers, models, or other details of the FaceMesh solution so I can figure out this behavior?

I am waiting for your response.

Thank you.

rohitgarud commented 1 year ago

@lghasemzadeh, maybe stacking the single-channel matrix three times to create a 3-channel image would help. Which camera are you using to capture the IR stream?
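The stacking idea above can be sketched as follows. This is a minimal illustration, assuming the IR frame arrives as a 2-D NumPy array of pixel intensities; the helper name `gray_to_pseudo_rgb` and the small synthetic frame are hypothetical and are not part of MediaPipe. Whether this actually improves landmark quality on IR input has not been verified in this thread.

```python
import numpy as np

def gray_to_pseudo_rgb(gray: np.ndarray) -> np.ndarray:
    """Replicate a single-channel frame into three identical channels."""
    if gray.ndim == 3 and gray.shape[2] == 1:
        gray = gray[:, :, 0]  # drop a trailing singleton channel axis
    if gray.ndim != 2:
        raise ValueError(f"expected a single-channel image, got shape {gray.shape}")
    # Same effect as cv2.cvtColor(gray, cv2.COLOR_GRAY2RGB) if OpenCV is available.
    return np.stack([gray, gray, gray], axis=-1)

# Hypothetical 4x6 8-bit frame standing in for a real IR capture.
ir_frame = np.arange(24, dtype=np.uint8).reshape(4, 6)
rgb_like = gray_to_pseudo_rgb(ir_frame)
print(rgb_like.shape)  # (4, 6, 3)
```

The result has the height x width x 3 layout that RGB pipelines generally expect, with every channel carrying the same intensity values.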

lghasemzadeh commented 1 year ago

Hello @sgowroji, I am waiting for your response.

@rohitgarud, there is no problem getting it to run. The library works, but I want to know why.

rohitgarud commented 1 year ago

@lghasemzadeh, as you mentioned, the performance with single-channel images is not as good as with RGB. Maybe it can be improved by stacking the single channel to create something like pseudo-RGB. I know color channels don't work like that, but it might help. I think the amount of monochrome data used for training (if any was used) would be much smaller than the amount of RGB data. Let's wait for @sgowroji to give a definitive answer.

khengari77 commented 1 year ago

@lghasemzadeh

> Infrared image is a type of grayscale image and it has only one channel

I doubt that your infrared stream has one channel. It should have 3. I ran into the same problem yesterday, and after reading SolutionBase.py I found that it actually checks the number of channels and raises an exception if it's not 3. I think the camera you are using outputs 3 gray channels that are equal to each other, giving the effect of 1 grayscale channel. Nonetheless, I have an infrared camera lying around and I will try to test it as soon as possible.
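One way to test the hypothesis above on a captured frame is to inspect its shape and compare the channels directly. This is a rough sketch, assuming frames are available as NumPy arrays; `describe_channels` and the synthetic frames are hypothetical names and data for illustration, not MediaPipe code.

```python
import numpy as np

def describe_channels(frame: np.ndarray) -> str:
    """Report whether a frame is single-channel or has three (identical or distinct) channels."""
    if frame.ndim == 2 or (frame.ndim == 3 and frame.shape[2] == 1):
        return "single-channel"
    if frame.ndim == 3 and frame.shape[2] == 3:
        same = (np.array_equal(frame[..., 0], frame[..., 1])
                and np.array_equal(frame[..., 1], frame[..., 2]))
        return "three identical channels" if same else "three distinct channels"
    return f"unexpected shape {frame.shape}"

# Synthetic stand-ins for real captures.
gray = np.arange(12, dtype=np.uint8).reshape(3, 4)
replicated = np.stack([gray, gray, gray], axis=-1)  # gray copied into 3 channels
colour = replicated.copy()
colour[0, 0, 2] = 255                               # make one channel differ

print(describe_channels(gray))        # single-channel
print(describe_channels(replicated))  # three identical channels
print(describe_channels(colour))      # three distinct channels
```

If frames from the IR camera report "three identical channels", that would be consistent with the camera or capture library replicating the grayscale data before the frames reach FaceMesh.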