ubicomplab / rPPG-Toolbox

rPPG-Toolbox: Deep Remote PPG Toolbox (NeurIPS 2023)
https://arxiv.org/abs/2210.00716

Face Detection Backends #232

Closed: yahskapar closed 9 months ago

yahskapar commented 9 months ago

A (fairly simple) stab at introducing face detection backends into the toolbox. In the future, some additional refactoring of the BaseLoader.py file will likely be needed, especially if more backends get added.

Some quantitative results with POS (HC = Haar Cascade, RF = RetinaFace):

UBFC-rPPG with HC:
===Unsupervised Method ( POS ) Predicting ===
100%|███████████████████████████████████████████| 42/42 [01:11<00:00, 1.70s/it]
Used Unsupervised Method: POS
FFT MAE (FFT Label): 3.9969308035714284 +/- 0.994365555673655
FFT RMSE (FFT Label): 7.5831059532071405 +/- 21.649350686623574
FFT MAPE (FFT Label): 3.8622851481891742 +/- 0.9014916191719023
FFT Pearson (FFT Label): 0.9224921893686093 +/- 0.06103444940234774
FFT SNR (FFT Label): -2.3875030222395335 +/- 1.1356165444835469 (dB)

UBFC-rPPG with RF:
===Unsupervised Method ( POS ) Predicting ===
100%|███████████████████████████████████████████| 42/42 [01:19<00:00, 1.90s/it]
Used Unsupervised Method: POS
FFT MAE (FFT Label): 3.0238560267857144 +/- 0.8826602879469477
FFT RMSE (FFT Label): 6.470351690233616 +/- 19.10398129449891
FFT MAPE (FFT Label): 2.910813268130525 +/- 0.7756710295825437
FFT Pearson (FFT Label): 0.9397976039926306 +/- 0.05403250492289589
FFT SNR (FFT Label): -2.531876496531225 +/- 1.0742929175235254 (dB)

PURE with HC:
===Unsupervised Method ( POS ) Predicting ===
100%|███████████████████████████████████████████| 59/59 [02:16<00:00, 2.31s/it]
Used Unsupervised Method: POS
FFT MAE (FFT Label): 3.6720405190677967 +/- 1.4630957700514682
FFT RMSE (FFT Label): 11.822951673836915 +/- 66.86540992165988
FFT MAPE (FFT Label): 7.248782385643253 +/- 3.031084749953558
FFT Pearson (FFT Label): 0.8783814244636281 +/- 0.0633073917079514
FFT SNR (FFT Label): 6.866575820510611 +/- 0.9486667657630489 (dB)

PURE with RF:
===Unsupervised Method ( POS ) Predicting ===
100%|███████████████████████████████████████████| 59/59 [01:36<00:00, 1.63s/it]
Used Unsupervised Method: POS
FFT MAE (FFT Label): 3.0314817266949152 +/- 1.3329728950598538
FFT RMSE (FFT Label): 10.678111680356963 +/- 62.29386026026936
FFT MAPE (FFT Label): 6.087798394258333 +/- 2.79597728460723
FFT Pearson (FFT Label): 0.8999647759432344 +/- 0.05774465906212253
FFT SNR (FFT Label): 6.798964613851172 +/- 0.9636205610753447 (dB)
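As a reading aid for the numbers above, here is a minimal sketch of how per-video heart rate errors are commonly aggregated into MAE, RMSE, MAPE, and Pearson correlation. It is an illustration under assumptions, not the toolbox's evaluation code: `hr_metrics` and `pearson_se` are hypothetical names, and the +/- terms for MAE and MAPE are assumed to be standard errors over videos. The Pearson standard error formula sqrt((1 - r^2) / (n - 2)) does reproduce the reported 0.0610 for UBFC-rPPG with HC (r = 0.9225, n = 42). The +/- term reported for RMSE is not reproduced here.

```python
import math
from typing import Dict, Sequence, Tuple


def _mean_and_se(values: Sequence[float]) -> Tuple[float, float]:
    """Mean and standard error (population std / sqrt(n)); an assumed convention."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    return mean, math.sqrt(var) / math.sqrt(n)


def pearson_se(r: float, n: int) -> float:
    """Standard error of a Pearson coefficient r estimated from n samples."""
    return math.sqrt(max(0.0, 1.0 - r * r) / (n - 2))


def hr_metrics(pred: Sequence[float], label: Sequence[float]) -> Dict[str, tuple]:
    """Aggregate per-video HR estimates (bpm) against reference HR values."""
    if len(pred) != len(label) or len(pred) < 3:
        raise ValueError("need two equal-length sequences with at least 3 videos")
    abs_err = [abs(p - l) for p, l in zip(pred, label)]
    pct_err = [100.0 * e / l for e, l in zip(abs_err, label)]
    mae, mae_se = _mean_and_se(abs_err)
    mape, mape_se = _mean_and_se(pct_err)
    rmse = math.sqrt(sum(e * e for e in abs_err) / len(abs_err))

    # Pearson correlation between predicted and reference HR.
    mp = sum(pred) / len(pred)
    ml = sum(label) / len(label)
    cov = sum((p - mp) * (l - ml) for p, l in zip(pred, label))
    norm = math.sqrt(sum((p - mp) ** 2 for p in pred)
                     * sum((l - ml) ** 2 for l in label))
    r = cov / norm
    return {
        "MAE": (mae, mae_se),
        "RMSE": (rmse,),
        "MAPE": (mape, mape_se),
        "Pearson": (r, pearson_se(r, len(pred))),
    }
```

For example, `pearson_se(0.9224921893686093, 42)` gives roughly 0.0610, matching the UBFC-rPPG HC line.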

Some qualitative sample frames:

PURE, with HC: [two screenshots, 2023-11-29 06-53-54 and 06-54-58]

PURE, with RF: [two screenshots, 2023-11-29 06-56-21 and 06-56-44]

Some additional thoughts from my end:

yahskapar commented 9 months ago

@xliucs @girishvn, I should note that, aside from the face detection backends and some minor refactoring, I also added a progress bar at the model inference step, which previously had none. This gives users live updates on inference progress.
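A progress bar of this kind is usually just a wrapper around the inference loop. The sketch below assumes `tqdm` (the bar format in the logs above looks like tqdm output) and falls back to a plain loop when it is not installed; `run_inference` and `predict` are hypothetical names, not functions from the toolbox.

```python
from typing import Callable, Iterable, List

try:
    from tqdm import tqdm
except ImportError:
    # Fall back to a no-op wrapper so the loop still runs without tqdm.
    def tqdm(iterable, **kwargs):
        return iterable


def run_inference(batches: Iterable, predict: Callable) -> List:
    """Apply `predict` to each batch while reporting progress."""
    outputs = []
    for batch in tqdm(batches, desc="Inference", ncols=80):
        outputs.append(predict(batch))
    return outputs
```

Passing a sized iterable (a list or a data loader with a length) lets the bar show a total and an estimated time remaining.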