This repository proposes an implementation of a Sign Recognition Model using the MediaPipe library for landmark extraction and Dynamic Time Warping (DTW) as a similarity metric between signs.
Install the required libraries:
pip install -r requirements.txt
The videos/ folder must be structured as follows:
|data/
    |-videos/
        |-Hello/
            |-<video_of_hello_1>.mp4
            |-<video_of_hello_2>.mp4
            ...
        |-Thanks/
            |-<video_of_thanks_1>.mp4
            |-<video_of_thanks_2>.mp4
            ...
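As a quick check that a folder follows this layout, the sketch below walks a videos directory and groups the .mp4 files by sign name. The function name `list_sign_videos` is hypothetical and is not part of the repository; it only assumes that each sub-folder name is the sign label.

```python
from pathlib import Path


def list_sign_videos(videos_dir):
    """Map each sign name (sub-folder name) to its sorted list of .mp4 paths."""
    root = Path(videos_dir)
    dataset = {}
    # Each direct sub-folder of videos/ is treated as one sign label.
    for sign_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        videos = sorted(sign_dir.glob("*.mp4"))
        if videos:  # skip empty sign folders
            dataset[sign_dir.name] = videos
    return dataset


# Example call, assuming the layout shown above exists on disk:
# dataset = list_sign_videos("data/videos")
# for sign, videos in dataset.items():
#     print(sign, len(videos))
```

Files that sit directly in videos/ rather than in a sign folder are ignored by this sketch.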
To automatically create a small dataset of French signs:
- Install ffmpeg (on macOS: brew install ffmpeg)
- Run python yt_download.py
- Add more links to yt_links.csv if needed
N.B. The current dataset is insufficient to obtain good results. Feel free to add more links or import your own videos.
Run the main script:
python main.py
In this project, a HandModel is created to describe the hand gesture at each frame. If a hand is not present, all of its positions are set to zero.
To be invariant to orientation and scale, the feature vector of the HandModel is a list of the angles between all the connections of the hand.
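The idea of an angle-based feature vector can be illustrated with a minimal sketch. It is not the repository's HandModel code: the names `angle_between` and `hand_feature_vector` are hypothetical, landmarks are assumed to be (x, y, z) tuples, connections are assumed to be pairs of landmark indices, and the choice to compare every unordered pair of connections and to return 0.0 for a zero-length connection is an assumption made here.

```python
import math
from itertools import combinations


def angle_between(u, v):
    """Angle in radians between two 3D vectors (0.0 if either has zero length)."""
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(a * a for a in v))
    if norm_u == 0.0 or norm_v == 0.0:
        # Assumption: a missing hand (all positions zero) yields zero angles.
        return 0.0
    cos = sum(a * b for a, b in zip(u, v)) / (norm_u * norm_v)
    # Clamp to [-1, 1] to guard against floating-point drift before acos.
    return math.acos(max(-1.0, min(1.0, cos)))


def hand_feature_vector(landmarks, connections):
    """Angles between every unordered pair of hand connections.

    landmarks: list of (x, y, z) tuples.
    connections: list of (start_index, end_index) pairs into landmarks.
    """
    vectors = [
        tuple(landmarks[j][k] - landmarks[i][k] for k in range(3))
        for i, j in connections
    ]
    return [angle_between(u, v) for u, v in combinations(vectors, 2)]
```

Because angles depend only on the directions of the connections relative to each other, scaling or rotating the whole hand leaves this feature vector unchanged.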
The SignModel is created from a list of landmarks (extracted from a video).
For each frame, we store the feature vectors of each hand.
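A rough sketch of this per-frame storage is given below. The class `SignSketch` is a hypothetical stand-in and not the repository's SignModel: it assumes each hand arrives as one entry per frame, with `None` marking a frame where the hand was not detected, and it takes the feature function as a parameter so that the sketch stays self-contained.

```python
N_LANDMARKS = 21  # MediaPipe Hands reports 21 landmarks per hand


def zero_hand():
    """All-zero positions, used when a hand is absent from a frame."""
    return [(0.0, 0.0, 0.0)] * N_LANDMARKS


class SignSketch:
    """Hypothetical stand-in for a SignModel: per-frame features of each hand."""

    def __init__(self, left_hand_frames, right_hand_frames, feature_fn):
        if len(left_hand_frames) != len(right_hand_frames):
            raise ValueError("both hands need exactly one entry per frame")
        # A missing hand (None) is replaced by all-zero positions.
        self.left_features = [
            feature_fn(hand if hand is not None else zero_hand())
            for hand in left_hand_frames
        ]
        self.right_features = [
            feature_fn(hand if hand is not None else zero_hand())
            for hand in right_hand_frames
        ]
        # Record whether each hand appears at all in the video.
        self.has_left_hand = any(h is not None for h in left_hand_frames)
        self.has_right_hand = any(h is not None for h in right_hand_frames)
```

Keeping the two hands as separate sequences makes it possible to compare left with left and right with right when two signs are matched.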
DTW is widely used for computing time series similarity.
In this project, we apply DTW to the variation of hand connection angles over time.
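For readers unfamiliar with DTW, here is a minimal sketch of the classic dynamic-programming formulation applied to two sequences of feature vectors. It is not necessarily the routine the repository calls: the function names are hypothetical, and the Euclidean frame-to-frame distance is an assumption chosen for illustration.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw_distance(seq_a, seq_b, dist=euclidean):
    """Classic O(n*m) DTW cost between two sequences of feature vectors."""
    n, m = len(seq_a), len(seq_b)
    inf = float("inf")
    # cost[i][j] = best alignment cost of the first i frames of seq_a
    # with the first j frames of seq_b.
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = dist(seq_a[i - 1], seq_b[j - 1])
            cost[i][j] = d + min(
                cost[i - 1][j],      # seq_b frame j is matched again
                cost[i][j - 1],      # seq_a frame i is matched again
                cost[i - 1][j - 1],  # both sequences advance one frame
            )
    return cost[n][m]
```

Two recordings of the same sign performed at different speeds can still align with a low cost, because a frame of one sequence may be matched to several consecutive frames of the other.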