This repository contains implementations of several video recognition models. The models were benchmarked on the LSFB dataset, which covers 395 signs from French Belgian Sign Language (LSFB). Two other sign language datasets, MS-ASL and GSL, were also used for comparison.
The accuracies obtained by the models are:
Dataset | CNN + RNN | C3D | I3D |
---|---|---|---|
LSFB-ISOL | 3.6% | 6.4% | 51% |
MSASL-100 | 0.8% | 1.3% | 53% |
GSL | 6.1% | 8.6% | 36.5% |
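
The README does not say how the accuracies above were computed. As a minimal sketch, assuming they are top-1 classification accuracies (the fraction of clips whose predicted sign matches the ground-truth label), the metric could be written as follows. The function name `top1_accuracy` and the example labels are hypothetical and are not taken from this repository.

```python
from typing import Sequence


def top1_accuracy(predicted: Sequence[int], expected: Sequence[int]) -> float:
    """Return the fraction of clips whose predicted class equals the label."""
    if len(predicted) != len(expected):
        raise ValueError("predicted and expected must have the same length")
    if len(expected) == 0:
        raise ValueError("cannot compute accuracy on an empty set")
    # Count positions where the predicted class index matches the label.
    correct = sum(p == e for p, e in zip(predicted, expected))
    return correct / len(expected)


# Hypothetical class indices for four clips; three predictions are correct.
accuracy = top1_accuracy([3, 7, 7, 1], [3, 7, 2, 1])
print(f"{accuracy:.1%}")  # prints "75.0%"
```

With 395 classes, as in LSFB-ISOL, uniform random guessing would score about 1/395 ≈ 0.25% under this metric.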