Closed scottcali closed 5 months ago
Hello @scottcali ! Full frame model takes a full frame as input and classifies only one gesture shown.
The detection model also takes as input a full frame from a 224x224 video, but predicts all gestures shown for their bounding boxes
Hello,
-Scott