Input dimension for the Keras model

sign-language-processing / detection-train

Training a sign language detection model


Input dimension for the Keras model #2

Closed by hshreeshail 2 years ago

hshreeshail commented 2 years ago

I have a question about the input dimensions of the LSTM model. I am referring to the model trained on MediaPipe Holistic keypoints rather than the OpenPose keypoints.

1. I see the input dimension as (None, None, 75). I understand that the first variable dimension (i.e., "None") is the batch size. What is the purpose of the second "None"?
2. Shouldn't the input dimension be 225 = 75 × 3 (x, y, z coordinates for 75 joints)? Why is it 75 instead?

AmitMY commented 2 years ago

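(None, None, 75) = (Batch, Frames, 75). The second "None" is the number of frames, which varies between videos.
It is 75 rather than 225 because the model's input is the "optical flow" of the keypoints, not their raw coordinates.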

Please look at section 3.1 in our paper https://arxiv.org/pdf/2008.04637.pdf
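
To make the shapes concrete, here is a minimal NumPy sketch of how 75 joints with (x, y, z) coordinates can collapse to 75 features per frame. It assumes the optical flow is the L2 norm of each keypoint's frame-to-frame displacement, scaled by the frame rate, which is my reading of section 3.1; the function name `optical_flow_norm`, the random data, and the `fps` value are illustrative and are not taken from the repository.

```python
import numpy as np


def optical_flow_norm(keypoints: np.ndarray, fps: float) -> np.ndarray:
    """Reduce (frames, joints, dims) keypoints to (frames - 1, joints).

    Assumption: flow is the Euclidean norm of the displacement between
    consecutive frames, multiplied by fps to normalize for frame rate.
    """
    # Displacement of every joint between consecutive frames:
    # shape (frames - 1, joints, dims)
    displacement = np.diff(keypoints, axis=0)
    # The norm over the coordinate axis leaves one scalar per joint.
    return np.linalg.norm(displacement, axis=-1) * fps


# Hypothetical clip: 10 frames, 75 joints, (x, y, z) per joint.
pose = np.random.default_rng(0).normal(size=(10, 75, 3))
flow = optical_flow_norm(pose, fps=25.0)
print(flow.shape)  # (9, 75)

# Adding a batch axis gives the (Batch, Frames, 75) layout the model expects.
batch = flow[np.newaxis]
print(batch.shape)  # (1, 9, 75)
```

Because the norm removes the coordinate axis, the feature size stays at 75 whether the keypoints are 2D or 3D, and the frame axis can have any length.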

hshreeshail commented 2 years ago

I see. Thanks for the clarification.