mpc001 / Lipreading_using_Temporal_Convolutional_Networks

ICASSP'22 Training Strategies for Improved Lip-Reading; ICASSP'21 Towards Practical Lipreading with Distilled and Efficient Models; ICASSP'20 Lipreading using Temporal Convolutional Networks

Not able to evaluate visual-only performance using the pre-processed npz files #60

Closed. Rathi4research closed this issue 1 year ago

Rathi4research commented 1 year ago

Hi, I followed the steps below but got an error. Step 1: I ran crop_mouth_from_video.py from the preprocess directory and got the .npz files. Step 2: I ran the command for evaluating visual-only performance, as shown in screenshot 1, and got the error shown in screenshot 2.

screenshot-1 image

screenshot-2 image

I am also attaching the full log trace as a text file: Lipreading_Error.txt

mpc001 commented 1 year ago

Hi @Rathi4research, thank you for providing the detailed log. Can you please check whether the .npz files have a shape of (29, 96, 96)? You can check this with np.load(filename)["data"].shape, where filename is the path to a .npz file. Also, can you please print the shape of frames at this link? The frames should have a shape of video length x height x width.
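To run that shape check over a whole folder rather than one file at a time, something like the sketch below may help. It assumes each .npz stores its frames under the "data" key and that (29, 96, 96) is the expected shape, both taken from the comment above; the function name find_unexpected_shapes is only illustrative and is not part of the repository.

```python
from pathlib import Path

import numpy as np

# Assumption from the comment above: frames x height x width, grayscale.
EXPECTED_SHAPE = (29, 96, 96)


def find_unexpected_shapes(root, expected=EXPECTED_SHAPE):
    """Return (path, shape) for every .npz under root whose "data" array
    does not have the expected shape."""
    mismatches = []
    for path in sorted(Path(root).rglob("*.npz")):
        # NpzFile is a context manager, so the file handle is closed promptly.
        with np.load(path) as archive:
            shape = archive["data"].shape
        if shape != expected:
            mismatches.append((str(path), shape))
    return mismatches
```

Calling find_unexpected_shapes on the directory that holds the pre-processed files returns an empty list when every file matches. A trailing 3 in a reported shape, such as (29, 96, 96, 3), would suggest the crops were saved in color rather than grayscale.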

Rathi4research commented 1 year ago

Thank you so much for your reply. I was able to fix this by changing the default of the --convert-gray argument from False to True, that is, from "parser.add_argument('--convert-gray', default=False, action='store_true', help='convert2grayscale')" to "parser.add_argument('--convert-gray', default=True, action='store_true', help='convert2grayscale')"
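For anyone landing here later: with action='store_true', the same effect can probably be had without editing the script, by passing --convert-gray on the command line. The sketch below uses only argparse to show how the two defaults behave; build_parser is a hypothetical helper written for this illustration and reproduces just the one argument quoted above, not the full script.

```python
import argparse


def build_parser(default):
    """Hypothetical helper: a parser containing only the --convert-gray flag."""
    parser = argparse.ArgumentParser()
    parser.add_argument('--convert-gray', default=default,
                        action='store_true', help='convert2grayscale')
    return parser


original = build_parser(default=False)
patched = build_parser(default=True)

# Original default: grayscale only when the flag is passed explicitly.
print(original.parse_args([]).convert_gray)                  # False
print(original.parse_args(['--convert-gray']).convert_gray)  # True

# Patched default: grayscale always, and the flag can no longer turn it off.
print(patched.parse_args([]).convert_gray)                   # True
print(patched.parse_args(['--convert-gray']).convert_gray)   # True
```

One trade-off of changing the default to True is that color output can no longer be requested through this flag, whereas passing the flag explicitly leaves the script unchanged.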