Open zzzzhuque opened 5 years ago
Hi @ZHUTAO142857, sorry that I didn't notice this issue earlier.
I ran the audio-visual recognition task (word classification on LRW) as described in the paper. The three numbers are the classification accuracies obtained using video only, audio only, and both modalities combined.
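In case it helps to see the metric concretely, here is a minimal sketch of how three such accuracies could be computed. It is not the code used for the paper: the toy scores, the `top1_accuracy` helper, and the score-averaging fusion are all assumptions made for illustration, and the actual way the two modalities were combined may differ.

```python
import numpy as np

def top1_accuracy(scores, labels):
    """Fraction of samples whose highest-scoring class equals the true label."""
    return float(np.mean(np.argmax(scores, axis=1) == labels))

# Toy data (hypothetical, not LRW results): 4 clips, 3 word classes.
labels = np.array([0, 1, 2, 1])

# Per-class scores from a video-only classifier and an audio-only classifier.
video_scores = np.array([[0.6, 0.3, 0.1],
                         [0.5, 0.4, 0.1],
                         [0.2, 0.2, 0.6],
                         [0.1, 0.7, 0.2]])
audio_scores = np.array([[0.7, 0.2, 0.1],
                         [0.2, 0.7, 0.1],
                         [0.5, 0.1, 0.4],
                         [0.2, 0.6, 0.2]])

# Assumption: fuse the modalities by averaging their scores (late fusion).
combined_scores = (video_scores + audio_scores) / 2

visual_acc = top1_accuracy(video_scores, labels)
audio_acc = top1_accuracy(audio_scores, labels)
combined_acc = top1_accuracy(combined_scores, labels)

print(f"visual acc:  {visual_acc:.1%}")
print(f"audio acc:   {audio_acc:.1%}")
print(f"combine acc: {combined_acc:.1%}")
```

In this toy case each single modality misclassifies a different clip, so the fused scores recover both errors. That is the general reason a combined accuracy can exceed either single-modality accuracy.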
Hi, after reading the paper, I am confused about Table 3. What do visual acc, audio acc, and combine acc mean? How did you calculate the results of 67.5%, 91.8%, and 95.2%?