I'm trying to replicate the results on the KIT dataset, but the training and validation classification accuracy of both the Masked Transformer and the Residual Transformer is low. The validation classification accuracy in particular is extremely low (below 20% for the Masked Transformer and below 10% for the Residual Transformer), and the generated animations also look unreasonable. Is this normal?
This is the TensorBoard log for the Masked Transformer:
And this is the TensorBoard log for the Residual Transformer:
![b62dfbda0c299cbffd025964516436a](https://github.com/EricGuo5513/momask-codes/assets/53321894/f3d1fa51-7a28-41ba-ba6c-84878850bfc3)