samsad35 / VQ-MAE-S-code

A Vector Quantized Masked AutoEncoder for speech emotion recognition
https://ieeexplore.ieee.org/document/10193151
GNU Affero General Public License v3.0

training time for pre-train #4

Closed crzdg closed 6 months ago

crzdg commented 6 months ago

Hi

Interesting and great work. Can you elaborate a bit on the training time for this? How long did it take to pre-train the VQ-VAE and then the VQ-MAE-S?

Further, I think the paper has a misalignment between the text and Figure 1. For the output of the VQ-VAE encoder, the text refers to $\mathbb{Z}$, whereas the figure refers to $\mathbb{R}$ throughout. I guess the figure includes a copy-paste error.

crzdg commented 6 months ago

Okay, I just saw that the image in this repo's README is actually correct / fixed.

samsad35 commented 6 months ago

Hi @crzdg

Thank you for your interest in our work.

crzdg commented 6 months ago

Thank you very much for the answers.