lucasnewman / best-rq-pytorch

Implementation of BEST-RQ - a model for self-supervised learning of speech signals using a random projection quantizer, in Pytorch.
MIT License

Question about the implementation #7

Open ddicee opened 6 months ago

ddicee commented 6 months ago

Hi Lucas,

Thanks for sharing your implementation of the framework. I don't quite understand why the labels are passed into the conformer model instead of the original data. As I understand it, the conformer encodes the original data and predicts the corresponding labels (indices into the codebook), so the input here shouldn't be the labels, right?

https://github.com/lucasnewman/best-rq-pytorch/blob/b4b0d8df333a88e59fcebac91d4b85db0d01c332/best_rq_pytorch/best_rq.py#L144-L149
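To make my understanding concrete, here is a minimal sketch of the data flow I would expect. It is plain Python with no dependencies, it is not the repository's code, and every function name in it is hypothetical. It also leaves out details such as normalization. The point is only that the frozen quantizer produces the labels as targets, while the encoder receives the masked features as input.

```python
import random


def make_quantizer(feat_dim, code_dim, codebook_size, seed=0):
    """Build a frozen random projection matrix and codebook (never trained)."""
    rng = random.Random(seed)
    projection = [[rng.gauss(0, 1) for _ in range(code_dim)] for _ in range(feat_dim)]
    codebook = [[rng.gauss(0, 1) for _ in range(code_dim)] for _ in range(codebook_size)]
    return projection, codebook


def quantize(frames, projection, codebook):
    """Map each feature frame to the index of its nearest codebook entry."""
    labels = []
    for frame in frames:
        # Project the frame: one dot product per column of the projection matrix.
        projected = [sum(x * w for x, w in zip(frame, col)) for col in zip(*projection)]
        # Squared Euclidean distance to every codebook entry.
        distances = [sum((p - c) ** 2 for p, c in zip(projected, code)) for code in codebook]
        labels.append(distances.index(min(distances)))
    return labels


def mask_frames(frames, mask_prob, rng):
    """Replace randomly chosen frames with small Gaussian noise."""
    masked, positions = [], []
    for i, frame in enumerate(frames):
        if rng.random() < mask_prob:
            masked.append([rng.gauss(0, 0.1) for _ in frame])
            positions.append(i)
        else:
            masked.append(list(frame))
    return masked, positions


rng = random.Random(1)
frames = [[rng.gauss(0, 1) for _ in range(8)] for _ in range(20)]  # toy "speech features"
projection, codebook = make_quantizer(feat_dim=8, code_dim=4, codebook_size=16)

labels = quantize(frames, projection, codebook)                # targets only
encoder_input, masked_positions = mask_frames(frames, 0.3, rng)  # what the encoder sees

# An encoder (e.g. a conformer) would consume `encoder_input`, and the loss would
# compare its predictions against `labels` at `masked_positions`.
```

In this picture `labels` never enters the encoder. It is only used on the loss side.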