Closed by qianlivia 1 year ago
I have updated the HuBERT checkpoint with the pre-defined dictionary. Please check if this works.
Thank you, this solved it!
🐛 Bug
I tried to load the checkpoints (Fisher HuBERT model and k-means model) from "Dialogue Speech-to-Unit Encoder for dGSLM: The Fisher HuBERT model" and got the following error:
No such file or directory: '/checkpoint/ntuanh/experiments/dialogue-LM/hubert/fisher-vad-10s-separated/kmeans/hubert_iter2/km500/dict.km.txt'
After trying to fix it with an external k-means dictionary file, I got a dimension mismatch in the weights.
To Reproduce
Note: I tried the other method as well (by using quantize_with_kmeans.py) but got the same error.
Trace:
Code sample
Expected behavior
I assume that the path to the k-means dictionary file should be a relative path and that the dictionary file should be included in the HuBERT model checkpoint itself.
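To illustrate the kind of check that would surface this problem, here is a minimal sketch that walks a nested configuration and reports absolute paths that do not exist on the local machine. It assumes the configuration stored in the checkpoint can be converted to plain dicts and lists; `find_missing_paths` and the keys in `example_config` are hypothetical and not part of fairseq.

```python
import os
from typing import Any, Iterator, Tuple


def find_missing_paths(config: Any, prefix: str = "") -> Iterator[Tuple[str, str]]:
    """Yield (key_path, value) for absolute-path strings that do not exist locally."""
    if isinstance(config, dict):
        for key, value in config.items():
            child = f"{prefix}.{key}" if prefix else str(key)
            yield from find_missing_paths(value, child)
    elif isinstance(config, (list, tuple)):
        for index, value in enumerate(config):
            yield from find_missing_paths(value, f"{prefix}[{index}]")
    elif isinstance(config, str) and os.path.isabs(config) and not os.path.exists(config):
        yield prefix, config


# Hypothetical stand-in for the configuration stored in a checkpoint.
example_config = {
    "task": {"label_dir": "/nonexistent-dir-for-example/km500", "labels": ["km"]},
    "model": {"encoder_layers": 12},
}
missing = list(find_missing_paths(example_config))
print(missing)
```

Any entry reported this way is a path that was valid only on the machine where the checkpoint was saved and would need to be overridden or shipped alongside the checkpoint.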
Note: As a temporary measure, I tried to overwrite the path that causes the error with dictionary1 from "Generative Spoken Dialogue Language Modeling", which has exactly 500 elements. After this, I got another error:
Based on this, I believe that there is also some dimension mismatch between the pre-trained weights and the actual model weights.
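One way to narrow down such a mismatch is to compare parameter shapes name by name. The sketch below works on plain name-to-shape mappings; `diff_shapes` is a hypothetical helper, and the parameter names and shapes are made up for illustration, not taken from the actual checkpoint.

```python
from typing import Dict, List, Tuple

Shape = Tuple[int, ...]


def diff_shapes(expected: Dict[str, Shape], loaded: Dict[str, Shape]) -> List[str]:
    """Describe every parameter whose shape differs or that exists on one side only."""
    report = []
    for name in sorted(set(expected) | set(loaded)):
        if name not in loaded:
            report.append(f"{name}: missing from checkpoint")
        elif name not in expected:
            report.append(f"{name}: unexpected in checkpoint")
        elif expected[name] != loaded[name]:
            report.append(f"{name}: model {expected[name]} vs checkpoint {loaded[name]}")
    return report


# Made-up shapes for illustration only.
model_shapes = {"embed.weight": (512, 16), "proj.weight": (16, 32)}
ckpt_shapes = {"embed.weight": (500, 16), "proj.weight": (16, 32)}
report = diff_shapes(model_shapes, ckpt_shapes)
print("\n".join(report))
```

With PyTorch, the shape mappings could be built as `{k: tuple(v.shape) for k, v in model.state_dict().items()}` for the model, and in the same way for the checkpoint's weights, assuming they are stored as a state dict inside the loaded checkpoint.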
Environment
Installation method (pip, source): source
Build command: pip install --editable ./