facebookresearch / textlesslib

Library for Textless Spoken Language Processing
MIT License

omegaconf.errors.ValidationError: Value '50.0' could not be converted to Integer #37

Open yzou2 opened 1 month ago

yzou2 commented 1 month ago

Hey folks, I ran into an omegaconf.errors.ValidationError (see the screenshot below) when using the ("mhubert-base-vp_mls_cv_8lang", "kmeans-expresso", 2000) model from https://github.com/facebookresearch/textlesslib/tree/main/examples/expresso

Does anyone know how to fix it?

Code I ran:

import torchaudio
from textless.data.speech_encoder import SpeechEncoder

# Available models
EXPRESSO_MODELS = [
    ("hubert-base-ls960-layer-9", "kmeans", 500),
    ("hubert-base-ls960-layer-9", "kmeans-expresso", 2000),
    ("mhubert-base-vp_mls_cv_8lang", "kmeans", 2000),
    ("mhubert-base-vp_mls_cv_8lang", "kmeans-expresso", 2000),
]

# Try one model
dense_model, quantizer_model, vocab = EXPRESSO_MODELS[3]

# Load the speech encoder (the vocoder is not loaded in this snippet)
encoder = SpeechEncoder.by_name(
    dense_model_name=dense_model,
    quantizer_model_name=quantizer_model,
    vocab_size=vocab,
    deduplicate=False,  # False if the vocoder doesn't support duration prediction
)

Error message:

Screenshot 2024-07-21 at 11 21 57 PM
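
A note on the message in the title: it says OmegaConf was given the value 50.0 for a field declared as an integer. Below is a minimal sketch of one possible workaround. It assumes the value arrives as a Python float inside a plain dict (for example a config stored alongside a checkpoint) that can be cleaned before OmegaConf validates it. The helper name coerce_integral_floats and the sample keys are hypothetical; they are not part of textlesslib, fairseq, or omegaconf, and where such a step would have to be applied is not known from this report.

```python
from typing import Any


def coerce_integral_floats(value: Any) -> Any:
    """Recursively turn floats with no fractional part (e.g. 50.0) into ints.

    Hypothetical helper for illustration only; not a textlesslib/omegaconf API.
    """
    if isinstance(value, float) and value.is_integer():
        return int(value)
    if isinstance(value, dict):
        return {key: coerce_integral_floats(item) for key, item in value.items()}
    if isinstance(value, list):
        return [coerce_integral_floats(item) for item in value]
    # Strings, ints, bools, None, and non-integral floats are left untouched.
    return value


# Hypothetical config fragment; the key names are made up for this example.
sample_cfg = {
    "hypothetical_int_field": 50.0,
    "hypothetical_float_field": 0.65,
    "nested": {"steps": [1.0, 2.5]},
}

cleaned = coerce_integral_floats(sample_cfg)
print(cleaned)
```

This is a blunt approach: a field that is meant to stay a float but happens to hold a whole number (such as 1.0) would also be converted, so it is only a starting point for narrowing down which field carries the 50.0.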