HLasse closed this issue 2 years ago
Hey @HLasse,
Could you increase this parameter (`mask_time_prob`): https://huggingface.co/facebook/wav2vec2-large-lv60/blob/main/config.json#L62 to 0.5
and see if it works then? Given the sequence length, it seems you are not sampling enough negative targets.
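E.g. a quick sketch of how you could set it when loading the model instead of editing config.json directly (untested, just to illustrate):

```python
from transformers import Wav2Vec2Config, Wav2Vec2ForPreTraining

# Load the checkpoint's config and raise mask_time_prob so that enough
# time steps are masked to sample the required number of negatives,
# even for short input sequences.
config = Wav2Vec2Config.from_pretrained("facebook/wav2vec2-large-lv60")
config.mask_time_prob = 0.5  # value suggested above

model = Wav2Vec2ForPreTraining.from_pretrained(
    "facebook/wav2vec2-large-lv60", config=config
)
```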
Also, it'll be really hard, if not impossible, to do a full pretraining on a single T4 GPU.
That works, thanks!
> Also, it'll be really hard, if not impossible, to do a full pretraining on a single T4 GPU.
I know - this was mainly to get an estimate of training time on different hardware setups. Danish wav2vec models coming up soon! :)
Hi, I've encountered exactly the same issue. I'm using the Wav2Vec2ConformerForPreTraining model 'facebook/wav2vec2-conformer-rope-large', training on a single NVIDIA TITAN Xp with a very small (pilot) speech dataset.
I've already changed `mask_time_prob`, but it didn't work for me; the error message I got was the same one as above.
Could you help me with this problem? Thank you in advance!
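In case it helps with debugging, this is roughly what I'm running (simplified sketch; the clip length is a placeholder, and `_get_feat_extract_output_lengths` is a private helper from the wav2vec2 model family):

```python
import torch
from transformers import (
    Wav2Vec2ConformerConfig,
    Wav2Vec2ConformerForPreTraining,
)

# Raise mask_time_prob as suggested above (the error persisted for me).
config = Wav2Vec2ConformerConfig.from_pretrained(
    "facebook/wav2vec2-conformer-rope-large"
)
config.mask_time_prob = 0.5

model = Wav2Vec2ConformerForPreTraining.from_pretrained(
    "facebook/wav2vec2-conformer-rope-large", config=config
)

# Rough sanity check: how many feature frames does a short clip yield?
# If very few frames are masked, there may not be enough positions to
# draw config.num_negatives negatives, which raises the error above.
num_samples = 16000  # ~1 second of 16 kHz audio (placeholder)
seq_len = model._get_feat_extract_output_lengths(torch.tensor(num_samples))
print("feature frames:", int(seq_len), "num_negatives:", config.num_negatives)
```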
System Info
Who can help?
@patrickvonplaten
Information
Tasks

An officially supported task in the examples folder (such as GLUE/SQuAD, ...)

Reproduction
Pretraining a wav2vec-large model using the documentation under examples/speech-pretraining does not work.
Running the following code (copy-pasted from the README) gives an error due to `model_path_or_dir` not being found:

I tried using `facebook/wav2vec-large-lv60` in `model_name_or_path`, but receive the following error:

The demo script trains without issue. Using the parameters from the demo script and changing `model_name_or_path` from `patrickvonplaten/wav2vec2-base-v2` to `facebook/wav2vec-large-lv60` gives the above error.

Training on a single T4 GPU (benchmarking purposes).
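For reference, the masking-related settings of the two checkpoints can be compared with a quick snippet like this (illustrative only; it just prints the config values):

```python
from transformers import Wav2Vec2Config

# Compare the masking/negative-sampling settings of the demo checkpoint
# and the large checkpoint: with a low mask_time_prob and short input
# sequences, too few frames are masked to sample num_negatives from.
for name in ("patrickvonplaten/wav2vec2-base-v2", "facebook/wav2vec2-large-lv60"):
    cfg = Wav2Vec2Config.from_pretrained(name)
    print(name, "mask_time_prob:", cfg.mask_time_prob, "num_negatives:", cfg.num_negatives)
```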
Expected behavior