facebookresearch / fairseq

Facebook AI Research Sequence-to-Sequence Toolkit written in Python.
MIT License

How to reproduce the speech data2vec result? #4527

Open yangjiabupt opened 2 years ago

yangjiabupt commented 2 years ago

I followed the instructions at https://github.com/facebookresearch/fairseq/tree/main/examples/data2vec to reproduce the speech data2vec result, and I now have the pretrained model.

I then tried to fine-tune with this command:

fairseq-hydra-train \
distributed_training.distributed_port=$PORT \
task.data=/path/to/data \
model.w2v_path=/path/to/model.pt \
--config-dir /path/to/fairseq-py/examples/wav2vec/config/finetuning \
--config-name base_100h common.user_dir=examples/data2vec

First, this fails with:

fairseq-hydra-train: error: unrecognized arguments: --common.user_dir=examples/data2vec

Then I ran this command instead:

fairseq-hydra-train \
task.data=/path/to/data \
model.w2v_path=/path/to/model.pt \
--config-dir /path/to/fairseq-py/examples/wav2vec/config/finetuning \
--config-name base_100h

However, this produces the following error:

  File "/home/research/jiayang/data2vector/fairseq/fairseq/tasks/fairseq_task.py", line 338, in build_model
    model = models.build_model(cfg, self, from_checkpoint)
  File "/home/research/jiayang/data2vector/fairseq/fairseq/models/__init__.py", line 102, in build_model
    "Available models: {}".format(MODEL_DATACLASS_REGISTRY.keys())
KeyError: "'_name'"

This is really confusing. The pretraining config is for data2vec, but the fine-tuning config is for wav2vec?

Is there a mistake somewhere? Any help would be appreciated.
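The `KeyError: "'_name'"` in the traceback is raised by a `str.format` call. The sketch below shows one way that can happen in plain Python: if the string being formatted already contains the repr of a dict with a `_name` key, `format` treats the braces as a replacement field and looks up `'_name'` as a keyword argument. That the message in fairseq contains the config is an assumption here, and `cfg`, `message`, and `format_message` are hypothetical names for illustration, not fairseq code.

```python
# Hypothetical stand-in for a model config; not taken from fairseq.
cfg = {"_name": "example_model"}

# The f-string part embeds the dict repr, braces included. The second
# literal is a plain string, so its "{}" is left for str.format to fill.
message = f"Could not infer model type from {cfg}. " "Available models: {}"


def format_message(template, registry_keys):
    """Return the formatted text, or the KeyError that str.format raises."""
    try:
        return template.format(registry_keys)
    except KeyError as err:
        return err


result = format_message(message, ["some_registered_model"])
print(type(result).__name__, result)
```

If this is what happens, the `KeyError` hides the intended message, which would have listed the available models and said that the model type could not be inferred.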

Abdullah955 commented 2 years ago

I fixed the same issue by adding common.user_dir=examples/data2vec after task.data=/path/to/data and before the --config-dir and --config-name flags.

Try this:

fairseq-hydra-train \
distributed_training.distributed_port=$PORT \
task.data=/path/to/data \
common.user_dir=examples/data2vec \
model.w2v_path=/path/to/model.pt \
--config-dir /path/to/fairseq-py/examples/wav2vec/config/finetuning \
--config-name base_100h
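For anyone launching this from a script, here is a minimal sketch of the same ordering: all key=value overrides, including common.user_dir, go before the --config-dir and --config-name flags. The helper name `build_finetune_command` and the port value are hypothetical, and the paths are the placeholders from the command above.

```python
import shlex


def build_finetune_command(overrides, config_dir, config_name):
    """Assemble argv with key=value overrides ahead of the --config-* flags."""
    argv = ["fairseq-hydra-train"]
    argv += [f"{key}={value}" for key, value in overrides.items()]
    argv += ["--config-dir", config_dir, "--config-name", config_name]
    return argv


overrides = {
    "distributed_training.distributed_port": "29500",  # placeholder port
    "task.data": "/path/to/data",
    "common.user_dir": "examples/data2vec",
    "model.w2v_path": "/path/to/model.pt",
}
argv = build_finetune_command(
    overrides,
    "/path/to/fairseq-py/examples/wav2vec/config/finetuning",
    "base_100h",
)
# Print the command as a single shell-quoted line for inspection.
print(shlex.join(argv))
```

The list can be passed to `subprocess.run(argv, check=True)` once the placeholder paths are replaced with real ones.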