Testing Textless S2ST with pre-trained models

facebookresearch / fairseq

Facebook AI Research Sequence-to-Sequence Toolkit written in Python.

MIT License


Testing Textless S2ST with pre-trained models #4504

Open dina-adel opened 2 years ago

dina-adel commented 2 years ago

Hello, I am trying to test the Textless S2ST model as mentioned here. However, I encountered an error while running this command:

    python examples/speech_recognition/new/infer.py \
        --config-dir examples/hubert/config/decode/ \
        --config-name infer_viterbi \
        task.data=${DATA_DIR} \
        task.normalize=false \
        common_eval.results_path=${RESULTS_PATH}/log \
        common_eval.path=${DATA_DIR}/checkpoint_best.pt \
        dataset.gen_subset=${GEN_SUBSET} \
        '+task.labels=["unit"]' \
        +decoding.results_path=${RESULTS_PATH} \
        common_eval.post_process=none \
        +dataset.batch_size=1 \
        common_eval.quiet=True

Error:
FileNotFoundError: [Errno 2] No such file or directory: '/checkpoint/wnhsu/experiments/hubert/s2st/vp/pt/hubert_base_librispeech__+data-vp_en_es_fr_it2_from_400k_L6_km500_optimization.max_update-400000/checkpoints/checkpoint_last.pt'

Where can I find the checkpoint_last.pt file? Am I missing something here?

Thanks in advance.

gmryu commented 2 years ago

Have you downloaded the HuBERT checkpoint listed at https://github.com/facebookresearch/fairseq/blob/main/examples/speech_to_speech/docs/textless_s2st_real_data.md#hubert ?

Then you have to decide on a DATA_DIR. Is it /checkpoint/wnhsu/experiments/hubert/s2st/vp/pt/hubert_base_librispeech__+data-vp_en_es_fr_it2_from_400k_L6_km500_optimization.max_update-400000/checkpoints/ ? Make sure you can cd $DATA_DIR. Copy the downloaded checkpoint_last.pt or checkpoint_best.pt there, then run ls $DATA_DIR to confirm the file is there.

Then it should be fine.
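The same check can be scripted. This is only a minimal sketch: the helper name find_checkpoints is made up for illustration, and it assumes DATA_DIR is exported as an environment variable and that the checkpoint keeps one of the two file names mentioned above.

```python
import os
from pathlib import Path


def find_checkpoints(data_dir, names=("checkpoint_best.pt", "checkpoint_last.pt")):
    """Return the checkpoint files from `names` that exist directly in `data_dir`.

    `find_checkpoints` is a hypothetical helper, not part of fairseq.
    """
    root = Path(data_dir).expanduser()
    if not root.is_dir():
        raise NotADirectoryError(f"DATA_DIR is not a directory: {root}")
    return [root / name for name in names if (root / name).is_file()]


if __name__ == "__main__":
    # Assumption: DATA_DIR is exported in the shell that runs this script.
    data_dir = os.environ.get("DATA_DIR")
    if data_dir is None:
        print("DATA_DIR is not set")
    else:
        found = find_checkpoints(data_dir)
        print(found if found else f"no checkpoint found in {data_dir}")
```

If this prints an empty result, the file was copied somewhere other than the directory that task.data and common_eval.path point to.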

dina-adel commented 2 years ago

@gmryu I specified a folder as my DATA_DIR and added checkpoint_best.pt, but I still get the same error & I can't figure out where this file name is specified.

I attached a screenshot of the whole message.

gmryu commented 2 years ago

From that error log, it looks like the downloaded HuBERT checkpoint has a cfg.w2v_path saved inside it, and that saved path is the one that cannot be found.
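One way to confirm this is to search the loaded checkpoint for every w2v_path entry. The sketch below only covers the search and runs on a made-up dictionary. It assumes the checkpoint, once loaded (for example with torch.load on CPU), behaves like nested dicts and lists; a config object stored in the checkpoint may need converting to a plain dict first. The helper name find_key and the sample path are hypothetical.

```python
def find_key(obj, key, prefix=""):
    """Yield (dotted_path, value) for every occurrence of `key` in nested dicts/lists."""
    if isinstance(obj, dict):
        for k, v in obj.items():
            path = f"{prefix}.{k}" if prefix else str(k)
            if k == key:
                yield path, v
            # Keep descending: the same key may appear at several levels.
            yield from find_key(v, key, path)
    elif isinstance(obj, (list, tuple)):
        for i, v in enumerate(obj):
            yield from find_key(v, key, f"{prefix}[{i}]")


# Hypothetical stand-in for a loaded checkpoint; the real layout may differ.
state = {
    "cfg": {
        "model": {"w2v_path": "/old/cluster/path/checkpoint_last.pt"},
        "task": {"data": "/some/dir"},
    }
}

for path, value in find_key(state, "w2v_path"):
    print(path, "->", value)
```

Any path printed here that does not exist on your machine is a candidate for an override.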

I believe it is a fairseq implementation problem. Using both path and w2v_path to determine where model.pt is located is confusing.

Could you try adding this command-line argument: common_eval.model_overrides="{w2v_path: ${PATH_TO__YOUR_MODEL} }" If you try it, please tell me the result.

Or, to the same effect, you could edit examples/hubert/config/decode/infer_viterbi.yaml like this:

common_eval:
  results_path: ???
  model_overrides: {"w2v_path":  _manually input your path to model here_  }

Note the whitespace between : and {. It is very important, because it determines whether the value is read as a string or as an attribute. You can also copy this YAML file, put it next to your model, and pass --config-dir {the directory of that yaml} --config-name {yaml file name, without .yaml} so that the program runs with your copy.

dina-adel commented 2 years ago

@gmryu I added w2v_path to the overrides, then added the path again in the hubert_asr.py file to override it. Now I don't get an error message, but the code keeps running until it crashes. When I debugged it, I found that it enters a loop when calling the build_model function: the functions keep calling each other, so the state is loaded so many times that memory fills up and the program crashes.

I don't know if my changes caused this, but I don't think so.

gmryu commented 2 years ago

I have no knowledge of either HuBERT or w2v.

About that looping build_model, I believe you need both HubertBaseModel and HubertEncoder. So path actually refers to the HubertEncoder, while w2v_path refers to the base model. fairseq usually extends base model classes into other classes.

I assume that from hubert you get the base model, and from vocoder you get the HubertEncoder?
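If that reading is right, one thing worth ruling out is that w2v_path was overridden with the same file that common_eval.path already points to, since a checkpoint that names itself as its own base model could plausibly keep reloading. That is only a guess at the cause, not something confirmed in this thread. A small sketch of the check, with hypothetical helper names:

```python
import os


def same_file(path_a, path_b):
    """True when both paths resolve to the same location (symlinks and '..' resolved)."""
    return os.path.realpath(path_a) == os.path.realpath(path_b)


def check_override(model_path, w2v_path):
    """Return `w2v_path` unless it resolves to the same file as `model_path`.

    Hypothetical helper: `model_path` stands for common_eval.path and
    `w2v_path` for the value passed in the w2v_path override.
    """
    if same_file(model_path, w2v_path):
        raise ValueError(
            "w2v_path resolves to the same file as the model path; "
            "it is expected to point at the base model instead"
        )
    return w2v_path
```

For example, check_override("/data/checkpoint_best.pt", "/data/hubert_base.pt") returns the second path unchanged (both paths are made up), while passing the same file twice raises ValueError.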

Hope this concludes the problem.

That added path in hubert_asr.py was necessary, right? So adding the command-line argument alone did not cause the saved argument to be overwritten.