Open dina-adel opened 2 years ago
Have you downloaded https://github.com/facebookresearch/fairseq/blob/main/examples/speech_to_speech/docs/textless_s2st_real_data.md#hubert ?
Then you have to decide on a `DATA_DIR`. So is it `/checkpoint/wnhsu/experiments/hubert/s2st/vp/pt/hubert_base_librispeech__+data-vp_en_es_fr_it2_from_400k_L6_km500_optimization.max_update-400000/checkpoints/`?
Make sure you can `cd $DATA_DIR`.
Copy the downloaded `checkpoint_last.pt` or `checkpoint_best.pt` there and run `ls $DATA_DIR` to confirm it is there.
Then it should be fine.
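The check described above can also be scripted. This is only a minimal sketch: the helper name `find_checkpoints` is hypothetical (it is not part of fairseq), and it assumes the checkpoint files sit directly inside `DATA_DIR` under the two file names mentioned in this thread.

```python
from pathlib import Path


def find_checkpoints(data_dir, names=("checkpoint_best.pt", "checkpoint_last.pt")):
    """Return the checkpoint files from `names` that exist directly in data_dir.

    Hypothetical helper for illustration; not part of fairseq.
    """
    root = Path(data_dir)
    if not root.is_dir():
        # Equivalent to `cd $DATA_DIR` failing.
        raise NotADirectoryError(f"DATA_DIR is not a directory: {root}")
    # Equivalent to `ls $DATA_DIR`, filtered to the expected file names.
    return [root / name for name in names if (root / name).is_file()]
```

Calling `find_checkpoints(os.environ["DATA_DIR"])` should return a non-empty list once the checkpoint has been copied. Note that this only confirms the file passed as `common_eval.path` exists; it says nothing about other paths stored inside the checkpoint.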
@gmryu I specified a folder as my `DATA_DIR` and added `checkpoint_best.pt`, but I still get the same error, and I can't figure out where this file name is specified.
I attached a screenshot of the whole message.
From that error log, the downloaded HuBERT checkpoint has a `cfg.w2v_path` saved inside it.
I believe this is a fairseq implementation problem. Using both `path` and `w2v_path` to determine where model.pt is located is confusing.
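To see where such a path is stored, one can walk the configuration saved in the checkpoint and list every `w2v_path` entry. The sketch below is an illustration only: `find_key` is a hypothetical helper, and it assumes the saved configuration is available as plain nested dicts and lists (if it is an OmegaConf object, it would first need converting, for example with `OmegaConf.to_container`).

```python
def find_key(node, key, prefix=""):
    """Yield (dotted_path, value) for every occurrence of `key` in nested dicts/lists.

    Hypothetical helper for illustration; not part of fairseq.
    """
    if isinstance(node, dict):
        for k, v in node.items():
            here = f"{prefix}.{k}" if prefix else str(k)
            if k == key:
                yield here, v
            # Keep descending: the same key may appear at several levels.
            yield from find_key(v, key, here)
    elif isinstance(node, (list, tuple)):
        for i, v in enumerate(node):
            yield from find_key(v, key, f"{prefix}[{i}]")


# Assumed usage with a real checkpoint (requires PyTorch, not run here):
#   state = torch.load(ckpt_path, map_location="cpu")
#   for path, value in find_key(state["cfg"], "w2v_path"):
#       print(path, "->", value)
```

If the printed value is a path from the original training machine, such as the `/checkpoint/wnhsu/...` path in the error, that would explain why copying the file into `DATA_DIR` alone does not help.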
Could you try adding this command-line argument: `common_eval.model_overrides="{w2v_path: ${PATH_TO__YOUR_MODEL} }"`
If you try it, please tell me the result.
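Conceptually, an override like this replaces a value stored in the checkpoint's saved configuration before the model is built. The sketch below is a simplified stand-in for that idea, not fairseq's actual code: `apply_overrides` is a hypothetical function, and it assumes the override string is a Python dict literal whose keys are matched by name anywhere in the nested config.

```python
import ast
import copy


def apply_overrides(saved_cfg, overrides_str):
    """Return a copy of saved_cfg with matching keys replaced by the overrides.

    Hypothetical illustration; assumes overrides_str is a Python dict literal.
    """
    overrides = ast.literal_eval(overrides_str)
    if not isinstance(overrides, dict):
        raise TypeError("overrides must evaluate to a dict")
    cfg = copy.deepcopy(saved_cfg)  # leave the caller's config untouched

    def _walk(node):
        for key, value in list(node.items()):
            if isinstance(value, dict):
                _walk(value)
            elif key in overrides:
                node[key] = overrides[key]

    _walk(cfg)
    return cfg


# Example with made-up paths:
saved = {"model": {"w2v_path": "/old/machine/checkpoint_last.pt", "dropout": 0.1}}
patched = apply_overrides(saved, "{'w2v_path': '/data/hubert/checkpoint_last.pt'}")
```

Here `patched["model"]["w2v_path"]` points at the new location while every other setting is kept. How the string has to be quoted on the real command line depends on the shell and on Hydra's parsing, which this sketch does not model.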
Or, to the same effect, could you edit `examples/hubert/config/decode/infer_viterbi.yaml` like this:
common_eval:
  results_path: ???
  model_overrides: {"w2v_path": _manually input your path to model here_ }
There is a whitespace between `:` and `{`, which is very important for distinguishing whether it is a string or an attribute.
You can also just copy this YAML file and put it next to your model. Using `--config-dir {the directory of that yaml} --config-name {yaml file name, no need .yaml}` instead will make the program run with the given YAML.
@gmryu I added `w2v_path` to the overrides, then added the path again in the `hubert_asr.py` file to override it. Now I don't get an error message; however, the code keeps running until it crashes. When I debugged the code, I found that it enters a loop when calling the `build_model` function: functions keep calling each other, so the state is loaded so many times that memory fills up and the program crashes.
I don't know if my changes caused this, but I don't think so.
I have no knowledge of HuBERT or of w2v.
About that looping `build_model`: I believe you need both `HubertBaseModel` and `HubertEncoder` (is it this?).
So `path` actually refers to the `HubertEncoder`, while `w2v_path` refers to the base model.
fairseq usually extends base model classes into other classes.
I assume that from hubert you get the base model, and from vocoder you get the `HubertEncoder`?
Hope this resolves the problem.
That added path in `hubert_asr.py` is necessary, right? Just adding command-line arguments did not cause the arguments to be overwritten.
Hello, I am trying to test the Textless S2ST model as mentioned here. However, I encountered an error while running this line:
    python examples/speech_recognition/new/infer.py \
        --config-dir examples/hubert/config/decode/ \
        --config-name infer_viterbi \
        task.data=${DATA_DIR} \
        task.normalize=false \
        common_eval.results_path=${RESULTS_PATH}/log \
        common_eval.path=${DATA_DIR}/checkpoint_best.pt \
        dataset.gen_subset=${GEN_SUBSET} \
        '+task.labels=["unit"]' \
        +decoding.results_path=${RESULTS_PATH} \
        common_eval.post_process=none \
        +dataset.batch_size=1 \
        common_eval.quiet=True
Error:
FileNotFoundError: [Errno 2] No such file or directory: '/checkpoint/wnhsu/experiments/hubert/s2st/vp/pt/hubert_base_librispeech__+data-vp_en_es_fr_it2_from_400k_L6_km500_optimization.max_update-400000/checkpoints/checkpoint_last.pt'
Where can I find the `checkpoint_last.pt` file? Am I missing something here?
Thanks in advance.