ZenniQ opened 2 months ago
Hello. In my experiments, I noticed that the loss values varied with the language and the length of the data. Our model has a limitation in that it depends on SpeechMatrix, so for languages that do not perform well in SpeechMatrix, the training loss tends to be higher. Performance also tends to degrade when the individual speech samples in the training data are long. However, if you've completed training, I recommend proceeding to inference and checking the BLEU score. When I trained on multiple languages, the loss values ranged from 3.8 to 4.8.
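If it helps, once you have text hypotheses for the generated speech (e.g. ASR transcripts) and the CVSS references, BLEU can be scored with sacrebleu. A minimal sketch, assuming hypothetical files hyp.txt and ref.txt with one sentence per line:

```bash
# hyp.txt / ref.txt are hypothetical file names: one sentence per line,
# hypotheses (e.g. ASR transcripts of the synthesized speech) vs. references.
sacrebleu ref.txt -i hyp.txt -m bleu -b -w 2
```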
Thanks for your response. I'll dive deeper into analyzing my experiment.
Hello, I selected 600,000 English audio samples ranging from 1 to 4 seconds from Common Voice and trained for 200,000 updates. When I tested the model using CVSS data, I found that the BLEU score was very low, around 1. I haven't made any progress on this issue for quite a while. Do you think the problem lies in my training data, or is there something wrong with my training process? Could you spare some time to give me some advice? Thank you very much!
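As a data-side sanity check, clip durations in a fairseq-style TSV manifest can be inspected with a one-liner like the sketch below. This assumes the third tab-separated column is n_frames for 16 kHz audio and that the train manifest sits in the data root; both are assumptions, so adjust to your actual config.yaml:

```bash
# Assumption: header row, n_frames in column 3, 16 kHz audio.
awk -F'\t' 'NR > 1 { d = $3 / 16000; sum += d; n++; if (d > max) max = d }
            END { printf "clips: %d  mean: %.2f s  max: %.2f s\n", n, sum / n, max }' \
    /home/zenniq/trans_proj/en_data_root3/train2.tsv
```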
Here is my training script:

```bash
MODEL_DIR=/home/zenniq/trans_proj/my_log/673600_vanilla/case3/checkpoints

CUDA_VISIBLE_DEVICES=1,2 fairseq-train /home/zenniq/trans_proj/en_data_root3 \
  --config-yaml /home/zenniq/trans_proj/en_data_root3/configs/config.yaml \
  --task sem_to_speech --target-is-code --target-code-size 1000 --vocoder code_hifigan \
  --criterion speech_to_unit --label-smoothing 0.2 \
  --arch s2ut_transformer_dec --share-decoder-input-output-embed \
  --dropout 0.1 --attention-dropout 0.1 --relu-dropout 0.1 \
  --train-subset train2 --valid-subset valid \
  --save-dir ${MODEL_DIR} \
  --lr 0.0005 --lr-scheduler inverse_sqrt --warmup-init-lr 1e-7 --warmup-updates 10000 \
  --optimizer adam --adam-betas "(0.9,0.98)" --clip-norm 10.0 --weight-decay 1e-6 \
  --max-update 200000 --max-tokens 20000 --max-target-positions 3000 --update-freq 8 \
  --seed 1 --fp16 --num-workers 8 --n-frames-per-step 4 \
  --validate-interval 1 --save-interval 10 \
  --tensorboard-logdir /home/zenniq/trans_proj/my_log/673600_vanilla/case3/checkpoints \
  --input_split True --skip-invalid-size-inputs-valid-test
```
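Before scoring BLEU, one common fairseq practice (not specific to this repo) is to average the last few checkpoints instead of evaluating a single one. A sketch using scripts/average_checkpoints.py, which ships with fairseq; avg_last5.pt is just a hypothetical output name:

```bash
# Average the last 5 epoch checkpoints in MODEL_DIR into one model file.
MODEL_DIR=/home/zenniq/trans_proj/my_log/673600_vanilla/case3/checkpoints
python fairseq/scripts/average_checkpoints.py \
  --inputs ${MODEL_DIR} \
  --num-epoch-checkpoints 5 \
  --output ${MODEL_DIR}/avg_last5.pt
```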
Hello, I’d like to ask what the final loss value was when you trained the TranSentence model. My loss only decreases to about 5.7. Could you let me know what might be causing this?