0nutation / USLM

Unified Speech Language Model for the paper "SpeechTokenizer: Unified Speech Tokenizer for Speech Large Language Models" (ICLR 2024)

Low Quality #4

Open YamenHabib opened 1 year ago

YamenHabib commented 1 year ago

Hi, thanks for sharing your work. I am using the following command to generate audio for the same text as in your demo, with the same audio prompt, but the generated audio is of poor quality.


python3 bin/infer.py --output-dir ${out_dir}/ \
    --model-name USLM --norm-first true --add-prenet false \
    --share-embedding true \
    --audio-extractor "${audio_extractor}" \
    --speechtokenizer-dir "${st_dir}" \
    --checkpoint=${uslm_dir}/USLM.pt \
    --text-tokens "${uslm_dir}/unique_text_tokens.k2symbols" \
    --text-prompts "The rainbow is a division of white light into many beautiful colors." \
    --audio-prompts prompts/prompt.wav \
    --text "She also defended the lord chancellors existing powers."

The prompt is the prompt.wav file and the generated audio is gen_prombt.wav. Here are the audio files: https://drive.google.com/drive/folders/1QyPS3Sl87SjSOFpgBSGHKiAqoA5DW45F?usp=sharing

ght0707 commented 8 months ago

Hello, I'd like to ask if you have retrained the USLM?