huggingface / speech-to-speech

Speech To Speech: an effort for an open-sourced and modular GPT4-o
Apache License 2.0

Repeating problem #125

Open mattfro opened 3 weeks ago

mattfro commented 3 weeks ago

I just installed this to try it out, and it seems to get stuck in some kind of loop or repeating mode.

I tried running it both with server and client and fully locally. The results are the same in both cases:

/speech-to-speech/.venv/lib/python3.11/site-packages/transformers/generation/configuration_utils.py:567: UserWarning: `do_sample` is set to `False`. However, `temperature` is set to `0.0` -- this flag is only used in sample-based generation modes. You should set `do_sample=True` or unset `temperature`.
  warnings.warn(
ASSISTANT: I'm sorry for the confusion, but as an AI, I don't have the ability to see or interact with the world in the same way humans do.
ASSISTANT: I'm here to provide information and assistance within the scope of our conversation.
ASSISTANT: Again, I apologize for any confusion.
ASSISTANT: I'm here to provide information and assistance within the scope of our conversation.
ASSISTANT: I'm here to provide information and assistance within the scope of our conversation.
ASSISTANT: I'm here to provide information and assistance within the scope of our conversation.
ASSISTANT: I'm here to provide information and assistance within the scope of our
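For anyone triaging similar output, here is a minimal sketch of how the repetition in a captured console log could be quantified. It assumes only that replies are printed on lines starting with `ASSISTANT:`, as in the output above; the helper name `repeated_replies` is hypothetical and not part of the speech-to-speech repository.

```python
from itertools import groupby


def repeated_replies(log_text, prefix="ASSISTANT:"):
    """Return (reply, run_length) for every run of identical consecutive replies.

    Hypothetical helper: it only inspects lines that start with `prefix`
    and ignores everything else in the log (warnings, debug lines).
    """
    replies = [
        line[len(prefix):].strip()
        for line in log_text.splitlines()
        if line.startswith(prefix)
    ]
    runs = []
    for reply, group in groupby(replies):
        length = len(list(group))
        if length > 1:  # keep only replies repeated back to back
            runs.append((reply, length))
    return runs


sample = """\
ASSISTANT: Again, I apologize for any confusion.
ASSISTANT: I'm here to provide information and assistance within the scope of our conversation.
ASSISTANT: I'm here to provide information and assistance within the scope of our conversation.
ASSISTANT: I'm here to provide information and assistance within the scope of our conversation.
"""

for reply, count in repeated_replies(sample):
    print(f"{count}x: {reply}")
```

This only reports that the same sentence was emitted several times in a row; it says nothing about why the model keeps generating it.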

This is how I start it; maybe I have missed something?

python s2s_pipeline.py --lm_model_name cognitivecomputations/dolphin-2.9.4-gemma2-2b --tts_compile_mode default --recv_host 0.0.0.0 --send_host 0.0.0.0 --language en

I also checked whether it is a Parler problem; it seems to happen with Melo too.

Another test with debug log on:

USER: in there
2024-10-20 22:07:56,849 - STT.whisper_stt_handler - DEBUG - Language Code Whisper: en
2024-10-20 22:07:56,849 - baseHandler - DEBUG - WhisperSTTHandler: 0.085 s
2024-10-20 22:07:56,849 - LLM.language_model - DEBUG - infering language model...
/speech-to-speech/.venv/lib/python3.11/site-packages/transformers/generation/configuration_utils.py:567: UserWarning: do_sample is set to False. However, temperature is set to 0.0 -- this flag is only used in sample-based generation modes. You should set do_sample=True or unset temperature.
  warnings.warn(
2024-10-20 22:07:57,032 - baseHandler - DEBUG - LanguageModelHandler: 0.183 s
ASSISTANT: Sure, I can help with that.
2024-10-20 22:07:57,033 - TTS.parler_handler - DEBUG - padding to 16
2024-10-20 22:07:57,442 - baseHandler - DEBUG - LanguageModelHandler: 0.410 s
2024-10-20 22:07:57,813 - baseHandler - DEBUG - LanguageModelHandler: 0.371 s
2024-10-20 22:07:57,930 - baseHandler - DEBUG - ParlerTTSHandler: 0.898 s
2024-10-20 22:07:58,234 - baseHandler - DEBUG - ParlerTTSHandler: 0.303 s
ASSISTANT: What do you need assistance with?
2024-10-20 22:07:58,234 - TTS.parler_handler - DEBUG - padding to 8
2024-10-20 22:07:58,258 - baseHandler - DEBUG - LanguageModelHandler: 0.445 s
2024-10-20 22:07:58,499 - baseHandler - DEBUG - LanguageModelHandler: 0.241 s
2024-10-20 22:07:58,916 - baseHandler - DEBUG - LanguageModelHandler: 0.417 s
2024-10-20 22:07:58,928 - baseHandler - DEBUG - ParlerTTSHandler: 0.694 s
2024-10-20 22:07:59,218 - baseHandler - DEBUG - LanguageModelHandler: 0.302 s
2024-10-20 22:07:59,322 - baseHandler - DEBUG - ParlerTTSHandler: 0.394 s
ASSISTANT: I'm here to help with that.
2024-10-20 22:07:59,323 - TTS.parler_handler - DEBUG - padding to 16
2024-10-20 22:07:59,607 - baseHandler - DEBUG - LanguageModelHandler: 0.389 s
2024-10-20 22:07:59,900 - baseHandler - DEBUG - LanguageModelHandler: 0.293 s
2024-10-20 22:08:00,022 - baseHandler - DEBUG - ParlerTTSHandler: 0.700 s
2024-10-20 22:08:00,243 - baseHandler - DEBUG - ParlerTTSHandler: 0.220 s
ASSISTANT: What do you need assistance with?
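To see how often each stage ran after the single USER utterance, the debug log can be summarised per handler. The following is a rough sketch that assumes the `baseHandler - DEBUG - <HandlerName>: <seconds> s` line format seen in the log above; the function name `handler_stats` is made up for illustration and does not exist in the project.

```python
import re
from collections import defaultdict

# Matches timing entries such as:
#   "baseHandler - DEBUG - LanguageModelHandler: 0.183 s"
TIMING = re.compile(r"baseHandler - DEBUG - (\w+): (\d+\.\d+) s")


def handler_stats(log_text):
    """Return {handler_name: (call_count, total_seconds)} for a debug log.

    Hypothetical helper; it works whether the entries are on separate
    lines or run together on one line.
    """
    stats = defaultdict(lambda: [0, 0.0])
    for name, seconds in TIMING.findall(log_text):
        stats[name][0] += 1
        stats[name][1] += float(seconds)
    return {name: (count, round(total, 3)) for name, (count, total) in stats.items()}


sample = (
    "2024-10-20 22:07:56,849 - baseHandler - DEBUG - WhisperSTTHandler: 0.085 s "
    "2024-10-20 22:07:57,032 - baseHandler - DEBUG - LanguageModelHandler: 0.183 s "
    "2024-10-20 22:07:57,442 - baseHandler - DEBUG - LanguageModelHandler: 0.410 s "
    "2024-10-20 22:07:57,930 - baseHandler - DEBUG - ParlerTTSHandler: 0.898 s"
)

for name, (count, total) in handler_stats(sample).items():
    print(f"{name}: {count} call(s), {total} s total")
```

Many `LanguageModelHandler` entries against a single `WhisperSTTHandler` entry would be consistent with the language model continuing to produce sentences without new user input, though the counts alone do not identify the cause.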

andimarafioti commented 2 weeks ago

That's weird, and it seems like it happened repeatedly. What system are you running this on?

mattfro commented 1 week ago

> That's weird, and it seems like it happened repeatedly. What system are you running this on?

I installed it on an Ubuntu system with a 12th-gen Intel i7 CPU, 64 GB of RAM, and an RTX 3080 (10 GB).