dusty-nv / NanoLLM

Optimized local inference for LLMs with HuggingFace-like APIs for quantization, vision/language models, multimodal agents, speech, vector DB, and RAG.
https://dusty-nv.github.io/NanoLLM/
MIT License
196 stars 31 forks

Audio Output Plugin Question #52

Open khalton55 opened 2 weeks ago

khalton55 commented 2 weeks ago

Hi,

I am trying to use Piper TTS and AudioRecorder to generate a WAV file from text input. I set up the args to open './response.wav' as the audio output file. Currently, processing one string generates the speech as expected; however, each subsequent string is appended to the file instead of replacing it. Ex: string1 = response1, string2 = response1 + response2. Is there an argument or another way to overwrite response.wav instead of appending to it? Let me know if you have any questions. I tried digging through the code but got lost :)

Code Snippets:

args = ArgParser(extras=['tts', 'audio_output', 'prompt', 'log', 'voice', 'voice-speaker']).parse_args()
args.tts = 'piper'
args.audio_output_file = './response.wav'
args.verbose = True
args.voice = 'en_US-hfc_male-medium'

tts = AutoTTS.from_pretrained(**vars(args))

if args.audio_output_file is not None:
    tts.add(AudioRecorder(**vars(args)))

# Start the TTS service
tts.start()

tts.process(reply)
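
To make the desired behavior concrete, here is a minimal sketch of "overwrite instead of append" using only Python's standard `wave` module. It does not use NanoLLM's AudioRecorder, whose file handling is not shown in this thread. The helper names `write_wav_overwrite` and `wav_frame_count`, the demo file name, and the 22050 Hz mono 16-bit format are assumptions for illustration only.

```python
import os
import struct
import tempfile
import wave


def write_wav_overwrite(path, samples, sample_rate=22050):
    """Write 16-bit mono PCM samples to `path`, replacing any existing file.

    Hypothetical helper for illustration; not part of NanoLLM.
    """
    # Opening in 'wb' mode truncates an existing file, so earlier audio is dropped.
    with wave.open(path, 'wb') as wav:
        wav.setnchannels(1)            # mono (assumed)
        wav.setsampwidth(2)            # 16-bit samples (assumed)
        wav.setframerate(sample_rate)  # sample rate in Hz (assumed)
        wav.writeframes(struct.pack(f'<{len(samples)}h', *samples))


def wav_frame_count(path):
    """Return the number of audio frames stored in the WAV file at `path`."""
    with wave.open(path, 'rb') as wav:
        return wav.getnframes()


# Silent placeholder audio stands in for two TTS responses.
out_path = os.path.join(tempfile.gettempdir(), 'response_demo.wav')
write_wav_overwrite(out_path, [0] * 1000)  # stand-in for response1
write_wav_overwrite(out_path, [0] * 400)   # stand-in for response2

# Only the second response remains in the file.
print(wav_frame_count(out_path))
```

With this approach the file always holds only the most recent response, which is the behavior I am hoping to get for './response.wav'.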