Mozer / talk-llama-fast

Port of OpenAI's Whisper model in C/C++ with xtts and wav2lip
MIT License

error: unknown argument: --vad-start-thold #2

Closed alrostami closed 4 months ago

alrostami commented 4 months ago

I compiled your code and was able to build talk-llama. I then downloaded your script talk-llama.bat, but when I run it I get the following error, followed by the usage instructions for talk-llama:

error: unknown argument: --vad-start-thold

usage: ./talk-llama [options]

options:
  -h,       --help           [default] show this help message and exit
  -t N,     --threads N      [4      ] number of threads to use during computation
  -vms N,   --voice-ms N     [10000  ] voice duration in milliseconds
  -c ID,    --capture ID     [-1     ] capture device ID
  -mt N,    --max-tokens N   [32     ] maximum number of tokens per audio chunk
  -ac N,    --audio-ctx N    [0      ] audio context size (0 - all)
  -ngl N,   --n-gpu-layers N [999    ] number of layers to store in VRAM
  -vth N,   --vad-thold N    [0.60   ] voice activity detection threshold
  -vlm N,   --vad-last-ms N  [500    ] vad min silence after speech, ms
  -fth N,   --freq-thold N   [100.00 ] high-pass frequency cutoff
  -su,      --speed-up       [false  ] speed up audio by x2 (reduced accuracy)
  -tr,      --translate      [false  ] translate from source language to english
  -ps,      --print-special  [false  ] print special tokens
  -pe,      --print-energy   [false  ] print sound energy (for debugging)
  -vp,      --verbose-prompt [false  ] print prompt at start
  -ng,      --no-gpu         [false  ] disable GPU
  -p NAME,  --person NAME    [Alex   ] person name (for prompt selection)
  -bn NAME, --bot-name NAME  [LLaMA  ] bot name (to display)
  -w TEXT,  --wake-command T [       ] wake-up command to listen for
  -ho TEXT, --heard-ok TEXT  [       ] said by TTS before generating reply
  -l LANG,  --language LANG  [en     ] spoken language
  -mw FILE, --model-whisper  [./ggml-medium.en-q5_0.bin] whisper model file
  -ml FILE, --model-llama    [./mistral-7b-instruct-v0.2.Q6_K.gguf] llama model file
  -s FILE,  --speak TEXT     [speak  ] command for TTS
  --prompt-file FNAME        [       ] file with custom prompt to start dialog
  --session FNAME                   file to cache model state in (may be large!) (default: none)
  -f FNAME, --file FNAME     [       ] text output file name
   --ctx_size N              [2048   ] Size of the prompt context
  -n N, --n_predict N        [64     ] Number of tokens to predict
  --temp N                   [0.90   ] Temperature 
  --top_k N                  [40.00  ] top_k 
  --top_p N                  [1.00   ] top_p 
  --repeat_penalty N         [1.10   ] repeat_penalty 
  --xtts-voice NAME          [emma_1 ] xtts voice without .wav
  --xtts-url TEXT            [http://localhost:8020/] xtts/silero server URL, with trailing slash
  --xtts-control-path FNAME  [./talk-llama-fast/xtts/xtts_play_allowed.txt] path to xtts_play_allowed.txt
  --google-url TEXT          [http://localhost:8003/] langchain google-serper server URL, with /

Did you build the demo version from a branch that you haven't pushed yet? If not, can you tell me what I am missing?

Mozer commented 4 months ago

Oops, I forgot to upload the recent code changes. It's fixed now. Just run git pull or manually download https://github.com/Mozer/talk-llama-fast/blob/master/examples/talk-llama/talk-llama.cpp

alrostami commented 4 months ago

Thanks for the quick reply, but both of the commits you have pushed show zero changed lines since 5db57b9.

Mozer commented 4 months ago

--vad-start-thold is clearly present now in https://github.com/Mozer/talk-llama-fast/blob/master/examples/talk-llama/talk-llama.cpp. It wasn't there before. Maybe you are seeing a stale cached copy. In any case, you can simply remove that parameter from the bat/shell file.
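
For anyone who prefers to apply that workaround programmatically rather than by editing the script by hand, here is a minimal sketch of dropping one flag from an argument list. The helper name `strip_flag_with_value` is hypothetical and not part of talk-llama, and the sketch assumes the flag is always followed by exactly one value.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not part of talk-llama): returns a copy of `args`
// with every occurrence of `flag` and the single value after it removed.
// Assumption: the flag always takes exactly one value.
std::vector<std::string> strip_flag_with_value(const std::vector<std::string>& args,
                                               const std::string& flag) {
    std::vector<std::string> out;
    for (std::size_t i = 0; i < args.size(); ++i) {
        if (args[i] == flag) {
            ++i;  // also skip the value that follows the flag
            continue;
        }
        out.push_back(args[i]);
    }
    return out;
}
```

Calling `strip_flag_with_value(args, "--vad-start-thold")` on the arguments taken from the bat file would leave every other option untouched, so an older binary that does not know the flag should no longer reject the command line.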

alrostami commented 4 months ago

I can see them all now. Thanks!