PABannier / bark.cpp

Suno AI's Bark model in C/C++ for fast text-to-speech
MIT License

core dump when prompt contains underscore #145

Open jape42 opened 2 months ago

jape42 commented 2 months ago
user@steambox:~/ai/bark.cpp$ ./build/examples/main/main -m ./ggml_weights/bark_weights-f16.bin -em ./ggml_weights/encodec_weights-f16.bin -t 16 -p "I _LOVE_ your shirt!" -o message2.wav
    __               __                          
   / /_  ____ ______/ /__        _________  ____ 
  / __ \/ __ `/ ___/ //_/       / ___/ __ \/ __ \
 / /_/ / /_/ / /  / ,<    _    / /__/ /_/ / /_/ /
/_.___/\__,_/_/  /_/|_|  (_)   \___/ .___/ .___/ 
                                  /_/   /_/      

bark_tokenize_input: prompt: 'I _LOVE_ your shirt!'
bark_tokenize_input: number of tokens in prompt = 513, first 8 tokens: 10194 10216 62782 10216 30490 91098 10154 129595 

Generating semantic tokens: [=============================================>     ] (90%)

bark_print_statistics:   sample time =    51.76 ms / 696 tokens
bark_print_statistics:  predict time = 21475.58 ms / 30.85 ms per token
bark_print_statistics:    total time = 21543.86 ms

Generating coarse tokens: [==================================================>] (100%)

bark_print_statistics:   sample time =    21.47 ms / 2088 tokens
bark_print_statistics:  predict time = 137382.38 ms / 65.80 ms per token
bark_print_statistics:    total time = 137432.67 ms

Generating fine tokens: [==================================================>] (100%)free(): invalid next size (normal)
Aborted (core dumped)
user@steambox:~/ai/bark.cpp$ git show
commit d22ad710e5c6ea62595877e5635e30f9c40442bc (HEAD -> main, origin/main, origin/HEAD)

The core dump does not occur with the prompt "I LOVE your shirt!" (same text, no underscores).

siraben commented 2 months ago

I also get this problem:

[nix-shell:~/bark.cpp]$ ./examples/main/main -m ./ggml_weights/ggml_weights.bin -em ./ggml_weights/encodec_weights.bin -t 16 -p "Hello, my name is Suno. And, uh — and I like pizza. [laughs] But I also have other interests such as playing tic tac toe."
    __               __                          
   / /_  ____ ______/ /__        _________  ____ 
  / __ \/ __ `/ ___/ //_/       / ___/ __ \/ __ \
 / /_/ / /_/ / /  / ,<    _    / /__/ /_/ / /_/ /
/_.___/\__,_/_/  /_/|_|  (_)   \___/ .___/ .___/ 
                                  /_/   /_/      
ggml_init_cublas: found 1 CUDA devices:
  Device 0: NVIDIA GeForce RTX 4090, compute capability 8.9

bark_tokenize_input: prompt: 'Hello, my name is Suno. And, uh — and I like pizza. [laughs] But I also have other interests such as playing tic tac toe.'
bark_tokenize_input: number of tokens in prompt = 513, first 8 tokens: 41226 10165 25175 21372 20172 24015 20181 10167 

Generating semantic tokens: [=============================================>     ] (90%)

bark_print_statistics:   sample time =    54.50 ms / 696 tokens
bark_print_statistics:  predict time = 21012.48 ms / 30.19 ms per token
bark_print_statistics:    total time = 21100.10 ms

Generating coarse tokens: [==================================================>] (100%)

bark_print_statistics:   sample time =    23.67 ms / 2088 tokens
bark_print_statistics:  predict time = 90090.52 ms / 43.15 ms per token
bark_print_statistics:    total time = 90150.62 ms

Generating fine tokens: [==================================================>] (100%)free(): invalid next size (normal)

PABannier commented 2 months ago

Interesting! I was able to replicate the error even with #151 merged. I'll try to have a look in the next few days.