the-crypt-keeper / LlamaCpp-Horde-Bridge

Use llama.cpp server to join the AI Horde
GNU Affero General Public License v3.0

Generations sometimes fail #1

Closed the-crypt-keeper closed 9 months ago

the-crypt-keeper commented 9 months ago

From the server side, they look like this:

print_timings: prompt eval time =    6960.61 ms /  3323 tokens (    2.09 ms per token,   477.40 tokens per second)
print_timings:        eval time = 1162042209.58 ms /     0 runs   (     inf ms per token,     0.00 tokens per second)
print_timings:       total time = 1162049170.19 ms
slot 0 released (3324 tokens in cache)

I haven't yet caught exactly what is happening at the request level to cause this:

INFO       | 2023-12-21 22:21:58 | __main__:bridge:159 - Job received from https://aihorde.net for 300 tokens and 4096 max context. Starting generation...
INFO       | 2023-12-21 22:22:06 | __main__:bridge:159 - Job received from https://aihorde.net for 300 tokens and 4096 max context. Starting generation...
INFO       | 2023-12-21 22:22:14 | __main__:bridge:159 - Job received from https://aihorde.net for 300 tokens and 4096 max context. Starting generation...
INFO       | 2023-12-21 22:22:23 | __main__:bridge:159 - Job received from https://aihorde.net for 300 tokens and 4096 max context. Starting generation...
INFO       | 2023-12-21 22:22:31 | __main__:validate_kai:57 - llama.cpp server model=koboldcpp/openhermes-2.5-mistral-7b.Q5_K_M n_ctx=4096
INFO       | 2023-12-21 22:22:31 | __main__:bridge:159 - Job received from https://aihorde.net for 300 tokens and 4096 max context. Starting generation...
ERROR      | 2023-12-21 22:22:39 | __main__:bridge:85 - Exceeded retry count 4 for generation id 694a2561-e374-4a97-a55e-b1e8f5524599. Aborting generation!

the-crypt-keeper commented 9 months ago

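This was caused by generations that stopped with empty content because the model emitted EOS immediately. The bridge treated the empty result as a failure and retried until it hit the retry limit. Fixed the submission under these conditions.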
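
For anyone hitting the same thing, here is a minimal sketch of the idea, not the actual code in the bridge. It assumes the llama.cpp server completion response is a JSON object with a `content` string and a `stopped_eos` boolean; the function name `extract_generation` is hypothetical. An empty string that comes with an EOS stop is treated as a valid (empty) result to submit, while a missing response or an empty string without EOS is still treated as a failed attempt.

```python
from typing import Any, Dict, Optional


def extract_generation(completion: Optional[Dict[str, Any]]) -> Optional[str]:
    """Return the text to submit, or None if the attempt should be retried.

    `completion` is assumed to be the parsed JSON body returned by the
    llama.cpp server; the `content` and `stopped_eos` keys are assumptions
    about that response, not something confirmed in this issue.
    """
    if completion is None:
        # No response at all (timeout, connection error): retry.
        return None

    content = completion.get("content")
    if content is None:
        # Malformed response without a content field: retry.
        return None

    if content == "" and not completion.get("stopped_eos", False):
        # Empty output that was NOT caused by an immediate EOS: retry.
        return None

    # Non-empty text, or an empty string caused by an immediate EOS.
    # Both are legitimate results and should be submitted as-is.
    return content


if __name__ == "__main__":
    # Immediate EOS: submit the empty string instead of retrying.
    print(repr(extract_generation({"content": "", "stopped_eos": True})))
    # Normal generation.
    print(repr(extract_generation({"content": "Hello", "stopped_eos": False})))
```

The important part is that the caller checks `is None` rather than truthiness, since `""` is falsy in Python and a plain `if not text:` check would send an immediate-EOS result back into the retry loop.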