ggerganov / llama.cpp

LLM inference in C/C++
MIT License

Bug: server crashed today for the first time. #7637

Closed 0wwafa closed 2 months ago

0wwafa commented 4 months ago

What happened?

I created an assistant that is instructed to output a JSON object at the end of the chat. The chat went perfectly, but at the very end the server crashed.

Name and Version

[built just a few moments ago]
\bin\main --version
version: 3029 (b864b50c)
built with clang version 18.1.5 for x86_64-w64-windows-gnu

Note: I am using the same prompt as usual, and this never happened before.

What operating system are you seeing the problem on?

Windows

Relevant log output

{"tid":"14964","timestamp":1717067190,"level":"INFO","function":"init","line":715,"msg":"initializing slots","n_slots":1}
{"tid":"14964","timestamp":1717067190,"level":"INFO","function":"init","line":727,"msg":"new slot","id_slot":0,"n_ctx_slot":1536}
{"tid":"14964","timestamp":1717067190,"level":"INFO","function":"main","line":3040,"msg":"model loaded"}
{"tid":"14964","timestamp":1717067190,"level":"INFO","function":"main","line":3065,"msg":"chat template","chat_example":"[INST] You are a helpful assistant\nHello [/INST]Hi there</s>[INST] How are you? [/INST]","built_in":false}
{"tid":"14964","timestamp":1717067190,"level":"INFO","function":"main","line":3793,"msg":"HTTP server listening","port":"8080","n_threads_http":"7","hostname":"127.0.0.1"}
{"tid":"14964","timestamp":1717067285,"level":"INFO","function":"update_slots","line":1812,"msg":"all slots are idle"}
{"tid":"14964","timestamp":1717067285,"level":"INFO","function":"launch_slot_with_task","line":1046,"msg":"slot is processing task","id_slot":0,"id_task":0}
{"tid":"14964","timestamp":1717067285,"level":"INFO","function":"update_slots","line":2095,"msg":"kv cache rm [p0, end)","id_slot":0,"id_task":0,"p0":969}
{"tid":"14964","timestamp":1717067311,"level":"INFO","function":"print_timings","line":321,"msg":"prompt eval time     =   26064.67 ms /   248 tokens (  105.10 ms per token,     9.51 tokens per second)","id_slot":0,"id_task":0,"t_prompt_processing":26064.675,"n_prompt_tokens_processed":248,"t_token":105.09949596774193,"n_tokens_second":9.51479348965602}
{"tid":"14964","timestamp":1717067311,"level":"INFO","function":"print_timings","line":337,"msg":"generation eval time =     474.11 ms /     4 runs   (  118.53 ms per token,     8.44 tokens per second)","id_slot":0,"id_task":0,"t_token_generation":474.11,"n_decoded":4,"t_token":118.5275,"n_tokens_second":8.43686064415431}
{"tid":"14964","timestamp":1717067311,"level":"INFO","function":"print_timings","line":347,"msg":"          total time =   26538.78 ms","id_slot":0,"id_task":0,"t_prompt_processing":26064.675,"t_token_generation":474.11,"t_total":26538.785}
{"tid":"14964","timestamp":1717067311,"level":"INFO","function":"update_slots","line":1794,"msg":"slot released","id_slot":0,"id_task":0,"n_ctx":1536,"n_past":251,"n_system_tokens":969,"n_cache_tokens":251,"truncated":false}
{"tid":"18004","timestamp":1717067311,"level":"INFO","function":"log_server_request","line":2894,"msg":"request","remote_addr":"127.0.0.1","remote_port":5665,"status":200,"method":"POST","path":"/completion","params":{}}
{"tid":"14964","timestamp":1717067311,"level":"INFO","function":"update_slots","line":1812,"msg":"all slots are idle"}
{"tid":"14964","timestamp":1717067311,"level":"INFO","function":"update_slots","line":1812,"msg":"all slots are idle"}
{"tid":"14964","timestamp":1717067333,"level":"INFO","function":"launch_slot_with_task","line":1046,"msg":"slot is processing task","id_slot":0,"id_task":6}
{"tid":"14964","timestamp":1717067333,"level":"INFO","function":"update_slots","line":2095,"msg":"kv cache rm [p0, end)","id_slot":0,"id_task":6,"p0":1220}
{"tid":"14964","timestamp":1717067336,"level":"INFO","function":"print_timings","line":321,"msg":"prompt eval time     =    1078.73 ms /    11 tokens (   98.07 ms per token,    10.20 tokens per second)","id_slot":0,"id_task":6,"t_prompt_processing":1078.73,"n_prompt_tokens_processed":11,"t_token":98.06636363636363,"n_tokens_second":10.197176309178385}
{"tid":"14964","timestamp":1717067336,"level":"INFO","function":"print_timings","line":337,"msg":"generation eval time =    1458.90 ms /    11 runs   (  132.63 ms per token,     7.54 tokens per second)","id_slot":0,"id_task":6,"t_token_generation":1458.9,"n_decoded":11,"t_token":132.62727272727273,"n_tokens_second":7.539927342518336}
{"tid":"14964","timestamp":1717067336,"level":"INFO","function":"print_timings","line":347,"msg":"          total time =    2537.63 ms","id_slot":0,"id_task":6,"t_prompt_processing":1078.73,"t_token_generation":1458.9,"t_total":2537.63}
{"tid":"14964","timestamp":1717067336,"level":"INFO","function":"update_slots","line":1794,"msg":"slot released","id_slot":0,"id_task":6,"n_ctx":1536,"n_past":272,"n_system_tokens":969,"n_cache_tokens":272,"truncated":false}
{"tid":"12284","timestamp":1717067336,"level":"INFO","function":"log_server_request","line":2894,"msg":"request","remote_addr":"127.0.0.1","remote_port":5786,"status":200,"method":"POST","path":"/completion","params":{}}
{"tid":"14964","timestamp":1717067336,"level":"INFO","function":"update_slots","line":1812,"msg":"all slots are idle"}
{"tid":"14964","timestamp":1717067336,"level":"INFO","function":"update_slots","line":1812,"msg":"all slots are idle"}
{"tid":"14964","timestamp":1717067380,"level":"INFO","function":"launch_slot_with_task","line":1046,"msg":"slot is processing task","id_slot":0,"id_task":19}
{"tid":"14964","timestamp":1717067380,"level":"INFO","function":"update_slots","line":2095,"msg":"kv cache rm [p0, end)","id_slot":0,"id_task":19,"p0":1241}
{"tid":"14964","timestamp":1717067386,"level":"INFO","function":"print_timings","line":321,"msg":"prompt eval time     =    2684.39 ms /    27 tokens (   99.42 ms per token,    10.06 tokens per second)","id_slot":0,"id_task":19,"t_prompt_processing":2684.386,"n_prompt_tokens_processed":27,"t_token":99.4217037037037,"n_tokens_second":10.058166001461787}
{"tid":"14964","timestamp":1717067386,"level":"INFO","function":"print_timings","line":337,"msg":"generation eval time =    3559.27 ms /    25 runs   (  142.37 ms per token,     7.02 tokens per second)","id_slot":0,"id_task":19,"t_token_generation":3559.272,"n_decoded":25,"t_token":142.37088,"n_tokens_second":7.023908259891348}
{"tid":"14964","timestamp":1717067386,"level":"INFO","function":"print_timings","line":347,"msg":"          total time =    6243.66 ms","id_slot":0,"id_task":19,"t_prompt_processing":2684.386,"t_token_generation":3559.272,"t_total":6243.657999999999}
{"tid":"14964","timestamp":1717067386,"level":"INFO","function":"update_slots","line":1794,"msg":"slot released","id_slot":0,"id_task":19,"n_ctx":1536,"n_past":323,"n_system_tokens":969,"n_cache_tokens":323,"truncated":false}
{"tid":"15788","timestamp":1717067386,"level":"INFO","function":"log_server_request","line":2894,"msg":"request","remote_addr":"127.0.0.1","remote_port":5816,"status":200,"method":"POST","path":"/completion","params":{}}
{"tid":"14964","timestamp":1717067386,"level":"INFO","function":"update_slots","line":1812,"msg":"all slots are idle"}
{"tid":"14964","timestamp":1717067386,"level":"INFO","function":"update_slots","line":1812,"msg":"all slots are idle"}
{"tid":"14964","timestamp":1717067430,"level":"INFO","function":"launch_slot_with_task","line":1046,"msg":"slot is processing task","id_slot":0,"id_task":46}
{"tid":"14964","timestamp":1717067430,"level":"INFO","function":"update_slots","line":2095,"msg":"kv cache rm [p0, end)","id_slot":0,"id_task":46,"p0":1289}
{"tid":"14964","timestamp":1717067441,"level":"INFO","function":"print_timings","line":321,"msg":"prompt eval time     =    3290.51 ms /    33 tokens (   99.71 ms per token,    10.03 tokens per second)","id_slot":0,"id_task":46,"t_prompt_processing":3290.515,"n_prompt_tokens_processed":33,"t_token":99.71257575757575,"n_tokens_second":10.028825275070924}
{"tid":"14964","timestamp":1717067441,"level":"INFO","function":"print_timings","line":337,"msg":"generation eval time =    7062.22 ms /    48 runs   (  147.13 ms per token,     6.80 tokens per second)","id_slot":0,"id_task":46,"t_token_generation":7062.223,"n_decoded":48,"t_token":147.12964583333334,"n_tokens_second":6.796726753035128}
{"tid":"14964","timestamp":1717067441,"level":"INFO","function":"print_timings","line":347,"msg":"          total time =   10352.74 ms","id_slot":0,"id_task":46,"t_prompt_processing":3290.515,"t_token_generation":7062.223,"t_total":10352.738}
{"tid":"14964","timestamp":1717067441,"level":"INFO","function":"update_slots","line":1794,"msg":"slot released","id_slot":0,"id_task":46,"n_ctx":1536,"n_past":400,"n_system_tokens":969,"n_cache_tokens":400,"truncated":false}
{"tid":"21380","timestamp":1717067441,"level":"INFO","function":"log_server_request","line":2894,"msg":"request","remote_addr":"127.0.0.1","remote_port":5829,"status":200,"method":"POST","path":"/completion","params":{}}
{"tid":"14964","timestamp":1717067441,"level":"INFO","function":"update_slots","line":1812,"msg":"all slots are idle"}
{"tid":"14964","timestamp":1717067441,"level":"INFO","function":"update_slots","line":1812,"msg":"all slots are idle"}
{"tid":"14964","timestamp":1717067454,"level":"INFO","function":"launch_slot_with_task","line":1046,"msg":"slot is processing task","id_slot":0,"id_task":96}
{"tid":"14964","timestamp":1717067454,"level":"INFO","function":"update_slots","line":2095,"msg":"kv cache rm [p0, end)","id_slot":0,"id_task":96,"p0":1366}
{"tid":"14964","timestamp":1717067468,"level":"INFO","function":"print_timings","line":321,"msg":"prompt eval time     =    1802.17 ms /    18 tokens (  100.12 ms per token,     9.99 tokens per second)","id_slot":0,"id_task":96,"t_prompt_processing":1802.172,"n_prompt_tokens_processed":18,"t_token":100.12066666666666,"n_tokens_second":9.98794787622935}
{"tid":"14964","timestamp":1717067468,"level":"INFO","function":"print_timings","line":337,"msg":"generation eval time =   12310.86 ms /    85 runs   (  144.83 ms per token,     6.90 tokens per second)","id_slot":0,"id_task":96,"t_token_generation":12310.861,"n_decoded":85,"t_token":144.83365882352942,"n_tokens_second":6.904472400427557}
{"tid":"14964","timestamp":1717067468,"level":"INFO","function":"print_timings","line":347,"msg":"          total time =   14113.03 ms","id_slot":0,"id_task":96,"t_prompt_processing":1802.172,"t_token_generation":12310.861,"t_total":14113.033000000001}
{"tid":"14964","timestamp":1717067468,"level":"INFO","function":"update_slots","line":1794,"msg":"slot released","id_slot":0,"id_task":96,"n_ctx":1536,"n_past":499,"n_system_tokens":969,"n_cache_tokens":499,"truncated":false}
{"tid":"19820","timestamp":1717067468,"level":"INFO","function":"log_server_request","line":2894,"msg":"request","remote_addr":"127.0.0.1","remote_port":5832,"status":200,"method":"POST","path":"/completion","params":{}}
{"tid":"14964","timestamp":1717067468,"level":"INFO","function":"update_slots","line":1812,"msg":"all slots are idle"}
{"tid":"14964","timestamp":1717067468,"level":"INFO","function":"update_slots","line":1812,"msg":"all slots are idle"}
{"tid":"14964","timestamp":1717067513,"level":"INFO","function":"launch_slot_with_task","line":1046,"msg":"slot is processing task","id_slot":0,"id_task":183}
{"tid":"14964","timestamp":1717067513,"level":"INFO","function":"update_slots","line":2095,"msg":"kv cache rm [p0, end)","id_slot":0,"id_task":183,"p0":1465}
{"tid":"14964","timestamp":1717067523,"level":"INFO","function":"update_slots","line":1851,"msg":"slot context shift","id_slot":0,"id_task":183,"n_keep":1,"n_left":1534,"n_discard":767,"n_ctx":1536,"n_past":566,"n_system_tokens":969,"n_cache_tokens":566}
libc++abi: terminating due to uncaught exception of type std::length_error: vector

0wwafa commented 4 months ago

I first thought the JSON objects the assistant outputs might be confusing the server's parser, but after more testing I noticed that it also crashes mid-conversation after a while. (I'm running on CPU only, and the CPU is stable.)
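
For what it's worth, the last log line before the exception is the "slot context shift", and its numbers hint at one possible cause. This is only a sketch of a hypothesis, not something confirmed in this thread. It assumes that n_left is computed as n_system_tokens + n_past - n_keep and that n_discard is n_left / 2, which matches the logged values (969 + 566 - 1 = 1534, and 1534 / 2 = 767). It further assumes the cached token list is then shrunk by n_discard using unsigned arithmetic. The struct and helper names below are hypothetical and are not llama.cpp API.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Values taken from the final "slot context shift" log line.
struct ShiftLog {
    int n_keep;          // 1
    int n_system_tokens; // 969
    int n_past;          // 566
    int n_cache_tokens;  // 566
};

// Assumed formulas; they reproduce the logged n_left and n_discard.
int n_left(const ShiftLog & s)    { return s.n_system_tokens + s.n_past - s.n_keep; }
int n_discard(const ShiftLog & s) { return n_left(s) / 2; }

// Hypothetical reproduction: shrink a token cache by n_discard entries.
// Returns true if the resize throws std::length_error.
bool shrink_cache_throws(const ShiftLog & s) {
    std::vector<int> cache_tokens(static_cast<std::size_t>(s.n_cache_tokens), 0);

    // Unsigned subtraction: if n_discard > size(), the result wraps around
    // to a value close to SIZE_MAX instead of going negative.
    const std::size_t new_size =
        cache_tokens.size() - static_cast<std::size_t>(n_discard(s));

    try {
        cache_tokens.resize(new_size); // exceeds max_size() after a wrap
    } catch (const std::length_error &) {
        return true;
    }
    return false;
}
```

With the logged values (ShiftLog{1, 969, 566, 566}), n_discard is 767 while only 566 tokens are cached, so the subtraction wraps and the resize throws std::length_error, the same exception type as in the log. If the system tokens are counted in n_left but are not part of the cached token list, that would explain why the crash only appears once the conversation gets long enough to trigger a shift, regardless of whether the output is JSON.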

github-actions[bot] commented 2 months ago

This issue was closed because it has been inactive for 14 days since being marked as stale.