mudler / LocalAI

:robot: The free, Open Source alternative to OpenAI, Claude and others. Self-hosted and local-first. Drop-in replacement for OpenAI, running on consumer-grade hardware. No GPU required. Runs gguf, transformers, diffusers and many more models architectures. Features: Generate Text, Audio, Video, Images, Voice Cloning, Distributed inference
https://localai.io
MIT License
22.71k stars 1.73k forks source link

Inference crashes when context is bigger than ~500 Tokens #868

Closed emakkus closed 1 year ago

emakkus commented 1 year ago

LocalAI version:

1.23.2-cuda-12

Environment, CPU architecture, OS, and Version:

Linux lxdocker 6.2.16-4-pve #1 SMP PREEMPT_DYNAMIC PVE 6.2.16-5 (2023-07-14T17:53Z) x86_64 x86_64 x86_64 GNU/Linux Proxmox lxc with Docker running inside it.

Describe the bug Even though the model is configured to accept a context size of well beyond 512 Tokens, the inference still crashes if the prompt exceeds the limit of about 500 Tokens.

This is a problem for Chat conversations, since this limit is reached quite fast when older messages are added to the prompt for conversation context awareness of the AI.

To Reproduce

Simply try to call the chat/completion endpoint of a llama2 model with a prompt that has more than about 500 Tokens.

In my case, I can reproduce the Error with the following cURL:

curl --location 'http://lxdocker:8080/v1/chat/completions' \
--header 'Content-Type: application/json' \
--data '{
     "model": "newhope-llama2",
     "messages": [
         {
             "role": "user",
             "content": "Please write me a C# hello world app that can output the users name."
         },
         {
             "role": "assistant",
             "content": "The first code snippet is a C# class that uses several namespaces and defines an interface for interacting with the database. The `UserService` class has a constructor that takes an instance of `ApplicationDbContext`, which is used to interact with the database. It also contains two methods, `GetAllUsers` and `GetUserById`. The `GetAllUsers` method returns a list of all users in the database asynchronously using `_db.Users.ToListAsync()`. This means it will retrieve all users from the database and return them as a list. The `GetUserById` method retrieves a user by their ID, also asynchronously, using `_db.Users.FindAsync(id)`. It returns the first user found with that id or null if no such user exists. The second code snippet is an interface for interacting with users in the database. The `IUserService` interface has two methods: `GetAllUsers`, which should return a list of all users asynchronously, and `GetUser`, which should return a specific user by their ID also asynchronously. The `GetAllUsers` method is expected to retrieve all users from the database and return them in a list. The `GetUser` method is supposed to find a user with the given id and return it. These code snippets are used for creating services that interact with a database, specifically for retrieving data. They'\''re designed to be used in an application where you have a model (in this case, `User`) that you want to retrieve from your database. The interface allows you to define what the service should do without knowing how it does it, and the class implements that interface with specifics of how to interact with the database."
         },
         {
             "role": "user",
             "content": "Okay. Can you write me in python an endpoint that returns a poller?"
         },
         {
             "role": "assistant",
             "content": "In this example, `IUserService` is an interface that has two methods: `get_all_users` and `get_user`. The `get_all_users` method should return a list of all users asynchronously, and the `get_user` method should return a specific user by their ID also asynchronously. The `User` model is used to interact with the database. It has an id column which is the primary key and a name column. The `get_all_users` endpoint returns all users in the database asynchronously using session.query(User).all(). This code creates an `IUserService` interface with two methods: `GetAllUsers` and `GetUser`. Also how"
         },
         {
             "role": "user",
             "content": "Please write me a C# method that uses python classes"
         }
     ],
     "temperature": 0.7,
     "stream": false
}'

Using this hacked together curl will result in the error being thrown. Removing just one word however will make this cURL work.

Expected behavior

The limit should be the one set in the model configuration. If I set the context size to 4096, I would expect to be able to use bigger prompts

Logs

12:09PM DBG Request received: 
12:09PM DBG Configuration read: &{PredictionOptions:{Model:newhope.ggmlv3.q5_K_S.bin Language: N:0 TopP:0.7 TopK:80 Temperature:0.7 Maxtokens:0 Echo:false Batch:4096 F16:true IgnoreEOS:false RepeatPenalty:0 Keep:20 MirostatETA:0 MirostatTAU:0 Mirostat:0 FrequencyPenalty:0 TFZ:0 TypicalP:0 Seed:0 NegativePrompt: RopeFreqBase:10000 RopeFreqScale:0.5 NegativePromptScale:0} Name:newhope-llama2 StopWords:[] Cutstrings:[] TrimSpace:[] ContextSize:4096 F16:true NUMA:false Threads:8 Debug:true Roles:map[] Embeddings:false Backend: TemplateConfig:{Chat:chat ChatMessage: Completion:completion Edit: Functions:} MirostatETA:0 MirostatTAU:0 Mirostat:0 NGPULayers:70 MMap:true MMlock:false LowVRAM:false TensorSplit: MainGPU: ImageGenerationAssets: PromptCachePath: PromptCacheAll:false PromptCacheRO:false Grammar: PromptStrings:[] InputStrings:[] InputToken:[] functionCallString: functionCallNameString: FunctionsConfig:{DisableNoAction:false NoActionFunctionName: NoActionDescriptionName:} SystemPrompt: RMSNormEps:0 NGQA:0}
12:09PM DBG Parameters: &{PredictionOptions:{Model:newhope.ggmlv3.q5_K_S.bin Language: N:0 TopP:0.7 TopK:80 Temperature:0.7 Maxtokens:0 Echo:false Batch:4096 F16:true IgnoreEOS:false RepeatPenalty:0 Keep:20 MirostatETA:0 MirostatTAU:0 Mirostat:0 FrequencyPenalty:0 TFZ:0 TypicalP:0 Seed:0 NegativePrompt: RopeFreqBase:10000 RopeFreqScale:0.5 NegativePromptScale:0} Name:newhope-llama2 StopWords:[] Cutstrings:[] TrimSpace:[] ContextSize:4096 F16:true NUMA:false Threads:8 Debug:true Roles:map[] Embeddings:false Backend: TemplateConfig:{Chat:chat ChatMessage: Completion:completion Edit: Functions:} MirostatETA:0 MirostatTAU:0 Mirostat:0 NGPULayers:70 MMap:true MMlock:false LowVRAM:false TensorSplit: MainGPU: ImageGenerationAssets: PromptCachePath: PromptCacheAll:false PromptCacheRO:false Grammar: PromptStrings:[] InputStrings:[] InputToken:[] functionCallString: functionCallNameString: FunctionsConfig:{DisableNoAction:false NoActionFunctionName: NoActionDescriptionName:} SystemPrompt: RMSNormEps:0 NGQA:0}
12:09PM DBG Prompt (before templating): Please write me a C# hello world app that can output the users name.
The first code snippet is a C# class that uses several namespaces and defines an interface for interacting with the database. The `UserService` class has a constructor that takes an instance of `ApplicationDbContext`, which is used to interact with the database. It also contains two methods, `GetAllUsers` and `GetUserById`. The `GetAllUsers` method returns a list of all users in the database asynchronously using `_db.Users.ToListAsync()`. This means it will retrieve all users from the database and return them as a list. The `GetUserById` method retrieves a user by their ID, also asynchronously, using `_db.Users.FindAsync(id)`. It returns the first user found with that id or null if no such user exists. The second code snippet is an interface for interacting with users in the database. The `IUserService` interface has two methods: `GetAllUsers`, which should return a list of all users asynchronously, and `GetUser`, which should return a specific user by their ID also asynchronously. The `GetAllUsers` method is expected to retrieve all users from the database and return them in a list. The `GetUser` method is supposed to find a user with the given id and return it. These code snippets are used for creating services that interact with a database, specifically for retrieving data. They're designed to be used in an application where you have a model (in this case, `User`) that you want to retrieve from your database. The interface allows you to define what the service should do without knowing how it does it, and the class implements that interface with specifics of how to interact with the database.
Okay. Can you write me in python an endpoint that returns a poller?
In this example, `IUserService` is an interface that has two methods: `get_all_users` and `get_user`. The `get_all_users` method should return a list of all users asynchronously, and the `get_user` method should return a specific user by their ID also asynchronously. The `User` model is used to interact with the database. It has an id column which is the primary key and a name column. The `get_all_users` endpoint returns all users in the database asynchronously using session.query(User).all(). This code creates an `IUserService` interface with two methods: `GetAllUsers` and `GetUser`. Also how
Please write me a C# method that uses python classes
12:09PM DBG Template found, input modified to: Below is an instruction that describes a task. Write a response that appropriately completes the request.
### Instruction:
Please write me a C# hello world app that can output the users name.
The first code snippet is a C# class that uses several namespaces and defines an interface for interacting with the database. The `UserService` class has a constructor that takes an instance of `ApplicationDbContext`, which is used to interact with the database. It also contains two methods, `GetAllUsers` and `GetUserById`. The `GetAllUsers` method returns a list of all users in the database asynchronously using `_db.Users.ToListAsync()`. This means it will retrieve all users from the database and return them as a list. The `GetUserById` method retrieves a user by their ID, also asynchronously, using `_db.Users.FindAsync(id)`. It returns the first user found with that id or null if no such user exists. The second code snippet is an interface for interacting with users in the database. The `IUserService` interface has two methods: `GetAllUsers`, which should return a list of all users asynchronously, and `GetUser`, which should return a specific user by their ID also asynchronously. The `GetAllUsers` method is expected to retrieve all users from the database and return them in a list. The `GetUser` method is supposed to find a user with the given id and return it. These code snippets are used for creating services that interact with a database, specifically for retrieving data. They're designed to be used in an application where you have a model (in this case, `User`) that you want to retrieve from your database. The interface allows you to define what the service should do without knowing how it does it, and the class implements that interface with specifics of how to interact with the database.
Okay. Can you write me in python an endpoint that returns a poller?
In this example, `IUserService` is an interface that has two methods: `get_all_users` and `get_user`. The `get_all_users` method should return a list of all users asynchronously, and the `get_user` method should return a specific user by their ID also asynchronously. The `User` model is used to interact with the database. It has an id column which is the primary key and a name column. The `get_all_users` endpoint returns all users in the database asynchronously using session.query(User).all(). This code creates an `IUserService` interface with two methods: `GetAllUsers` and `GetUser`. Also how
Please write me a C# method that uses python classes
### Response:
12:09PM DBG Prompt (after templating): Below is an instruction that describes a task. Write a response that appropriately completes the request.
### Instruction:
Please write me a C# hello world app that can output the users name.
The first code snippet is a C# class that uses several namespaces and defines an interface for interacting with the database. The `UserService` class has a constructor that takes an instance of `ApplicationDbContext`, which is used to interact with the database. It also contains two methods, `GetAllUsers` and `GetUserById`. The `GetAllUsers` method returns a list of all users in the database asynchronously using `_db.Users.ToListAsync()`. This means it will retrieve all users from the database and return them as a list. The `GetUserById` method retrieves a user by their ID, also asynchronously, using `_db.Users.FindAsync(id)`. It returns the first user found with that id or null if no such user exists. The second code snippet is an interface for interacting with users in the database. The `IUserService` interface has two methods: `GetAllUsers`, which should return a list of all users asynchronously, and `GetUser`, which should return a specific user by their ID also asynchronously. The `GetAllUsers` method is expected to retrieve all users from the database and return them in a list. The `GetUser` method is supposed to find a user with the given id and return it. These code snippets are used for creating services that interact with a database, specifically for retrieving data. They're designed to be used in an application where you have a model (in this case, `User`) that you want to retrieve from your database. The interface allows you to define what the service should do without knowing how it does it, and the class implements that interface with specifics of how to interact with the database.
Okay. Can you write me in python an endpoint that returns a poller?
In this example, `IUserService` is an interface that has two methods: `get_all_users` and `get_user`. The `get_all_users` method should return a list of all users asynchronously, and the `get_user` method should return a specific user by their ID also asynchronously. The `User` model is used to interact with the database. It has an id column which is the primary key and a name column. The `get_all_users` endpoint returns all users in the database asynchronously using session.query(User).all(). This code creates an `IUserService` interface with two methods: `GetAllUsers` and `GetUser`. Also how
Please write me a C# method that uses python classes
### Response:
12:09PM DBG Loading model 'newhope.ggmlv3.q5_K_S.bin' greedly from all the available backends: llama, gpt4all, falcon, gptneox, bert-embeddings, falcon-ggml, gptj, gpt2, dolly, mpt, replit, starcoder, bloomz, rwkv, whisper, stablediffusion, piper, /build/extra/grpc/huggingface/huggingface.py
12:09PM DBG [llama] Attempting to load
12:09PM DBG Loading model llama from newhope.ggmlv3.q5_K_S.bin
12:09PM DBG Loading model in memory from file: /models/newhope.ggmlv3.q5_K_S.bin
12:09PM DBG Loading GRPC Model llama: {backendString:llama modelFile:newhope.ggmlv3.q5_K_S.bin threads:8 assetDir:/tmp/localai/backend_data context:0xc0000420e0 gRPCOptions:0xc00093e140 externalBackends:map[huggingface-embeddings:/build/extra/grpc/huggingface/huggingface.py]}
12:09PM DBG Loading GRPC Process: /tmp/localai/backend_data/backend-assets/grpc/llama
12:09PM DBG GRPC Service for newhope.ggmlv3.q5_K_S.bin will be running at: '127.0.0.1:38417'
12:09PM DBG GRPC Service state dir: /tmp/go-processmanager2793448083
12:09PM DBG GRPC Service Started
rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing: dial tcp 127.0.0.1:38417: connect: connection refused"
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr 2023/08/07 12:09:18 gRPC Server listening at 127.0.0.1:38417
12:09PM DBG GRPC Service Ready
12:09PM DBG GRPC: Loading model with options: {state:{NoUnkeyedLiterals:{} DoNotCompare:[] DoNotCopy:[] atomicMessageInfo:<nil>} sizeCache:0 unknownFields:[] Model:/models/newhope.ggmlv3.q5_K_S.bin ContextSize:4096 Seed:0 NBatch:4096 F16Memory:true MLock:false MMap:true VocabOnly:false LowVRAM:false Embeddings:false NUMA:false NGPULayers:70 MainGPU: TensorSplit: Threads:8 LibrarySearchPath: RopeFreqBase:10000 RopeFreqScale:0.5 RMSNormEps:0 NGQA:0}
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr create_gpt_params: loading model /models/newhope.ggmlv3.q5_K_S.bin
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr ggml_init_cublas: found 1 CUDA devices:
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr   Device 0: NVIDIA GeForce RTX 3090, compute capability 8.6
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr llama.cpp: loading model from /models/newhope.ggmlv3.q5_K_S.bin
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr llama_model_load_internal: format     = ggjt v3 (latest)
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr llama_model_load_internal: n_vocab    = 32001
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr llama_model_load_internal: n_ctx      = 4096
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr llama_model_load_internal: n_embd     = 5120
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr llama_model_load_internal: n_mult     = 6912
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr llama_model_load_internal: n_head     = 40
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr llama_model_load_internal: n_head_kv  = 40
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr llama_model_load_internal: n_layer    = 40
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr llama_model_load_internal: n_rot      = 128
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr llama_model_load_internal: n_gqa      = 1
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr llama_model_load_internal: rnorm_eps  = 5.0e-06
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr llama_model_load_internal: n_ff       = 13824
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr llama_model_load_internal: freq_base  = 10000.0
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr llama_model_load_internal: freq_scale = 0.5
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr llama_model_load_internal: ftype      = 16 (mostly Q5_K - Small)
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr llama_model_load_internal: model size = 13B
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr llama_model_load_internal: ggml ctx size =    0.11 MB
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr llama_model_load_internal: using CUDA for GPU acceleration
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr llama_model_load_internal: mem required  =  753.00 MB (+ 3200.00 MB per state)
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr llama_model_load_internal: allocating batch_size x (640 kB + n_ctx x 160 B) = 5120 MB VRAM for the scratch buffer
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr llama_model_load_internal: offloading 40 repeating layers to GPU
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr llama_model_load_internal: offloading non-repeating layers to GPU
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr llama_model_load_internal: offloading v cache to GPU
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr llama_model_load_internal: offloading k cache to GPU
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr llama_model_load_internal: offloaded 43/43 layers to GPU
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr llama_model_load_internal: total VRAM used: 16953 MB
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr llama_new_context_with_model: kv self size  = 3200.00 MB
12:09PM DBG [llama] Loads OK
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr llama_predict: warning: scaling RoPE frequency by 0.5 (default 1.0)
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr fatal error: unexpected signal during runtime execution
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr [signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x86c237]
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr 
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr runtime stack:
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr runtime.throw({0x9b105e?, 0x7f4acee27a60?})
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr     /usr/local/go/src/runtime/panic.go:1047 +0x5d fp=0x7f50731ddf00 sp=0x7f50731dded0 pc=0x455dbd
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr runtime.sigpanic()
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr     /usr/local/go/src/runtime/signal_unix.go:825 +0x3e9 fp=0x7f50731ddf60 sp=0x7f50731ddf00 pc=0x46c269
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr 
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr goroutine 20 [syscall]:
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr runtime.cgocall(0x8194e0, 0xc0000a5668)
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr     /usr/local/go/src/runtime/cgocall.go:157 +0x5c fp=0xc0000a5640 sp=0xc0000a5608 pc=0x424edc
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr github.com/go-skynet/go-llama%2ecpp._Cfunc_llama_predict(0x7f5064001610, 0x7f50540010e0, 0xc000310000, 0x1)
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr     _cgo_gotypes.go:238 +0x4c fp=0xc0000a5668 sp=0xc0000a5640 pc=0x811c6c
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr github.com/go-skynet/go-llama%2ecpp.(*LLama).Predict.func2(0xc000250000?, 0xc0000a5860?, {0xc000310000, 0x0?, 0x42e9e7?}, 0x10?)
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr     /build/go-llama/llama.go:233 +0x94 fp=0xc0000a56b8 sp=0xc0000a5668 pc=0x814bf4
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr github.com/go-skynet/go-llama%2ecpp.(*LLama).Predict(0xc000136618, {0xc000250000, 0x9f8}, {0xc000252000, 0x16, 0x0?})
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr     /build/go-llama/llama.go:233 +0x2a8 fp=0xc0000a5980 sp=0xc0000a56b8 pc=0x814888
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr github.com/go-skynet/LocalAI/pkg/grpc/llm/llama.(*LLM).Predict(0xc0000142a0, 0xc000224000)
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr     /build/pkg/grpc/llm/llama/llama.go:170 +0x52 fp=0xc0000a59c0 sp=0xc0000a5980 pc=0x817b52
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr github.com/go-skynet/LocalAI/pkg/grpc.(*server).Predict(0x991920?, {0xc000224000?, 0x5dc306?}, 0x0?)
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr     /build/pkg/grpc/server.go:50 +0x28 fp=0xc0000a5a10 sp=0xc0000a59c0 pc=0x818568
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr github.com/go-skynet/LocalAI/pkg/grpc/proto._Backend_Predict_Handler({0x965360?, 0xc000073d20}, {0xa490f0, 0xc000188120}, 0xc000222000, 0x0)
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr     /build/pkg/grpc/proto/backend_grpc.pb.go:218 +0x170 fp=0xc0000a5a68 sp=0xc0000a5a10 pc=0x80edf0
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr google.golang.org/grpc.(*Server).processUnaryRPC(0xc0001b21e0, {0xa4bd78, 0xc0000f7380}, 0xc000138360, 0xc0001baa50, 0x1087a18, 0x0)
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr     /go/pkg/mod/google.golang.org/grpc@v1.57.0/server.go:1360 +0xe23 fp=0xc0000a5e48 sp=0xc0000a5a68 pc=0x7f7cc3
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr google.golang.org/grpc.(*Server).handleStream(0xc0001b21e0, {0xa4bd78, 0xc0000f7380}, 0xc000138360, 0x0)
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr     /go/pkg/mod/google.golang.org/grpc@v1.57.0/server.go:1737 +0xa36 fp=0xc0000a5f68 sp=0xc0000a5e48 pc=0x7fce16
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr google.golang.org/grpc.(*Server).serveStreams.func1.1()
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr     /go/pkg/mod/google.golang.org/grpc@v1.57.0/server.go:982 +0x98 fp=0xc0000a5fe0 sp=0xc0000a5f68 pc=0x7f5698
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr runtime.goexit()
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr     /usr/local/go/src/runtime/asm_amd64.s:1598 +0x1 fp=0xc0000a5fe8 sp=0xc0000a5fe0 pc=0x4879a1
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr created by google.golang.org/grpc.(*Server).serveStreams.func1
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr     /go/pkg/mod/google.golang.org/grpc@v1.57.0/server.go:980 +0x18c
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr 
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr goroutine 1 [IO wait]:
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr     /usr/local/go/src/runtime/proc.go:381 +0xd6 fp=0xc00628fb68 sp=0xc00628fb48 pc=0x458b16
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr runtime.netpollblock(0x7f50c609b788?, 0x42456f?, 0x0?)
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr     /usr/local/go/src/runtime/netpoll.go:527 +0xf7 fp=0xc00628fba0 sp=0xc00628fb68 pc=0x451457
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr internal/poll.runtime_pollWait(0x7f509d084ef8, 0x72)
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr     /usr/local/go/src/runtime/netpoll.go:306 +0x89 fp=0xc00628fbc0 sp=0xc00628fba0 pc=0x482549
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr internal/poll.(*pollDesc).wait(0xc0000fc280?, 0x4?, 0x0)
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr     /usr/local/go/src/internal/poll/fd_poll_runtime.go:84 +0x32 fp=0xc00628fbe8 sp=0xc00628fbc0 pc=0x4f0852
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr internal/poll.(*pollDesc).waitRead(...)
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr     /usr/local/go/src/internal/poll/fd_poll_runtime.go:89
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr internal/poll.(*FD).Accept(0xc0000fc280)
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr     /usr/local/go/src/internal/poll/fd_unix.go:614 +0x2bd fp=0xc00628fc90 sp=0xc00628fbe8 pc=0x4f615d
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr net.(*netFD).accept(0xc0000fc280)
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr     /usr/local/go/src/net/fd_unix.go:172 +0x35 fp=0xc00628fd48 sp=0xc00628fc90 pc=0x607655
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr net.(*TCPListener).accept(0xc000012630)
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr     /usr/local/go/src/net/tcpsock_posix.go:148 +0x25 fp=0xc00628fd70 sp=0xc00628fd48 pc=0x61fec5
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr net.(*TCPListener).Accept(0xc000012630)
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr     /usr/local/go/src/net/tcpsock.go:297 +0x3d fp=0xc00628fda0 sp=0xc00628fd70 pc=0x61efbd
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr google.golang.org/grpc.(*Server).Serve(0xc0001b21e0, {0xa48980?, 0xc000012630})
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr     /go/pkg/mod/google.golang.org/grpc@v1.57.0/server.go:844 +0x475 fp=0xc00628fee8 sp=0xc00628fda0 pc=0x7f42b5
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr github.com/go-skynet/LocalAI/pkg/grpc.StartServer({0x7fff6d8dd5c1?, 0xc000024190?}, {0xa4b2f0?, 0xc0000142a0})
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr     /build/pkg/grpc/server.go:121 +0x125 fp=0xc00628ff50 sp=0xc00628fee8 pc=0x819025
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr main.main()
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr     /build/cmd/grpc/llama/main.go:22 +0x85 fp=0xc00628ff80 sp=0xc00628ff50 pc=0x819185
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr runtime.main()
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr     /usr/local/go/src/runtime/proc.go:250 +0x207 fp=0xc00628ffe0 sp=0xc00628ff80 pc=0x4586e7
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr runtime.goexit()
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr     /usr/local/go/src/runtime/asm_amd64.s:1598 +0x1 fp=0xc00628ffe8 sp=0xc00628ffe0 pc=0x4879a1
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr 
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr goroutine 2 [force gc (idle)]:
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr     /usr/local/go/src/runtime/proc.go:381 +0xd6 fp=0xc00005cfb0 sp=0xc00005cf90 pc=0x458b16
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr runtime.goparkunlock(...)
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr     /usr/local/go/src/runtime/proc.go:387
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr runtime.forcegchelper()
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr     /usr/local/go/src/runtime/proc.go:305 +0xb0 fp=0xc00005cfe0 sp=0xc00005cfb0 pc=0x458950
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr runtime.goexit()
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr     /usr/local/go/src/runtime/asm_amd64.s:1598 +0x1 fp=0xc00005cfe8 sp=0xc00005cfe0 pc=0x4879a1
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr created by runtime.init.6
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr     /usr/local/go/src/runtime/proc.go:293 +0x25
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr 
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr goroutine 3 [GC sweep wait]:
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr runtime.gopark(0x1?, 0x0?, 0x0?, 0x0?, 0x0?)
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr     /usr/local/go/src/runtime/proc.go:381 +0xd6 fp=0xc00005d780 sp=0xc00005d760 pc=0x458b16
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr runtime.goparkunlock(...)
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr     /usr/local/go/src/runtime/proc.go:387
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr runtime.bgsweep(0x0?)
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr     /usr/local/go/src/runtime/mgcsweep.go:319 +0xde fp=0xc00005d7c8 sp=0xc00005d780 pc=0x444d5e
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr runtime.gcenable.func1()
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr     /usr/local/go/src/runtime/mgc.go:178 +0x26 fp=0xc00005d7e0 sp=0xc00005d7c8 pc=0x439fc6
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr runtime.goexit()
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr     /usr/local/go/src/runtime/asm_amd64.s:1598 +0x1 fp=0xc00005d7e8 sp=0xc00005d7e0 pc=0x4879a1
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr created by runtime.gcenable
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr     /usr/local/go/src/runtime/mgc.go:178 +0x6b
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr 
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr goroutine 4 [GC scavenge wait]:
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr runtime.gopark(0xc00007e000?, 0xa41b68?, 0x0?, 0x0?, 0x0?)
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr     /usr/local/go/src/runtime/proc.go:381 +0xd6 fp=0xc00005df70 sp=0xc00005df50 pc=0x458b16
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr runtime.goparkunlock(...)
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr     /usr/local/go/src/runtime/proc.go:387
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr runtime.(*scavengerState).park(0x10d3c00)
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr     /usr/local/go/src/runtime/mgcscavenge.go:400 +0x53 fp=0xc00005dfa0 sp=0xc00005df70 pc=0x442c33
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr runtime.bgscavenge(0x0?)
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr     /usr/local/go/src/runtime/mgcscavenge.go:633 +0x65 fp=0xc00005dfc8 sp=0xc00005dfa0 pc=0x443225
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr runtime.gcenable.func2()
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr     /usr/local/go/src/runtime/mgc.go:179 +0x26 fp=0xc00005dfe0 sp=0xc00005dfc8 pc=0x439f66
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr runtime.goexit()
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr     /usr/local/go/src/runtime/asm_amd64.s:1598 +0x1 fp=0xc00005dfe8 sp=0xc00005dfe0 pc=0x4879a1
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr created by runtime.gcenable
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr     /usr/local/go/src/runtime/mgc.go:179 +0xaa
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr 
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr goroutine 5 [finalizer wait]:
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr runtime.gopark(0x1a0?, 0x10d4120?, 0x60?, 0x78?, 0xc00005c770?)
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr     /usr/local/go/src/runtime/proc.go:381 +0xd6 fp=0xc00005c628 sp=0xc00005c608 pc=0x458b16
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr runtime.runfinq()
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr     /usr/local/go/src/runtime/mfinal.go:193 +0x107 fp=0xc00005c7e0 sp=0xc00005c628 pc=0x439007
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr runtime.goexit()
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr     /usr/local/go/src/runtime/asm_amd64.s:1598 +0x1 fp=0xc00005c7e8 sp=0xc00005c7e0 pc=0x4879a1
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr created by runtime.createfing
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr     /usr/local/go/src/runtime/mfinal.go:163 +0x45
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr 
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr goroutine 21 [GC worker (idle)]:
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr runtime.gopark(0x789200?, 0xc000092730?, 0x80?, 0x27?, 0xc0000927d0?)
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr     /usr/local/go/src/runtime/proc.go:381 +0xd6 fp=0xc0001f2750 sp=0xc0001f2730 pc=0x458b16
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr runtime.gcBgMarkWorker()
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr     /usr/local/go/src/runtime/mgc.go:1275 +0xf1 fp=0xc0001f27e0 sp=0xc0001f2750 pc=0x43bd31
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr runtime.goexit()
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr     /usr/local/go/src/runtime/asm_amd64.s:1598 +0x1 fp=0xc0001f27e8 sp=0xc0001f27e0 pc=0x4879a1
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr created by runtime.gcBgMarkStartWorkers
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr     /usr/local/go/src/runtime/mgc.go:1199 +0x25
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr 
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr goroutine 23 [GC worker (idle)]:
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr     /usr/local/go/src/runtime/proc.go:381 +0xd6 fp=0xc000058f50 sp=0xc000058f30 pc=0x458b16
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr runtime.gcBgMarkWorker()
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr     /usr/local/go/src/runtime/mgc.go:1275 +0xf1 fp=0xc000058fe0 sp=0xc000058f50 pc=0x43bd31
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr runtime.goexit()
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr     /usr/local/go/src/runtime/asm_amd64.s:1598 +0x1 fp=0xc000058fe8 sp=0xc000058fe0 pc=0x4879a1
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr created by runtime.gcBgMarkStartWorkers
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr     /usr/local/go/src/runtime/mgc.go:1199 +0x25
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr 
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr goroutine 50 [GC worker (idle)]:
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr runtime.gopark(0x996d05b128e3?, 0xa455e0?, 0xd0?, 0x3f?, 0xc00005f740?)
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr     /usr/local/go/src/runtime/proc.go:381 +0xd6 fp=0xc00005f750 sp=0xc00005f730 pc=0x458b16
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr runtime.gcBgMarkWorker()
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr     /usr/local/go/src/runtime/mgc.go:1275 +0xf1 fp=0xc00005f7e0 sp=0xc00005f750 pc=0x43bd31
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr runtime.goexit()
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr     /usr/local/go/src/runtime/asm_amd64.s:1598 +0x1 fp=0xc00005f7e8 sp=0xc00005f7e0 pc=0x4879a1
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr created by runtime.gcBgMarkStartWorkers
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr     /usr/local/go/src/runtime/mgc.go:1199 +0x25
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr 
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr goroutine 24 [GC worker (idle)]:
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr     /usr/local/go/src/runtime/proc.go:381 +0xd6 fp=0xc000059750 sp=0xc000059730 pc=0x458b16
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr runtime.gcBgMarkWorker()
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr     /usr/local/go/src/runtime/mgc.go:1275 +0xf1 fp=0xc0000597e0 sp=0xc000059750 pc=0x43bd31
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr runtime.goexit()
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr     /usr/local/go/src/runtime/asm_amd64.s:1598 +0x1 fp=0xc0000597e8 sp=0xc0000597e0 pc=0x4879a1
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr created by runtime.gcBgMarkStartWorkers
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr     /usr/local/go/src/runtime/mgc.go:1199 +0x25
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr 
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr goroutine 22 [GC worker (idle)]:
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr runtime.gopark(0x0?, 0xc00005efd0?, 0x98?, 0x56?, 0xc0001b21e0?)
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr     /usr/local/go/src/runtime/proc.go:381 +0xd6 fp=0xc00005ef50 sp=0xc00005ef30 pc=0x458b16
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr runtime.gcBgMarkWorker()
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr     /usr/local/go/src/runtime/mgc.go:1275 +0xf1 fp=0xc00005efe0 sp=0xc00005ef50 pc=0x43bd31
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr runtime.goexit()
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr     /usr/local/go/src/runtime/asm_amd64.s:1598 +0x1 fp=0xc00005efe8 sp=0xc00005efe0 pc=0x4879a1
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr created by runtime.gcBgMarkStartWorkers
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr     /usr/local/go/src/runtime/mgc.go:1199 +0x25
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr 
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr goroutine 14 [select]:
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr runtime.gopark(0xc000299f00?, 0x2?, 0x3?, 0x40?, 0xc000299ed4?)
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr     /usr/local/go/src/runtime/proc.go:381 +0xd6 fp=0xc000187d60 sp=0xc000187d40 pc=0x458b16
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr runtime.selectgo(0xc000187f00, 0xc000299ed0, 0x62a3e9?, 0x0, 0xc000300000?, 0x1)
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr     /usr/local/go/src/runtime/select.go:327 +0x7be fp=0xc000187ea0 sp=0xc000187d60 pc=0x4686fe
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr google.golang.org/grpc/internal/transport.(*controlBuffer).get(0xc000092870, 0x1)
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr     /go/pkg/mod/google.golang.org/grpc@v1.57.0/internal/transport/controlbuf.go:418 +0x115 fp=0xc000187f30 sp=0xc000187ea0 pc=0x769475
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr google.golang.org/grpc/internal/transport.(*loopyWriter).run(0xc00013c540)
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr     /go/pkg/mod/google.golang.org/grpc@v1.57.0/internal/transport/controlbuf.go:552 +0x91 fp=0xc000187f90 sp=0xc000187f30 pc=0x769bf1
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr google.golang.org/grpc/internal/transport.NewServerTransport.func2()
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr     /go/pkg/mod/google.golang.org/grpc@v1.57.0/internal/transport/http2_server.go:341 +0xda fp=0xc000187fe0 sp=0xc000187f90 pc=0x7815da
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr runtime.goexit()
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr     /usr/local/go/src/runtime/asm_amd64.s:1598 +0x1 fp=0xc000187fe8 sp=0xc000187fe0 pc=0x4879a1
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr created by google.golang.org/grpc/internal/transport.NewServerTransport
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr     /go/pkg/mod/google.golang.org/grpc@v1.57.0/internal/transport/http2_server.go:338 +0x1bb3
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr 
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr goroutine 15 [select]:
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr runtime.gopark(0xc00005e770?, 0x4?, 0x9?, 0x0?, 0xc00005e6c0?)
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr     /usr/local/go/src/runtime/proc.go:381 +0xd6 fp=0xc00005e508 sp=0xc00005e4e8 pc=0x458b16
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr runtime.selectgo(0xc00005e770, 0xc00005e6b8, 0xc0001ec000?, 0x0, 0xc0001bb050?, 0x1)
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr     /usr/local/go/src/runtime/select.go:327 +0x7be fp=0xc00005e648 sp=0xc00005e508 pc=0x4686fe
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr google.golang.org/grpc/internal/transport.(*http2Server).keepalive(0xc0000f7380)
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr     /go/pkg/mod/google.golang.org/grpc@v1.57.0/internal/transport/http2_server.go:1155 +0x233 fp=0xc00005e7c8 sp=0xc00005e648 pc=0x788cb3
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr google.golang.org/grpc/internal/transport.NewServerTransport.func4()
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr     /go/pkg/mod/google.golang.org/grpc@v1.57.0/internal/transport/http2_server.go:344 +0x26 fp=0xc00005e7e0 sp=0xc00005e7c8 pc=0x7814c6
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr runtime.goexit()
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr     /usr/local/go/src/runtime/asm_amd64.s:1598 +0x1 fp=0xc00005e7e8 sp=0xc00005e7e0 pc=0x4879a1
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr created by google.golang.org/grpc/internal/transport.NewServerTransport
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr     /go/pkg/mod/google.golang.org/grpc@v1.57.0/internal/transport/http2_server.go:344 +0x1bf8
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr 
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr goroutine 16 [IO wait]:
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr runtime.gopark(0x447ea0?, 0xb?, 0x0?, 0x0?, 0x6?)
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr     /usr/local/go/src/runtime/proc.go:381 +0xd6 fp=0xc000071aa0 sp=0xc000071a80 pc=0x458b16
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr runtime.netpollblock(0x4d5c85?, 0x42456f?, 0x0?)
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr     /usr/local/go/src/runtime/netpoll.go:527 +0xf7 fp=0xc000071ad8 sp=0xc000071aa0 pc=0x451457
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr internal/poll.runtime_pollWait(0x7f509d084e08, 0x72)
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr     /usr/local/go/src/runtime/netpoll.go:306 +0x89 fp=0xc000071af8 sp=0xc000071ad8 pc=0x482549
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr internal/poll.(*pollDesc).wait(0xc0000fc500?, 0xc000210000?, 0x0)
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr     /usr/local/go/src/internal/poll/fd_poll_runtime.go:84 +0x32 fp=0xc000071b20 sp=0xc000071af8 pc=0x4f0852
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr internal/poll.(*pollDesc).waitRead(...)
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr     /usr/local/go/src/internal/poll/fd_poll_runtime.go:89
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr internal/poll.(*FD).Read(0xc0000fc500, {0xc000210000, 0x8000, 0x8000})
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr     /usr/local/go/src/internal/poll/fd_unix.go:167 +0x299 fp=0xc000071bb8 sp=0xc000071b20 pc=0x4f1c39
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr net.(*netFD).Read(0xc0000fc500, {0xc000210000?, 0x1060100000000?, 0x8?})
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr     /usr/local/go/src/net/fd_posix.go:55 +0x29 fp=0xc000071c00 sp=0xc000071bb8 pc=0x6054c9
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr net.(*conn).Read(0xc0000142d8, {0xc000210000?, 0xa80?, 0x0?})
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr     /usr/local/go/src/net/net.go:183 +0x45 fp=0xc000071c48 sp=0xc000071c00 pc=0x617005
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr net.(*TCPConn).Read(0x800010601?, {0xc000210000?, 0x0?, 0xc000071ca8?})
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr     <autogenerated>:1 +0x29 fp=0xc000071c78 sp=0xc000071c48 pc=0x62a0e9
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr bufio.(*Reader).Read(0xc00003acc0, {0xc0001ec200, 0x9, 0x0?})
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr     /usr/local/go/src/bufio/bufio.go:237 +0x1bb fp=0xc000071cb0 sp=0xc000071c78 pc=0x57bebb
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr io.ReadAtLeast({0xa453e0, 0xc00003acc0}, {0xc0001ec200, 0x9, 0x9}, 0x9)
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr     /usr/local/go/src/io/io.go:332 +0x9a fp=0xc000071cf8 sp=0xc000071cb0 pc=0x4cfbfa
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr io.ReadFull(...)
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr     /usr/local/go/src/io/io.go:351
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr golang.org/x/net/http2.readFrameHeader({0xc0001ec200?, 0x9?, 0xc00012a0c0?}, {0xa453e0?, 0xc00003acc0?})
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr     /go/pkg/mod/golang.org/x/net@v0.12.0/http2/frame.go:237 +0x6e fp=0xc000071d48 sp=0xc000071cf8 pc=0x754c4e
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr golang.org/x/net/http2.(*Framer).ReadFrame(0xc0001ec1c0)
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr     /go/pkg/mod/golang.org/x/net@v0.12.0/http2/frame.go:498 +0x95 fp=0xc000071df8 sp=0xc000071d48 pc=0x755495
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr google.golang.org/grpc/internal/transport.(*http2Server).HandleStreams(0xc0000f7380, 0x0?, 0x0?)
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr     /go/pkg/mod/google.golang.org/grpc@v1.57.0/internal/transport/http2_server.go:642 +0x167 fp=0xc000071f10 sp=0xc000071df8 pc=0x784907
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr google.golang.org/grpc.(*Server).serveStreams(0xc0001b21e0, {0xa4bd78?, 0xc0000f7380})
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr     /go/pkg/mod/google.golang.org/grpc@v1.57.0/server.go:969 +0x162 fp=0xc000071f80 sp=0xc000071f10 pc=0x7f53e2
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr google.golang.org/grpc.(*Server).handleRawConn.func1()
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr     /go/pkg/mod/google.golang.org/grpc@v1.57.0/server.go:912 +0x46 fp=0xc000071fe0 sp=0xc000071f80 pc=0x7f4c86
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr runtime.goexit()
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr     /usr/local/go/src/runtime/asm_amd64.s:1598 +0x1 fp=0xc000071fe8 sp=0xc000071fe0 pc=0x4879a1
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr created by google.golang.org/grpc.(*Server).handleRawConn
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr     /go/pkg/mod/google.golang.org/grpc@v1.57.0/server.go:911 +0x185
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr 
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr goroutine 35 [GC worker (idle)]:
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr runtime.gopark(0x0?, 0xc0000587d0?, 0x98?, 0x56?, 0xc0001b21e0?)
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr     /usr/local/go/src/runtime/proc.go:381 +0xd6 fp=0xc000058750 sp=0xc000058730 pc=0x458b16
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr runtime.gcBgMarkWorker()
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr     /usr/local/go/src/runtime/mgc.go:1275 +0xf1 fp=0xc0000587e0 sp=0xc000058750 pc=0x43bd31
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr runtime.goexit()
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr     /usr/local/go/src/runtime/asm_amd64.s:1598 +0x1 fp=0xc0000587e8 sp=0xc0000587e0 pc=0x4879a1
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr created by runtime.gcBgMarkStartWorkers
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr     /usr/local/go/src/runtime/mgc.go:1199 +0x25
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr 
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr goroutine 36 [GC worker (idle)]:
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr     /usr/local/go/src/runtime/proc.go:381 +0xd6 fp=0xc0001f2f50 sp=0xc0001f2f30 pc=0x458b16
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr runtime.gcBgMarkWorker()
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr     /usr/local/go/src/runtime/mgc.go:1275 +0xf1 fp=0xc0001f2fe0 sp=0xc0001f2f50 pc=0x43bd31
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr runtime.goexit()
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr     /usr/local/go/src/runtime/asm_amd64.s:1598 +0x1 fp=0xc0001f2fe8 sp=0xc0001f2fe0 pc=0x4879a1
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr created by runtime.gcBgMarkStartWorkers
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr     /usr/local/go/src/runtime/mgc.go:1199 +0x25
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr 
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr goroutine 37 [GC worker (idle)]:
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr     /usr/local/go/src/runtime/proc.go:381 +0xd6 fp=0xc0001f3750 sp=0xc0001f3730 pc=0x458b16
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr runtime.gcBgMarkWorker()
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr     /usr/local/go/src/runtime/mgc.go:1275 +0xf1 fp=0xc0001f37e0 sp=0xc0001f3750 pc=0x43bd31
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr runtime.goexit()
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr     /usr/local/go/src/runtime/asm_amd64.s:1598 +0x1 fp=0xc0001f37e8 sp=0xc0001f37e0 pc=0x4879a1
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr created by runtime.gcBgMarkStartWorkers
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr     /usr/local/go/src/runtime/mgc.go:1199 +0x25
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr 
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr goroutine 38 [GC worker (idle)]:
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr     /usr/local/go/src/runtime/proc.go:381 +0xd6 fp=0xc0001f3f50 sp=0xc0001f3f30 pc=0x458b16
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr runtime.gcBgMarkWorker()
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr     /usr/local/go/src/runtime/mgc.go:1275 +0xf1 fp=0xc0001f3fe0 sp=0xc0001f3f50 pc=0x43bd31
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr runtime.goexit()
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr     /usr/local/go/src/runtime/asm_amd64.s:1598 +0x1 fp=0xc0001f3fe8 sp=0xc0001f3fe0 pc=0x4879a1
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr created by runtime.gcBgMarkStartWorkers
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr     /usr/local/go/src/runtime/mgc.go:1199 +0x25
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr 
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr goroutine 25 [GC worker (idle)]:
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr     /usr/local/go/src/runtime/proc.go:381 +0xd6 fp=0xc000059f50 sp=0xc000059f30 pc=0x458b16
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr runtime.gcBgMarkWorker()
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr     /usr/local/go/src/runtime/mgc.go:1275 +0xf1 fp=0xc000059fe0 sp=0xc000059f50 pc=0x43bd31
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr runtime.goexit()
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr     /usr/local/go/src/runtime/asm_amd64.s:1598 +0x1 fp=0xc000059fe8 sp=0xc000059fe0 pc=0x4879a1
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr created by runtime.gcBgMarkStartWorkers
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr     /usr/local/go/src/runtime/mgc.go:1199 +0x25
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr 
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr goroutine 26 [GC worker (idle)]:
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr runtime.gopark(0x996d05b1308b?, 0x0?, 0x0?, 0x0?, 0x0?)
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr     /usr/local/go/src/runtime/proc.go:381 +0xd6 fp=0xc00005a750 sp=0xc00005a730 pc=0x458b16
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr runtime.gcBgMarkWorker()
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr     /usr/local/go/src/runtime/mgc.go:1275 +0xf1 fp=0xc00005a7e0 sp=0xc00005a750 pc=0x43bd31
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr runtime.goexit()
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr     /usr/local/go/src/runtime/asm_amd64.s:1598 +0x1 fp=0xc00005a7e8 sp=0xc00005a7e0 pc=0x4879a1
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr created by runtime.gcBgMarkStartWorkers
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr     /usr/local/go/src/runtime/mgc.go:1199 +0x25
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr 
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr goroutine 27 [GC worker (idle)]:
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr runtime.gopark(0x996d05b1482f?, 0x0?, 0x0?, 0x0?, 0x0?)
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr     /usr/local/go/src/runtime/proc.go:381 +0xd6 fp=0xc00005af50 sp=0xc00005af30 pc=0x458b16
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr runtime.gcBgMarkWorker()
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr     /usr/local/go/src/runtime/mgc.go:1275 +0xf1 fp=0xc00005afe0 sp=0xc00005af50 pc=0x43bd31
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr runtime.goexit()
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr     /usr/local/go/src/runtime/asm_amd64.s:1598 +0x1 fp=0xc00005afe8 sp=0xc00005afe0 pc=0x4879a1
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr created by runtime.gcBgMarkStartWorkers
12:09PM DBG GRPC(newhope.ggmlv3.q5_K_S.bin-127.0.0.1:38417): stderr     /usr/local/go/src/runtime/mgc.go:1199 +0x25

Additional context

This is my docker-compose file I use to spin up localAi. I also overwrite the base.yaml to set additional configs:

version: '3.6'

services:
  api:
    image: quay.io/go-skynet/local-ai:v1.23.2-cublas-cuda12-ffmpeg
    #pull_policy: always
    network_mode: host
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]
#    ports:
#      - 8080:8080
    environment:
      - MODELS_PATH=/models
      - BUILD_TYPE=cublas
      - GO_TAGS=stablediffusion,tts
      - REBUILD=true
      - CONTEXT_SIZE=4096
      - THREADS=8
      - ADDRESS=0.0.0.0:8080
      - IMAGE_PATH=/tmp
      - DEBUG=true
      - UPLOAD_LIMIT=100
      - PRELOAD_MODELS=[{"url":"github:go-skynet/model-gallery/stablediffusion.yaml"},{"url":"https://raw.githubusercontent.com/go-skynet/model-gallery/main/base.yaml","name":"oasst-llama2","overrides":{"f16":true,"rope_freq_base":10000,"rope_freq_scale":0.5,"nbatch":4096,"mmlock":false,"low_vram":false,"threads":8,"mmap":true,"gpu_layers":70,"context_size":4096,"parameters":{"model":"openassistant-llama2-13b-orca-8k-3319.ggmlv3.q6_K.bin","batch":4096,"f16":true,"n_keep":20}}},{"url":"https://raw.githubusercontent.com/go-skynet/model-gallery/main/base.yaml","name":"newhope-llama2","overrides":{"f16":true,"nbatch":4096,"mmlock":false,"low_vram":false,"threads":8,"mmap":true,"gpu_layers":70,"context_size":4096,"parameters":{"model":"newhope.ggmlv3.q5_K_S.bin","rope_freq_base":10000,"rope_freq_scale":0.5,"batch":4096,"f16":true,"n_keep":20}}}]
    volumes:
      - /AI/model-configs/:/models/
    command: ["/usr/bin/local-ai"]

This is the newhope config that's being created:

context_size: 4096
f16: true
gpu_layers: 70
low_vram: false
mmap: true
mmlock: false
name: newhope-llama2
nbatch: 4096
parameters:
  batch: 4096
  f16: true
  model: newhope.ggmlv3.q5_K_S.bin
  n_keep: 20
  rope_freq_base: 10000
  rope_freq_scale: 0.5
  temperature: 0.2
  top_k: 80
  top_p: 0.7
template:
  chat: chat
  completion: completion
threads: 8

I absolutely love this project, and I hope there is a solution for this problem. If I'm missing a config option, please point it out to me. However the logs seem to indicate that the context size in the config is recognized.

Btw I'm using the following model from TheBloke: https://huggingface.co/TheBloke/NewHope-GGML

But I have tried other models, removed the rope config stuff but nothing seems to work. Once the 500 Token-Count is breached the Call crashes.

mudler commented 1 year ago

is this happening only on llama2-based models? I have been using also contexts up to 8k just fine here with a GPU. Also - can you try master? I believe there were couple of fixes wrt to Llama2 that should probably matter here

emakkus commented 1 year ago

@mudler Never mind... I think this is a problem on my end... I tried to run the same model on LLamaSharp and got the same segfault at around 500 Tokens. I'll try out other models but this pretty much proves that the problem is not on the LocalAI side. Thx anyway :)