sozercan / aikit

🏗️ Fine-tune, build, and deploy open-source LLMs easily!
https://sozercan.github.io/aikit/
MIT License

[BUG] no galleries to load #41

Closed that1guy closed 10 months ago

that1guy commented 10 months ago

Expected Behavior

First output response should be: {"created":1701236489,"object":"chat.completion","id":"dd1ff40b-31a7-4418-9e32-42151ab6875a","model":"llama-2-7b-chat","choices":[{"index":0,"finish_reason":"stop","message":{"role":"assistant","content":"\nKubernetes is a container orchestration system that automates the deployment, scaling, and management of containerized applications in a microservices architecture."}}],"usage":{"prompt_tokens":0,"completion_tokens":0,"total_tokens":0}}

Actual Behavior

Response received is: {"error":{"code":500,"message":"could not load model: rpc error: code = Unavailable desc = error reading from server: EOF","type":""}}

Steps To Reproduce

  1. Pull the image and start the container by running docker run -d --rm -p 9000:8080 ghcr.io/sozercan/llama2:7b (host port 9000 is used because 8080 is already in use)
  2. Send a curl request: curl http://localhost:9000/v1/chat/completions -H "Content-Type: application/json" -d '{ "model": "llama-2-7b-chat", "messages": [{"role": "user", "content": "explain kubernetes in a sentence"}] }'
  3. Observe HTTP 500 error response: {"error":{"code":500,"message":"could not load model: rpc error: code = Unavailable desc = error reading from server: EOF","type":""}}
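
For anyone who wants to script the reproduction, the curl call in step 2 can be approximated with Python's standard library. This is only a sketch under assumptions: the helper names (build_chat_request, send, summarize_response) are hypothetical and not part of aikit or LocalAI, and the base URL assumes the 9000:8080 port mapping from step 1.

```python
import json
import urllib.error
import urllib.request


def build_chat_request(base_url, model, prompt):
    """Build the same POST request as the curl command in step 2."""
    body = json.dumps(
        {"model": model, "messages": [{"role": "user", "content": prompt}]}
    ).encode("utf-8")
    return urllib.request.Request(
        base_url.rstrip("/") + "/v1/chat/completions",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request, timeout=120):
    """Return the raw response body, including the body of an HTTP error."""
    try:
        with urllib.request.urlopen(request, timeout=timeout) as resp:
            return resp.read().decode("utf-8")
    except urllib.error.HTTPError as exc:
        # The 500 response still carries a JSON error document.
        return exc.read().decode("utf-8")


def summarize_response(raw):
    """Reduce a response body to either the error or the assistant text."""
    payload = json.loads(raw)
    if "error" in payload:
        err = payload["error"]
        return "error {}: {}".format(err.get("code"), err.get("message"))
    return payload["choices"][0]["message"]["content"].strip()
```

With the container running, summarize_response(send(build_chat_request("http://localhost:9000", "llama-2-7b-chat", "explain kubernetes in a sentence"))) should return the one-sentence answer on a working setup, or the "could not load model" message in the failing case reported here.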

===============

Logs upon container boot-up:

5:37AM DBG no galleries to load
5:37AM INF Starting LocalAI using 4 threads, with models path: /models
5:37AM INF LocalAI version: v2.0.0 (238fec244ae6c9a66bc7fafd76c7e14671110a6f)
5:37AM DBG Model: llama-2-7b-chat (config: {PredictionOptions:{Model:llama-2-7b-chat.Q4_K_M.gguf Language: N:0 TopP:0.7 TopK:80 Temperature:0.2 Maxtokens:0 Echo:false Batch:0 F16:false IgnoreEOS:false RepeatPenalty:0 Keep:0 MirostatETA:0 MirostatTAU:0 Mirostat:0 FrequencyPenalty:0 TFZ:0 TypicalP:0 Seed:0 NegativePrompt: RopeFreqBase:0 RopeFreqScale:0 NegativePromptScale:0 UseFastTokenizer:false ClipSkip:0 Tokenizer:} Name:llama-2-7b-chat F16:false Threads:0 Debug:false Roles:map[] Embeddings:false Backend:llama TemplateConfig:{Chat: ChatMessage: Completion: Edit: Functions:} PromptStrings:[] InputStrings:[] InputToken:[] functionCallString: functionCallNameString: FunctionsConfig:{DisableNoAction:false NoActionFunctionName: NoActionDescriptionName:} FeatureFlag:map[] LLMConfig:{SystemPrompt: TensorSplit: MainGPU: RMSNormEps:0 NGQA:0 PromptCachePath: PromptCacheAll:false PromptCacheRO:false MirostatETA:0 MirostatTAU:0 Mirostat:0 NGPULayers:0 MMap:false MMlock:false LowVRAM:false Grammar: StopWords:[] Cutstrings:[] TrimSpace:[] ContextSize:4096 NUMA:false LoraAdapter: LoraBase: LoraScale:0 NoMulMatQ:false DraftModel: NDraft:0 Quantization: MMProj: RopeScaling: YarnExtFactor:0 YarnAttnFactor:0 YarnBetaFast:0 YarnBetaSlow:0} AutoGPTQ:{ModelBaseName: Device: Triton:false UseFastTokenizer:false} Diffusers:{PipelineType: SchedulerType: CUDA:false EnableParameters: CFGScale:0 IMG2IMG:false ClipSkip:0 ClipModel: ClipSubFolder:} Step:0 GRPC:{Attempts:0 AttemptsSleepTime:0} VallE:{AudioPath:}})
5:37AM DBG Extracting backend assets files to /tmp/localai/backend_data

 ┌───────────────────────────────────────────────────┐ 
 │                   Fiber v2.50.0                   │ 
 │               http://127.0.0.1:8080               │ 
 │       (bound on host 0.0.0.0 and port 8080)       │ 
 │                                                   │ 
 │ Handlers ............ 74  Processes ........... 1 │ 
 │ Prefork ....... Disabled  PID ................. 1 │ 
 └───────────────────────────────────────────────────┘ 
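
The boot log above shows a clean start, so the failure only appears once the model is loaded. In the request log that follows, the llama backend dies with SIGILL: illegal instruction, and the reported instruction bytes begin with 0xc4, which is the three-byte VEX prefix on x86-64 (0xc5 is the two-byte form). The first instruction appears to decode as vcvtph2ps, an F16C instruction. A crash like this usually means the binary was compiled with AVX-family instructions that the host CPU does not support. The sketch below is a hedged diagnostic, not part of aikit or LocalAI: the function names and the ASSUMED_REQUIRED_FLAGS list are illustrative assumptions about what a default x86-64 llama.cpp build commonly expects, and the flag lookup relies on the Linux-only /proc/cpuinfo format.

```python
# Assumption: CPU features that a default x86-64 llama.cpp build commonly uses.
ASSUMED_REQUIRED_FLAGS = ("avx", "avx2", "fma", "f16c")

# On x86-64, an instruction starting with 0xC4 or 0xC5 is VEX-encoded
# (AVX, AVX2, FMA, F16C and related extensions).
VEX_PREFIXES = {0xC4, 0xC5}


def parse_instruction_bytes(text):
    """Parse the space-separated hex bytes printed in the Go crash report."""
    return [int(token, 16) for token in text.split()]


def starts_with_vex(instruction_bytes):
    """Return True if the faulting instruction begins with a VEX prefix."""
    return bool(instruction_bytes) and instruction_bytes[0] in VEX_PREFIXES


def missing_cpu_flags(cpuinfo_text, required=ASSUMED_REQUIRED_FLAGS):
    """List required flags absent from the first 'flags' line of cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            present = set(value.split())
            return [flag for flag in required if flag not in present]
    # No flags line found: treat every required flag as unconfirmed.
    return list(required)


if __name__ == "__main__":
    crash = parse_instruction_bytes("0xc4 0xe2 0x79 0x13 0xc9 0xc5 0xf2 0x59")
    print("faulting instruction is VEX-encoded:", starts_with_vex(crash))
    try:
        with open("/proc/cpuinfo") as handle:
            print("missing flags:", missing_cpu_flags(handle.read()))
    except OSError:
        print("/proc/cpuinfo is not available on this system")
```

If the script reports missing flags on the machine running the container, that would be consistent with the SIGILL in the request log, though it does not by itself prove which instruction set the prebuilt image targets.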

Logs upon incoming HTTP request:

5:40AM DBG Request received: 
5:40AM DBG Configuration read: &{PredictionOptions:{Model:llama-2-7b-chat.Q4_K_M.gguf Language: N:0 TopP:0.7 TopK:80 Temperature:0.2 Maxtokens:0 Echo:false Batch:0 F16:false IgnoreEOS:false RepeatPenalty:0 Keep:0 MirostatETA:0 MirostatTAU:0 Mirostat:0 FrequencyPenalty:0 TFZ:0 TypicalP:0 Seed:0 NegativePrompt: RopeFreqBase:0 RopeFreqScale:0 NegativePromptScale:0 UseFastTokenizer:false ClipSkip:0 Tokenizer:} Name:llama-2-7b-chat F16:false Threads:4 Debug:true Roles:map[] Embeddings:false Backend:llama TemplateConfig:{Chat: ChatMessage: Completion: Edit: Functions:} PromptStrings:[] InputStrings:[] InputToken:[] functionCallString: functionCallNameString: FunctionsConfig:{DisableNoAction:false NoActionFunctionName: NoActionDescriptionName:} FeatureFlag:map[] LLMConfig:{SystemPrompt: TensorSplit: MainGPU: RMSNormEps:0 NGQA:0 PromptCachePath: PromptCacheAll:false PromptCacheRO:false MirostatETA:0 MirostatTAU:0 Mirostat:0 NGPULayers:0 MMap:false MMlock:false LowVRAM:false Grammar: StopWords:[] Cutstrings:[] TrimSpace:[] ContextSize:4096 NUMA:false LoraAdapter: LoraBase: LoraScale:0 NoMulMatQ:false DraftModel: NDraft:0 Quantization: MMProj: RopeScaling: YarnExtFactor:0 YarnAttnFactor:0 YarnBetaFast:0 YarnBetaSlow:0} AutoGPTQ:{ModelBaseName: Device: Triton:false UseFastTokenizer:false} Diffusers:{PipelineType: SchedulerType: CUDA:false EnableParameters: CFGScale:0 IMG2IMG:false ClipSkip:0 ClipModel: ClipSubFolder:} Step:0 GRPC:{Attempts:0 AttemptsSleepTime:0} VallE:{AudioPath:}}
5:40AM DBG Parameters: &{PredictionOptions:{Model:llama-2-7b-chat.Q4_K_M.gguf Language: N:0 TopP:0.7 TopK:80 Temperature:0.2 Maxtokens:0 Echo:false Batch:0 F16:false IgnoreEOS:false RepeatPenalty:0 Keep:0 MirostatETA:0 MirostatTAU:0 Mirostat:0 FrequencyPenalty:0 TFZ:0 TypicalP:0 Seed:0 NegativePrompt: RopeFreqBase:0 RopeFreqScale:0 NegativePromptScale:0 UseFastTokenizer:false ClipSkip:0 Tokenizer:} Name:llama-2-7b-chat F16:false Threads:4 Debug:true Roles:map[] Embeddings:false Backend:llama TemplateConfig:{Chat: ChatMessage: Completion: Edit: Functions:} PromptStrings:[] InputStrings:[] InputToken:[] functionCallString: functionCallNameString: FunctionsConfig:{DisableNoAction:false NoActionFunctionName: NoActionDescriptionName:} FeatureFlag:map[] LLMConfig:{SystemPrompt: TensorSplit: MainGPU: RMSNormEps:0 NGQA:0 PromptCachePath: PromptCacheAll:false PromptCacheRO:false MirostatETA:0 MirostatTAU:0 Mirostat:0 NGPULayers:0 MMap:false MMlock:false LowVRAM:false Grammar: StopWords:[] Cutstrings:[] TrimSpace:[] ContextSize:4096 NUMA:false LoraAdapter: LoraBase: LoraScale:0 NoMulMatQ:false DraftModel: NDraft:0 Quantization: MMProj: RopeScaling: YarnExtFactor:0 YarnAttnFactor:0 YarnBetaFast:0 YarnBetaSlow:0} AutoGPTQ:{ModelBaseName: Device: Triton:false UseFastTokenizer:false} Diffusers:{PipelineType: SchedulerType: CUDA:false EnableParameters: CFGScale:0 IMG2IMG:false ClipSkip:0 ClipModel: ClipSubFolder:} Step:0 GRPC:{Attempts:0 AttemptsSleepTime:0} VallE:{AudioPath:}}
5:40AM DBG Prompt (before templating): explain kubernetes in a sentence
5:40AM DBG Template failed loading: failed loading a template for llama-2-7b-chat.Q4_K_M.gguf
5:40AM DBG Prompt (after templating): explain kubernetes in a sentence
5:40AM DBG Loading model llama from llama-2-7b-chat.Q4_K_M.gguf
5:40AM DBG Loading model in memory from file: /models/llama-2-7b-chat.Q4_K_M.gguf
5:40AM DBG Loading Model llama-2-7b-chat.Q4_K_M.gguf with gRPC (file: /models/llama-2-7b-chat.Q4_K_M.gguf) (backend: llama): {backendString:llama model:llama-2-7b-chat.Q4_K_M.gguf threads:4 assetDir:/tmp/localai/backend_data context:{emptyCtx:{}} gRPCOptions:0xc0002a6780 externalBackends:map[] grpcAttempts:20 grpcAttemptsDelay:2 singleActiveBackend:false parallelRequests:false}
5:40AM DBG Loading GRPC Process: /tmp/localai/backend_data/backend-assets/grpc/llama
5:40AM DBG GRPC Service for llama-2-7b-chat.Q4_K_M.gguf will be running at: '127.0.0.1:45083'
5:40AM DBG GRPC Service state dir: /tmp/go-processmanager2300469877
5:40AM DBG GRPC Service Started
rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing: dial tcp 127.0.0.1:45083: connect: connection refused"
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr 2023/12/15 05:40:48 gRPC Server listening at 127.0.0.1:45083
5:40AM DBG GRPC Service Ready
5:40AM DBG GRPC: Loading model with options: {state:{NoUnkeyedLiterals:{} DoNotCompare:[] DoNotCopy:[] atomicMessageInfo:<nil>} sizeCache:0 unknownFields:[] Model:llama-2-7b-chat.Q4_K_M.gguf ContextSize:4096 Seed:0 NBatch:512 F16Memory:false MLock:false MMap:false VocabOnly:false LowVRAM:false Embeddings:false NUMA:false NGPULayers:0 MainGPU: TensorSplit: Threads:4 LibrarySearchPath: RopeFreqBase:0 RopeFreqScale:0 RMSNormEps:0 NGQA:0 ModelFile:/models/llama-2-7b-chat.Q4_K_M.gguf Device: UseTriton:false ModelBaseName: UseFastTokenizer:false PipelineType: SchedulerType: CUDA:false CFGScale:0 IMG2IMG:false CLIPModel: CLIPSubfolder: CLIPSkip:0 Tokenizer: LoraBase: LoraAdapter: LoraScale:0 NoMulMatQ:false DraftModel: AudioPath: Quantization: MMProj: RopeScaling: YarnExtFactor:0 YarnAttnFactor:0 YarnBetaFast:0 YarnBetaSlow:0}
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr create_gpt_params: loading model /models/llama-2-7b-chat.Q4_K_M.gguf
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr SIGILL: illegal instruction
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr PC=0x86853a m=0 sigcode=2
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr signal arrived during cgo execution
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr instruction bytes: 0xc4 0xe2 0x79 0x13 0xc9 0xc5 0xf2 0x59 0x15 0x3d 0x79 0x23 0x0 0xc4 0x81 0x7a
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr 
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr goroutine 34 [syscall]:
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr runtime.cgocall(0x821ae0, 0xc00014f4d8)
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr    /opt/hostedtoolcache/go/1.21.4/x64/src/runtime/cgocall.go:157 +0x4b fp=0xc00014f4b0 sp=0xc00014f478 pc=0x4176eb
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr github.com/go-skynet/go-llama%2ecpp._Cfunc_load_model(0xee9460, 0x1000, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x200, ...)
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr    _cgo_gotypes.go:266 +0x4f fp=0xc00014f4d8 sp=0xc00014f4b0 pc=0x8143af
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr github.com/go-skynet/go-llama%2ecpp.New({0xc000178000, 0x23}, {0xc000110240, 0x7, 0x926460?})
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr    /home/runner/work/LocalAI/LocalAI/sources/go-llama/llama.go:39 +0x385 fp=0xc00014f6e8 sp=0xc00014f4d8 pc=0x814da5
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr main.(*LLM).Load(0xc000012630, 0xc000148000)
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr    /home/runner/work/LocalAI/LocalAI/backend/go/llm/llama/llama.go:87 +0xc9c fp=0xc00014f900 sp=0xc00014f6e8 pc=0x81ed1c
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr github.com/go-skynet/LocalAI/pkg/grpc.(*server).LoadModel(0xc00002ad90, {0xc000148000?, 0x50a886?}, 0x0?)
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr    /home/runner/work/LocalAI/LocalAI/pkg/grpc/server.go:50 +0xe6 fp=0xc00014f9b0 sp=0xc00014f900 pc=0x81c566
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr github.com/go-skynet/LocalAI/pkg/grpc/proto._Backend_LoadModel_Handler({0x997880?, 0xc00002ad90}, {0xa7e610, 0xc00010e390}, 0xc000114100, 0x0)
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr    /home/runner/work/LocalAI/LocalAI/pkg/grpc/proto/backend_grpc.pb.go:264 +0x169 fp=0xc00014fa08 sp=0xc00014f9b0 pc=0x809829
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr google.golang.org/grpc.(*Server).processUnaryRPC(0xc0001ee1e0, {0xa7e610, 0xc00010e2d0}, {0xa81b38, 0xc0001e9040}, 0xc00013e000, 0xc0001f4d20, 0xd924b0, 0x0)
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr    /home/runner/go/pkg/mod/google.golang.org/grpc@v1.59.0/server.go:1343 +0xe03 fp=0xc00014fdf0 sp=0xc00014fa08 pc=0x7f27c3
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr google.golang.org/grpc.(*Server).handleStream(0xc0001ee1e0, {0xa81b38, 0xc0001e9040}, 0xc00013e000)
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr    /home/runner/go/pkg/mod/google.golang.org/grpc@v1.59.0/server.go:1737 +0xc4c fp=0xc00014ff78 sp=0xc00014fdf0 pc=0x7f772c
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr google.golang.org/grpc.(*Server).serveStreams.func1.1()
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr    /home/runner/go/pkg/mod/google.golang.org/grpc@v1.59.0/server.go:986 +0x86 fp=0xc00014ffe0 sp=0xc00014ff78 pc=0x7f06c6
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr runtime.goexit()
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr    /opt/hostedtoolcache/go/1.21.4/x64/src/runtime/asm_amd64.s:1650 +0x1 fp=0xc00014ffe8 sp=0xc00014ffe0 pc=0x47aa01
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr created by google.golang.org/grpc.(*Server).serveStreams.func1 in goroutine 13
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr    /home/runner/go/pkg/mod/google.golang.org/grpc@v1.59.0/server.go:997 +0x145
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr 
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr goroutine 1 [IO wait]:
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr runtime.gopark(0x4c6b50?, 0xc0001dfb28?, 0x78?, 0xfb?, 0x4e6edd?)
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr    /opt/hostedtoolcache/go/1.21.4/x64/src/runtime/proc.go:398 +0xce fp=0xc0001dfb08 sp=0xc0001dfae8 pc=0x44be4e
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr runtime.netpollblock(0x478a72?, 0x416e86?, 0x0?)
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr    /opt/hostedtoolcache/go/1.21.4/x64/src/runtime/netpoll.go:564 +0xf7 fp=0xc0001dfb40 sp=0xc0001dfb08 pc=0x4448d7
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr internal/poll.runtime_pollWait(0x148304689eb0, 0x72)
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr    /opt/hostedtoolcache/go/1.21.4/x64/src/runtime/netpoll.go:343 +0x85 fp=0xc0001dfb60 sp=0xc0001dfb40 pc=0x475925
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr internal/poll.(*pollDesc).wait(0xc0001a6680?, 0x4?, 0x0)
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr    /opt/hostedtoolcache/go/1.21.4/x64/src/internal/poll/fd_poll_runtime.go:84 +0x27 fp=0xc0001dfb88 sp=0xc0001dfb60 pc=0x4dfb47
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr internal/poll.(*pollDesc).waitRead(...)
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr    /opt/hostedtoolcache/go/1.21.4/x64/src/internal/poll/fd_poll_runtime.go:89
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr internal/poll.(*FD).Accept(0xc0001a6680)
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr    /opt/hostedtoolcache/go/1.21.4/x64/src/internal/poll/fd_unix.go:611 +0x2ac fp=0xc0001dfc30 sp=0xc0001dfb88 pc=0x4e502c
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr net.(*netFD).accept(0xc0001a6680)
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr    /opt/hostedtoolcache/go/1.21.4/x64/src/net/fd_unix.go:172 +0x29 fp=0xc0001dfce8 sp=0xc0001dfc30 pc=0x640b09
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr net.(*TCPListener).accept(0xc0000aa4c0)
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr    /opt/hostedtoolcache/go/1.21.4/x64/src/net/tcpsock_posix.go:152 +0x1e fp=0xc0001dfd10 sp=0xc0001dfce8 pc=0x657abe
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr net.(*TCPListener).Accept(0xc0000aa4c0)
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr    /opt/hostedtoolcache/go/1.21.4/x64/src/net/tcpsock.go:315 +0x30 fp=0xc0001dfd40 sp=0xc0001dfd10 pc=0x656c70
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr google.golang.org/grpc.(*Server).Serve(0xc0001ee1e0, {0xa7dc20?, 0xc0000aa4c0})
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr    /home/runner/go/pkg/mod/google.golang.org/grpc@v1.59.0/server.go:852 +0x462 fp=0xc0001dfe80 sp=0xc0001dfd40 pc=0x7ef322
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr github.com/go-skynet/LocalAI/pkg/grpc.StartServer({0x7ffd1517cf51?, 0xc0000241c0?}, {0xa82260?, 0xc000012630})
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr    /home/runner/work/LocalAI/LocalAI/pkg/grpc/server.go:178 +0x17d fp=0xc0001dff10 sp=0xc0001dfe80 pc=0x81df5d
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr main.main()
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr    /home/runner/work/LocalAI/LocalAI/backend/go/llm/llama/main.go:20 +0x85 fp=0xc0001dff40 sp=0xc0001dff10 pc=0x8212c5
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr runtime.main()
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr    /opt/hostedtoolcache/go/1.21.4/x64/src/runtime/proc.go:267 +0x2bb fp=0xc0001dffe0 sp=0xc0001dff40 pc=0x44b9fb
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr runtime.goexit()
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr    /opt/hostedtoolcache/go/1.21.4/x64/src/runtime/asm_amd64.s:1650 +0x1 fp=0xc0001dffe8 sp=0xc0001dffe0 pc=0x47aa01
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr 
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr goroutine 2 [force gc (idle)]:
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr    /opt/hostedtoolcache/go/1.21.4/x64/src/runtime/proc.go:398 +0xce fp=0xc00008afa8 sp=0xc00008af88 pc=0x44be4e
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr runtime.goparkunlock(...)
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr    /opt/hostedtoolcache/go/1.21.4/x64/src/runtime/proc.go:404
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr runtime.forcegchelper()
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr    /opt/hostedtoolcache/go/1.21.4/x64/src/runtime/proc.go:322 +0xb3 fp=0xc00008afe0 sp=0xc00008afa8 pc=0x44bcd3
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr runtime.goexit()
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr    /opt/hostedtoolcache/go/1.21.4/x64/src/runtime/asm_amd64.s:1650 +0x1 fp=0xc00008afe8 sp=0xc00008afe0 pc=0x47aa01
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr created by runtime.init.6 in goroutine 1
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr    /opt/hostedtoolcache/go/1.21.4/x64/src/runtime/proc.go:310 +0x1a
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr 
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr goroutine 3 [GC sweep wait]:
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr    /opt/hostedtoolcache/go/1.21.4/x64/src/runtime/proc.go:398 +0xce fp=0xc00008b778 sp=0xc00008b758 pc=0x44be4e
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr runtime.goparkunlock(...)
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr    /opt/hostedtoolcache/go/1.21.4/x64/src/runtime/proc.go:404
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr runtime.bgsweep(0x0?)
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr    /opt/hostedtoolcache/go/1.21.4/x64/src/runtime/mgcsweep.go:280 +0x94 fp=0xc00008b7c8 sp=0xc00008b778 pc=0x437d54
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr runtime.gcenable.func1()
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr    /opt/hostedtoolcache/go/1.21.4/x64/src/runtime/mgc.go:200 +0x25 fp=0xc00008b7e0 sp=0xc00008b7c8 pc=0x42cf25
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr runtime.goexit()
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr    /opt/hostedtoolcache/go/1.21.4/x64/src/runtime/asm_amd64.s:1650 +0x1 fp=0xc00008b7e8 sp=0xc00008b7e0 pc=0x47aa01
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr created by runtime.gcenable in goroutine 1
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr    /opt/hostedtoolcache/go/1.21.4/x64/src/runtime/mgc.go:200 +0x66
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr 
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr goroutine 4 [GC scavenge wait]:
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr runtime.gopark(0xc0000b4000?, 0xa76dc8?, 0x1?, 0x0?, 0xc0000071e0?)
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr    /opt/hostedtoolcache/go/1.21.4/x64/src/runtime/proc.go:398 +0xce fp=0xc00008bf70 sp=0xc00008bf50 pc=0x44be4e
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr runtime.goparkunlock(...)
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr    /opt/hostedtoolcache/go/1.21.4/x64/src/runtime/proc.go:404
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr runtime.(*scavengerState).park(0xddb960)
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr    /opt/hostedtoolcache/go/1.21.4/x64/src/runtime/mgcscavenge.go:425 +0x49 fp=0xc00008bfa0 sp=0xc00008bf70 pc=0x435629
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr runtime.bgscavenge(0x0?)
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr    /opt/hostedtoolcache/go/1.21.4/x64/src/runtime/mgcscavenge.go:653 +0x3c fp=0xc00008bfc8 sp=0xc00008bfa0 pc=0x435bbc
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr runtime.gcenable.func2()
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr    /opt/hostedtoolcache/go/1.21.4/x64/src/runtime/mgc.go:201 +0x25 fp=0xc00008bfe0 sp=0xc00008bfc8 pc=0x42cec5
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr runtime.goexit()
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr    /opt/hostedtoolcache/go/1.21.4/x64/src/runtime/asm_amd64.s:1650 +0x1 fp=0xc00008bfe8 sp=0xc00008bfe0 pc=0x47aa01
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr created by runtime.gcenable in goroutine 1
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr    /opt/hostedtoolcache/go/1.21.4/x64/src/runtime/mgc.go:201 +0xa5
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr 
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr goroutine 5 [finalizer wait]:
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr runtime.gopark(0x9c1d00?, 0x10044cf01?, 0x0?, 0x0?, 0x454005?)
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr    /opt/hostedtoolcache/go/1.21.4/x64/src/runtime/proc.go:398 +0xce fp=0xc00008a628 sp=0xc00008a608 pc=0x44be4e
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr runtime.runfinq()
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr    /opt/hostedtoolcache/go/1.21.4/x64/src/runtime/mfinal.go:193 +0x107 fp=0xc00008a7e0 sp=0xc00008a628 pc=0x42bfa7
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr runtime.goexit()
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr    /opt/hostedtoolcache/go/1.21.4/x64/src/runtime/asm_amd64.s:1650 +0x1 fp=0xc00008a7e8 sp=0xc00008a7e0 pc=0x47aa01
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr created by runtime.createfing in goroutine 1
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr    /opt/hostedtoolcache/go/1.21.4/x64/src/runtime/mfinal.go:163 +0x3d
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr 
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr goroutine 11 [select]:
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr runtime.gopark(0xc000129f00?, 0x2?, 0x0?, 0x0?, 0xc000129ecc?)
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr    /opt/hostedtoolcache/go/1.21.4/x64/src/runtime/proc.go:398 +0xce fp=0xc000129d78 sp=0xc000129d58 pc=0x44be4e
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr runtime.selectgo(0xc000129f00, 0xc000129ec8, 0xc000129ee8?, 0x0, 0x95f7a0?, 0x1)
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr    /opt/hostedtoolcache/go/1.21.4/x64/src/runtime/select.go:327 +0x725 fp=0xc000129e98 sp=0xc000129d78 pc=0x45b8a5
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr google.golang.org/grpc/internal/transport.(*controlBuffer).get(0xc0000c25f0, 0x1)
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr    /home/runner/go/pkg/mod/google.golang.org/grpc@v1.59.0/internal/transport/controlbuf.go:418 +0x113 fp=0xc000129f30 sp=0xc000129e98 pc=0x768893
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr google.golang.org/grpc/internal/transport.(*loopyWriter).run(0xc000116070)
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr    /home/runner/go/pkg/mod/google.golang.org/grpc@v1.59.0/internal/transport/controlbuf.go:552 +0x86 fp=0xc000129f90 sp=0xc000129f30 pc=0x768fc6
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr google.golang.org/grpc/internal/transport.NewServerTransport.func2()
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr    /home/runner/go/pkg/mod/google.golang.org/grpc@v1.59.0/internal/transport/http2_server.go:336 +0xd5 fp=0xc000129fe0 sp=0xc000129f90 pc=0x77f815
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr runtime.goexit()
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr    /opt/hostedtoolcache/go/1.21.4/x64/src/runtime/asm_amd64.s:1650 +0x1 fp=0xc000129fe8 sp=0xc000129fe0 pc=0x47aa01
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr created by google.golang.org/grpc/internal/transport.NewServerTransport in goroutine 10
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr    /home/runner/go/pkg/mod/google.golang.org/grpc@v1.59.0/internal/transport/http2_server.go:333 +0x1acc
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr 
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr goroutine 12 [select]:
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr runtime.gopark(0xc00008df70?, 0x4?, 0x0?, 0xa6?, 0xc00008dec0?)
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr    /opt/hostedtoolcache/go/1.21.4/x64/src/runtime/proc.go:398 +0xce fp=0xc00008dd28 sp=0xc00008dd08 pc=0x44be4e
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr runtime.selectgo(0xc00008df70, 0xc00008deb8, 0x0?, 0x0, 0x0?, 0x1)
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr    /opt/hostedtoolcache/go/1.21.4/x64/src/runtime/select.go:327 +0x725 fp=0xc00008de48 sp=0xc00008dd28 pc=0x45b8a5
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr google.golang.org/grpc/internal/transport.(*http2Server).keepalive(0xc0001e9040)
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr    /home/runner/go/pkg/mod/google.golang.org/grpc@v1.59.0/internal/transport/http2_server.go:1152 +0x225 fp=0xc00008dfc8 sp=0xc00008de48 pc=0x786ac5
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr google.golang.org/grpc/internal/transport.NewServerTransport.func4()
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr    /home/runner/go/pkg/mod/google.golang.org/grpc@v1.59.0/internal/transport/http2_server.go:339 +0x25 fp=0xc00008dfe0 sp=0xc00008dfc8 pc=0x77f705
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr runtime.goexit()
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr    /opt/hostedtoolcache/go/1.21.4/x64/src/runtime/asm_amd64.s:1650 +0x1 fp=0xc00008dfe8 sp=0xc00008dfe0 pc=0x47aa01
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr created by google.golang.org/grpc/internal/transport.NewServerTransport in goroutine 10
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr    /home/runner/go/pkg/mod/google.golang.org/grpc@v1.59.0/internal/transport/http2_server.go:339 +0x1b0e
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr 
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr goroutine 13 [IO wait]:
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr runtime.gopark(0xdf3ac0?, 0xb?, 0x0?, 0x0?, 0x6?)
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr    /opt/hostedtoolcache/go/1.21.4/x64/src/runtime/proc.go:398 +0xce fp=0xc0000a0aa0 sp=0xc0000a0a80 pc=0x44be4e
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr runtime.netpollblock(0x4c4dd8?, 0x416e86?, 0x0?)
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr    /opt/hostedtoolcache/go/1.21.4/x64/src/runtime/netpoll.go:564 +0xf7 fp=0xc0000a0ad8 sp=0xc0000a0aa0 pc=0x4448d7
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr internal/poll.runtime_pollWait(0x148304689db8, 0x72)
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr    /opt/hostedtoolcache/go/1.21.4/x64/src/runtime/netpoll.go:343 +0x85 fp=0xc0000a0af8 sp=0xc0000a0ad8 pc=0x475925
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr internal/poll.(*pollDesc).wait(0xc0001a6800?, 0xc000118000?, 0x0)
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr    /opt/hostedtoolcache/go/1.21.4/x64/src/internal/poll/fd_poll_runtime.go:84 +0x27 fp=0xc0000a0b20 sp=0xc0000a0af8 pc=0x4dfb47
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr internal/poll.(*pollDesc).waitRead(...)
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr    /opt/hostedtoolcache/go/1.21.4/x64/src/internal/poll/fd_poll_runtime.go:89
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr internal/poll.(*FD).Read(0xc0001a6800, {0xc000118000, 0x8000, 0x8000})
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr    /opt/hostedtoolcache/go/1.21.4/x64/src/internal/poll/fd_unix.go:164 +0x27a fp=0xc0000a0bb8 sp=0xc0000a0b20 pc=0x4e0e3a
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr net.(*netFD).Read(0xc0001a6800, {0xc000118000?, 0x1060100000000?, 0x8?})
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr    /opt/hostedtoolcache/go/1.21.4/x64/src/net/fd_posix.go:55 +0x25 fp=0xc0000a0c00 sp=0xc0000a0bb8 pc=0x63eae5
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr net.(*conn).Read(0xc00008e310, {0xc000118000?, 0xc0000a0c90?, 0x3?})
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr    /opt/hostedtoolcache/go/1.21.4/x64/src/net/net.go:179 +0x45 fp=0xc0000a0c48 sp=0xc0000a0c00 pc=0x64f1e5
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr net.(*TCPConn).Read(0x0?, {0xc000118000?, 0xc0000a0ca0?, 0x469d2d?})
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr    <autogenerated>:1 +0x25 fp=0xc0000a0c78 sp=0xc0000a0c48 pc=0x661985
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr bufio.(*Reader).Read(0xc0000b93e0, {0xc0001c8120, 0x9, 0xc1571798b5839f33?})
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr    /opt/hostedtoolcache/go/1.21.4/x64/src/bufio/bufio.go:244 +0x197 fp=0xc0000a0cb0 sp=0xc0000a0c78 pc=0x5b9f97
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr io.ReadAtLeast({0xa7b640, 0xc0000b93e0}, {0xc0001c8120, 0x9, 0x9}, 0x9)
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr    /opt/hostedtoolcache/go/1.21.4/x64/src/io/io.go:335 +0x90 fp=0xc0000a0cf8 sp=0xc0000a0cb0 pc=0x4befd0
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr io.ReadFull(...)
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr    /opt/hostedtoolcache/go/1.21.4/x64/src/io/io.go:354
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr golang.org/x/net/http2.readFrameHeader({0xc0001c8120, 0x9, 0xc000028600?}, {0xa7b640?, 0xc0000b93e0?})
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr    /home/runner/go/pkg/mod/golang.org/x/net@v0.17.0/http2/frame.go:237 +0x65 fp=0xc0000a0d48 sp=0xc0000a0cf8 pc=0x755305
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr golang.org/x/net/http2.(*Framer).ReadFrame(0xc0001c80e0)
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr    /home/runner/go/pkg/mod/golang.org/x/net@v0.17.0/http2/frame.go:498 +0x85 fp=0xc0000a0df0 sp=0xc0000a0d48 pc=0x755a45
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr google.golang.org/grpc/internal/transport.(*http2Server).HandleStreams(0xc0001e9040, 0x1?)
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr    /home/runner/go/pkg/mod/google.golang.org/grpc@v1.59.0/internal/transport/http2_server.go:636 +0x145 fp=0xc0000a0f00 sp=0xc0000a0df0 pc=0x782965
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr google.golang.org/grpc.(*Server).serveStreams(0xc0001ee1e0, {0xa81b38?, 0xc0001e9040})
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr    /home/runner/go/pkg/mod/google.golang.org/grpc@v1.59.0/server.go:979 +0x1c2 fp=0xc0000a0f80 sp=0xc0000a0f00 pc=0x7f0462
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr google.golang.org/grpc.(*Server).handleRawConn.func1()
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr    /home/runner/go/pkg/mod/google.golang.org/grpc@v1.59.0/server.go:920 +0x45 fp=0xc0000a0fe0 sp=0xc0000a0f80 pc=0x7efcc5
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr runtime.goexit()
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr    /opt/hostedtoolcache/go/1.21.4/x64/src/runtime/asm_amd64.s:1650 +0x1 fp=0xc0000a0fe8 sp=0xc0000a0fe0 pc=0x47aa01
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr created by google.golang.org/grpc.(*Server).handleRawConn in goroutine 10
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr    /home/runner/go/pkg/mod/google.golang.org/grpc@v1.59.0/server.go:919 +0x185
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr 
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr rax    0x0
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr rbx    0xe4ffc0
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr rcx    0x18
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr rdx    0x3dc65656
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr rdi    0x627541
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr rsi    0x7ec52
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr rbp    0xe6ffc0
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr rsp    0x7ffd1517b710
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr r8     0x23
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr r9     0x0
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr r10    0x7ffd151a1080
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr r11    0x3dc65656
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr r12    0xe8ffc0
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr r13    0xeaffc0
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr r14    0xe0ffc0
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr r15    0x0
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr rip    0x86853a
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr rflags 0x10202
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr cs     0x33
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr fs     0x0
5:40AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45083): stderr gs     0x0
[172.17.0.1]:59270 500 - POST /v1/chat/completions
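
The trace above is long, and only a few lines identify the crash: the signal name, the faulting PC and the instruction bytes. A minimal sketch for pulling those lines out of the container log follows. The function name `crash_summary` and the `<container>` placeholder are illustrative, not part of aikit or LocalAI, and the pattern assumes the `stderr` prefix format shown in these logs.

```shell
#!/usr/bin/env bash
# Keep only the lines of a LocalAI debug log that identify a fatal signal.
# Reads the log on stdin and prints the matching lines with the log prefix stripped.
crash_summary() {
  grep -E 'stderr (SIG[A-Z]+: |PC=|instruction bytes:|signal arrived)' \
    | sed -E 's/^.*stderr //'
}

# Usage (the container name is a placeholder):
#   docker logs <container> 2>&1 | crash_summary
```

The pattern only matches when the keyword directly follows `stderr `, so the indented stack frame lines (which carry a lowercase `pc=`) are skipped.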

Are you willing to submit PRs to contribute to this bug fix?

that1guy commented 10 months ago

FYI - I'm running this container on top of Unraid.

sozercan commented 10 months ago

@that1guy Thanks for the report!

SIGILL: illegal instruction

This is due to a required CPU instruction set not being present. Does your CPU support AVX at a minimum? You can run grep -e "flags" /proc/cpuinfo | head -1 and check whether avx appears in the flags.
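
The flag check above can be sketched as a small helper that reports which extensions are absent from a flags line. The helper name `missing_flags` is hypothetical, and the list of extensions passed to it (`avx avx2 fma f16c`) is an assumption for illustration: the thread only names AVX as the minimum.

```shell
#!/usr/bin/env bash
# Report which of the wanted CPU extensions are absent from a flags line.
# $1: a "flags : ..." line as printed by /proc/cpuinfo; remaining args: flag names.
missing_flags() {
  local flags=" ${1#*:} "   # drop the "flags :" label, pad so only whole words match
  shift
  local f missing=""
  for f in "$@"; do
    case "$flags" in
      *" $f "*) ;;                 # flag is present
      *) missing="$missing $f" ;;  # flag is absent
    esac
  done
  echo "${missing# }"
}

# Usage on Linux; the flag list is an assumption, adjust it to the build in question.
if [ -r /proc/cpuinfo ]; then
  missing_flags "$(grep -m1 -e '^flags' /proc/cpuinfo)" avx avx2 fma f16c
fi
```

Matching on space-padded words keeps `avx` from being satisfied by `avx2`. An empty line of output means every listed flag was found.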

victor-rds commented 10 months ago

I'm getting the same error with Docker 24.0.7 on Ubuntu 20.04.

My CPU flags (AVX highlighted with quotes):

$ grep -e "flags" /proc/cpuinfo | head -1
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx
fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology
nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm
pcid sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave "avx" f16c rdrand lahf_lm cpuid_fault epb pti ssbd
ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid fsgsbase smep erms xsaveopt dtherm ida arat pln pts
md_clear flush_l1d
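
The flags above include avx and f16c but not avx2 or fma, so it helps to look at which instruction actually faulted. The instruction bytes reported in the logs below begin with 0xc4 0xe2 0x71 0xa9, a three-byte VEX prefix followed by an opcode. The sketch below decodes those prefix fields; the function name `decode_vex3` is hypothetical. By my reading of the Intel opcode tables, map 0F38 with a 66 prefix, W=0 and opcode 0xA9 is VFMADD213SS, an FMA instruction. Treat that as an unverified reading and confirm it with a disassembler.

```shell
#!/usr/bin/env bash
# Decode the three-byte VEX prefix (0xC4) at the start of an instruction.
# Args: the first four instruction bytes as hex, e.g. 0xc4 0xe2 0x71 0xa9.
decode_vex3() {
  local b0=$(( $1 )) b1=$(( $2 )) b2=$(( $3 )) op=$(( $4 ))
  if [ "$b0" -ne 196 ]; then   # 196 == 0xC4
    echo "not a 3-byte VEX prefix"
    return 1
  fi
  local map=$(( b1 & 0x1f ))        # opcode map: 1=0F, 2=0F38, 3=0F3A
  local w=$(( (b2 >> 7) & 1 ))      # VEX.W
  local l=$(( (b2 >> 2) & 1 ))      # VEX.L: 0=128-bit, 1=256-bit
  local pp=$(( b2 & 3 ))            # implied prefix: 0=none, 1=66, 2=F3, 3=F2
  printf 'map=%d W=%d L=%d pp=%d opcode=0x%02x\n' "$map" "$w" "$l" "$pp" "$op"
}

decode_vex3 0xc4 0xe2 0x71 0xa9
# prints: map=2 W=0 L=0 pp=1 opcode=0xa9
```

If that reading is right, a CPU with AVX but without FMA would still hit SIGILL on this build, which matches the flags listed here.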

Request:

$ curl http://mydomain/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{ "model": "llama-2-7b-chat", "messages": [{"role": "user", "content": "explain kubernetes in a sentence"}] }'

Response:

{"error":{"code":500,"message":"could not load model: rpc error: code = Unavailable desc = error reading from server: EOF","type":""}}

Complete Logs:

11:50AM DBG no galleries to load
11:50AM INF Starting LocalAI using 4 threads, with models path: /models
11:50AM INF LocalAI version: v2.0.0 (238fec244ae6c9a66bc7fafd76c7e14671110a6f)
11:50AM DBG Model: llama-2-7b-chat (config: {PredictionOptions:{Model:llama-2-7b-chat.Q4_K_M.gguf Language: N:0 TopP:0.7 TopK:80 Temperature:0.2 Maxtokens:0 Echo:false Batch:0 F16:false IgnoreEOS:false RepeatPenalty:0 Keep:0 MirostatETA:0 MirostatTAU:0 Mirostat:0 FrequencyPenalty:0 TFZ:0 TypicalP:0 Seed:0 NegativePrompt: RopeFreqBase:0 RopeFreqScale:0 NegativePromptScale:0 UseFastTokenizer:false ClipSkip:0 Tokenizer:} Name:llama-2-7b-chat F16:false Threads:0 Debug:false Roles:map[] Embeddings:false Backend:llama TemplateConfig:{Chat: ChatMessage: Completion: Edit: Functions:} PromptStrings:[] InputStrings:[] InputToken:[] functionCallString: functionCallNameString: FunctionsConfig:{DisableNoAction:false NoActionFunctionName: NoActionDescriptionName:} FeatureFlag:map[] LLMConfig:{SystemPrompt: TensorSplit: MainGPU: RMSNormEps:0 NGQA:0 PromptCachePath: PromptCacheAll:false PromptCacheRO:false MirostatETA:0 MirostatTAU:0 Mirostat:0 NGPULayers:0 MMap:false MMlock:false LowVRAM:false Grammar: StopWords:[] Cutstrings:[] TrimSpace:[] ContextSize:4096 NUMA:false LoraAdapter: LoraBase: LoraScale:0 NoMulMatQ:false DraftModel: NDraft:0 Quantization: MMProj: RopeScaling: YarnExtFactor:0 YarnAttnFactor:0 YarnBetaFast:0 YarnBetaSlow:0} AutoGPTQ:{ModelBaseName: Device: Triton:false UseFastTokenizer:false} Diffusers:{PipelineType: SchedulerType: CUDA:false EnableParameters: CFGScale:0 IMG2IMG:false ClipSkip:0 ClipModel: ClipSubFolder:} Step:0 GRPC:{Attempts:0 AttemptsSleepTime:0} VallE:{AudioPath:}})
11:50AM DBG Extracting backend assets files to /tmp/localai/backend_data

 ┌───────────────────────────────────────────────────┐ 
 │                   Fiber v2.50.0                   │ 
 │               http://127.0.0.1:8080               │ 
 │       (bound on host 0.0.0.0 and port 8080)       │ 
 │                                                   │ 
 │ Handlers ............ 74  Processes ........... 1 │ 
 │ Prefork ....... Disabled  PID ................. 1 │ 
 └───────────────────────────────────────────────────┘ 

11:52AM DBG Request received: 
11:52AM DBG Configuration read: &{PredictionOptions:{Model:llama-2-7b-chat.Q4_K_M.gguf Language: N:0 TopP:0.7 TopK:80 Temperature:0.2 Maxtokens:0 Echo:false Batch:0 F16:false IgnoreEOS:false RepeatPenalty:0 Keep:0 MirostatETA:0 MirostatTAU:0 Mirostat:0 FrequencyPenalty:0 TFZ:0 TypicalP:0 Seed:0 NegativePrompt: RopeFreqBase:0 RopeFreqScale:0 NegativePromptScale:0 UseFastTokenizer:false ClipSkip:0 Tokenizer:} Name:llama-2-7b-chat F16:false Threads:4 Debug:true Roles:map[] Embeddings:false Backend:llama TemplateConfig:{Chat: ChatMessage: Completion: Edit: Functions:} PromptStrings:[] InputStrings:[] InputToken:[] functionCallString: functionCallNameString: FunctionsConfig:{DisableNoAction:false NoActionFunctionName: NoActionDescriptionName:} FeatureFlag:map[] LLMConfig:{SystemPrompt: TensorSplit: MainGPU: RMSNormEps:0 NGQA:0 PromptCachePath: PromptCacheAll:false PromptCacheRO:false MirostatETA:0 MirostatTAU:0 Mirostat:0 NGPULayers:0 MMap:false MMlock:false LowVRAM:false Grammar: StopWords:[] Cutstrings:[] TrimSpace:[] ContextSize:4096 NUMA:false LoraAdapter: LoraBase: LoraScale:0 NoMulMatQ:false DraftModel: NDraft:0 Quantization: MMProj: RopeScaling: YarnExtFactor:0 YarnAttnFactor:0 YarnBetaFast:0 YarnBetaSlow:0} AutoGPTQ:{ModelBaseName: Device: Triton:false UseFastTokenizer:false} Diffusers:{PipelineType: SchedulerType: CUDA:false EnableParameters: CFGScale:0 IMG2IMG:false ClipSkip:0 ClipModel: ClipSubFolder:} Step:0 GRPC:{Attempts:0 AttemptsSleepTime:0} VallE:{AudioPath:}}
11:52AM DBG Parameters: &{PredictionOptions:{Model:llama-2-7b-chat.Q4_K_M.gguf Language: N:0 TopP:0.7 TopK:80 Temperature:0.2 Maxtokens:0 Echo:false Batch:0 F16:false IgnoreEOS:false RepeatPenalty:0 Keep:0 MirostatETA:0 MirostatTAU:0 Mirostat:0 FrequencyPenalty:0 TFZ:0 TypicalP:0 Seed:0 NegativePrompt: RopeFreqBase:0 RopeFreqScale:0 NegativePromptScale:0 UseFastTokenizer:false ClipSkip:0 Tokenizer:} Name:llama-2-7b-chat F16:false Threads:4 Debug:true Roles:map[] Embeddings:false Backend:llama TemplateConfig:{Chat: ChatMessage: Completion: Edit: Functions:} PromptStrings:[] InputStrings:[] InputToken:[] functionCallString: functionCallNameString: FunctionsConfig:{DisableNoAction:false NoActionFunctionName: NoActionDescriptionName:} FeatureFlag:map[] LLMConfig:{SystemPrompt: TensorSplit: MainGPU: RMSNormEps:0 NGQA:0 PromptCachePath: PromptCacheAll:false PromptCacheRO:false MirostatETA:0 MirostatTAU:0 Mirostat:0 NGPULayers:0 MMap:false MMlock:false LowVRAM:false Grammar: StopWords:[] Cutstrings:[] TrimSpace:[] ContextSize:4096 NUMA:false LoraAdapter: LoraBase: LoraScale:0 NoMulMatQ:false DraftModel: NDraft:0 Quantization: MMProj: RopeScaling: YarnExtFactor:0 YarnAttnFactor:0 YarnBetaFast:0 YarnBetaSlow:0} AutoGPTQ:{ModelBaseName: Device: Triton:false UseFastTokenizer:false} Diffusers:{PipelineType: SchedulerType: CUDA:false EnableParameters: CFGScale:0 IMG2IMG:false ClipSkip:0 ClipModel: ClipSubFolder:} Step:0 GRPC:{Attempts:0 AttemptsSleepTime:0} VallE:{AudioPath:}}
11:52AM DBG Prompt (before templating): explain kubernetes in a sentence
11:52AM DBG Template failed loading: failed loading a template for llama-2-7b-chat.Q4_K_M.gguf
11:52AM DBG Prompt (after templating): explain kubernetes in a sentence
11:52AM DBG Loading model llama from llama-2-7b-chat.Q4_K_M.gguf
11:52AM DBG Loading model in memory from file: /models/llama-2-7b-chat.Q4_K_M.gguf
11:52AM DBG Loading Model llama-2-7b-chat.Q4_K_M.gguf with gRPC (file: /models/llama-2-7b-chat.Q4_K_M.gguf) (backend: llama): {backendString:llama model:llama-2-7b-chat.Q4_K_M.gguf threads:4 assetDir:/tmp/localai/backend_data context:{emptyCtx:{}} gRPCOptions:0xc0002aa5a0 externalBackends:map[] grpcAttempts:20 grpcAttemptsDelay:2 singleActiveBackend:false parallelRequests:false}
11:52AM DBG Loading GRPC Process: /tmp/localai/backend_data/backend-assets/grpc/llama
11:52AM DBG GRPC Service for llama-2-7b-chat.Q4_K_M.gguf will be running at: '127.0.0.1:45427'
11:52AM DBG GRPC Service state dir: /tmp/go-processmanager1396330951
11:52AM DBG GRPC Service Started
rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing: dial tcp 127.0.0.1:45427: connect: connection refused"
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr 2023/12/15 11:52:34 gRPC Server listening at 127.0.0.1:45427
11:52AM DBG GRPC Service Ready
11:52AM DBG GRPC: Loading model with options: {state:{NoUnkeyedLiterals:{} DoNotCompare:[] DoNotCopy:[] atomicMessageInfo:<nil>} sizeCache:0 unknownFields:[] Model:llama-2-7b-chat.Q4_K_M.gguf ContextSize:4096 Seed:0 NBatch:512 F16Memory:false MLock:false MMap:false VocabOnly:false LowVRAM:false Embeddings:false NUMA:false NGPULayers:0 MainGPU: TensorSplit: Threads:4 LibrarySearchPath: RopeFreqBase:0 RopeFreqScale:0 RMSNormEps:0 NGQA:0 ModelFile:/models/llama-2-7b-chat.Q4_K_M.gguf Device: UseTriton:false ModelBaseName: UseFastTokenizer:false PipelineType: SchedulerType: CUDA:false CFGScale:0 IMG2IMG:false CLIPModel: CLIPSubfolder: CLIPSkip:0 Tokenizer: LoraBase: LoraAdapter: LoraScale:0 NoMulMatQ:false DraftModel: AudioPath: Quantization: MMProj: RopeScaling: YarnExtFactor:0 YarnAttnFactor:0 YarnBetaFast:0 YarnBetaSlow:0}
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr create_gpt_params: loading model /models/llama-2-7b-chat.Q4_K_M.gguf
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr SIGILL: illegal instruction
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr PC=0x86854d m=0 sigcode=2
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr signal arrived during cgo execution
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr instruction bytes: 0xc4 0xe2 0x71 0xa9 0x15 0xaa 0x79 0x23 0x0 0xc5 0xfa 0x11 0x4c 0x24 0x10 0xc5
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr 
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr goroutine 34 [syscall]:
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr runtime.cgocall(0x821ae0, 0xc0001554d8)
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr   /opt/hostedtoolcache/go/1.21.4/x64/src/runtime/cgocall.go:157 +0x4b fp=0xc0001554b0 sp=0xc000155478 pc=0x4176eb
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr github.com/go-skynet/go-llama%2ecpp._Cfunc_load_model(0x1501460, 0x1000, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x200, ...)
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr   _cgo_gotypes.go:266 +0x4f fp=0xc0001554d8 sp=0xc0001554b0 pc=0x8143af
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr github.com/go-skynet/go-llama%2ecpp.New({0xc000118000, 0x23}, {0xc000110240, 0x7, 0x926460?})
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr   /home/runner/work/LocalAI/LocalAI/sources/go-llama/llama.go:39 +0x385 fp=0xc0001556e8 sp=0xc0001554d8 pc=0x814da5
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr main.(*LLM).Load(0xc000012630, 0xc00014e000)
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr   /home/runner/work/LocalAI/LocalAI/backend/go/llm/llama/llama.go:87 +0xc9c fp=0xc000155900 sp=0xc0001556e8 pc=0x81ed1c
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr github.com/go-skynet/LocalAI/pkg/grpc.(*server).LoadModel(0xc000030d90, {0xc00014e000?, 0x50a886?}, 0x0?)
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr   /home/runner/work/LocalAI/LocalAI/pkg/grpc/server.go:50 +0xe6 fp=0xc0001559b0 sp=0xc000155900 pc=0x81c566
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr github.com/go-skynet/LocalAI/pkg/grpc/proto._Backend_LoadModel_Handler({0x997880?, 0xc000030d90}, {0xa7e610, 0xc00010e390}, 0xc000114100, 0x0)
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr   /home/runner/work/LocalAI/LocalAI/pkg/grpc/proto/backend_grpc.pb.go:264 +0x169 fp=0xc000155a08 sp=0xc0001559b0 pc=0x809829
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr google.golang.org/grpc.(*Server).processUnaryRPC(0xc0001a61e0, {0xa7e610, 0xc00010e2d0}, {0xa81b38, 0xc0001a11e0}, 0xc00013e000, 0xc0001aecc0, 0xd924b0, 0x0)
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr   /home/runner/go/pkg/mod/google.golang.org/grpc@v1.59.0/server.go:1343 +0xe03 fp=0xc000155df0 sp=0xc000155a08 pc=0x7f27c3
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr google.golang.org/grpc.(*Server).handleStream(0xc0001a61e0, {0xa81b38, 0xc0001a11e0}, 0xc00013e000)
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr   /home/runner/go/pkg/mod/google.golang.org/grpc@v1.59.0/server.go:1737 +0xc4c fp=0xc000155f78 sp=0xc000155df0 pc=0x7f772c
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr google.golang.org/grpc.(*Server).serveStreams.func1.1()
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr   /home/runner/go/pkg/mod/google.golang.org/grpc@v1.59.0/server.go:986 +0x86 fp=0xc000155fe0 sp=0xc000155f78 pc=0x7f06c6
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr runtime.goexit()
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr   /opt/hostedtoolcache/go/1.21.4/x64/src/runtime/asm_amd64.s:1650 +0x1 fp=0xc000155fe8 sp=0xc000155fe0 pc=0x47aa01
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr created by google.golang.org/grpc.(*Server).serveStreams.func1 in goroutine 13
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr   /home/runner/go/pkg/mod/google.golang.org/grpc@v1.59.0/server.go:997 +0x145
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr 
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr goroutine 1 [IO wait]:
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr runtime.gopark(0x4c6b50?, 0xc000197b28?, 0x78?, 0x7b?, 0x4e6edd?)
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr   /opt/hostedtoolcache/go/1.21.4/x64/src/runtime/proc.go:398 +0xce fp=0xc000197b08 sp=0xc000197ae8 pc=0x44be4e
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr runtime.netpollblock(0x478a72?, 0x416e86?, 0x0?)
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr   /opt/hostedtoolcache/go/1.21.4/x64/src/runtime/netpoll.go:564 +0xf7 fp=0xc000197b40 sp=0xc000197b08 pc=0x4448d7
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr internal/poll.runtime_pollWait(0x7f69fc110eb0, 0x72)
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr   /opt/hostedtoolcache/go/1.21.4/x64/src/runtime/netpoll.go:343 +0x85 fp=0xc000197b60 sp=0xc000197b40 pc=0x475925
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr internal/poll.(*pollDesc).wait(0xc0000ec680?, 0x4?, 0x0)
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr   /opt/hostedtoolcache/go/1.21.4/x64/src/internal/poll/fd_poll_runtime.go:84 +0x27 fp=0xc000197b88 sp=0xc000197b60 pc=0x4dfb47
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr internal/poll.(*pollDesc).waitRead(...)
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr   /opt/hostedtoolcache/go/1.21.4/x64/src/internal/poll/fd_poll_runtime.go:89
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr internal/poll.(*FD).Accept(0xc0000ec680)
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr   /opt/hostedtoolcache/go/1.21.4/x64/src/internal/poll/fd_unix.go:611 +0x2ac fp=0xc000197c30 sp=0xc000197b88 pc=0x4e502c
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr net.(*netFD).accept(0xc0000ec680)
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr   /opt/hostedtoolcache/go/1.21.4/x64/src/net/fd_unix.go:172 +0x29 fp=0xc000197ce8 sp=0xc000197c30 pc=0x640b09
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr net.(*TCPListener).accept(0xc0000744c0)
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr   /opt/hostedtoolcache/go/1.21.4/x64/src/net/tcpsock_posix.go:152 +0x1e fp=0xc000197d10 sp=0xc000197ce8 pc=0x657abe
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr net.(*TCPListener).Accept(0xc0000744c0)
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr   /opt/hostedtoolcache/go/1.21.4/x64/src/net/tcpsock.go:315 +0x30 fp=0xc000197d40 sp=0xc000197d10 pc=0x656c70
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr google.golang.org/grpc.(*Server).Serve(0xc0001a61e0, {0xa7dc20?, 0xc0000744c0})
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr   /home/runner/go/pkg/mod/google.golang.org/grpc@v1.59.0/server.go:852 +0x462 fp=0xc000197e80 sp=0xc000197d40 pc=0x7ef322
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr github.com/go-skynet/LocalAI/pkg/grpc.StartServer({0x7fff1a554de9?, 0xc000024160?}, {0xa82260?, 0xc000012630})
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr   /home/runner/work/LocalAI/LocalAI/pkg/grpc/server.go:178 +0x17d fp=0xc000197f10 sp=0xc000197e80 pc=0x81df5d
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr main.main()
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr   /home/runner/work/LocalAI/LocalAI/backend/go/llm/llama/main.go:20 +0x85 fp=0xc000197f40 sp=0xc000197f10 pc=0x8212c5
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr runtime.main()
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr   /opt/hostedtoolcache/go/1.21.4/x64/src/runtime/proc.go:267 +0x2bb fp=0xc000197fe0 sp=0xc000197f40 pc=0x44b9fb
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr runtime.goexit()
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr   /opt/hostedtoolcache/go/1.21.4/x64/src/runtime/asm_amd64.s:1650 +0x1 fp=0xc000197fe8 sp=0xc000197fe0 pc=0x47aa01
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr 
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr goroutine 2 [force gc (idle)]:
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr   /opt/hostedtoolcache/go/1.21.4/x64/src/runtime/proc.go:398 +0xce fp=0xc000054fa8 sp=0xc000054f88 pc=0x44be4e
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr runtime.goparkunlock(...)
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr   /opt/hostedtoolcache/go/1.21.4/x64/src/runtime/proc.go:404
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr runtime.forcegchelper()
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr   /opt/hostedtoolcache/go/1.21.4/x64/src/runtime/proc.go:322 +0xb3 fp=0xc000054fe0 sp=0xc000054fa8 pc=0x44bcd3
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr runtime.goexit()
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr   /opt/hostedtoolcache/go/1.21.4/x64/src/runtime/asm_amd64.s:1650 +0x1 fp=0xc000054fe8 sp=0xc000054fe0 pc=0x47aa01
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr created by runtime.init.6 in goroutine 1
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr   /opt/hostedtoolcache/go/1.21.4/x64/src/runtime/proc.go:310 +0x1a
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr 
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr goroutine 3 [GC sweep wait]:
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr   /opt/hostedtoolcache/go/1.21.4/x64/src/runtime/proc.go:398 +0xce fp=0xc000055778 sp=0xc000055758 pc=0x44be4e
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr runtime.goparkunlock(...)
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr   /opt/hostedtoolcache/go/1.21.4/x64/src/runtime/proc.go:404
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr runtime.bgsweep(0x0?)
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr   /opt/hostedtoolcache/go/1.21.4/x64/src/runtime/mgcsweep.go:280 +0x94 fp=0xc0000557c8 sp=0xc000055778 pc=0x437d54
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr runtime.gcenable.func1()
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr   /opt/hostedtoolcache/go/1.21.4/x64/src/runtime/mgc.go:200 +0x25 fp=0xc0000557e0 sp=0xc0000557c8 pc=0x42cf25
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr runtime.goexit()
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr   /opt/hostedtoolcache/go/1.21.4/x64/src/runtime/asm_amd64.s:1650 +0x1 fp=0xc0000557e8 sp=0xc0000557e0 pc=0x47aa01
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr created by runtime.gcenable in goroutine 1
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr   /opt/hostedtoolcache/go/1.21.4/x64/src/runtime/mgc.go:200 +0x66
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr 
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr goroutine 4 [GC scavenge wait]:
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr runtime.gopark(0xc00007e000?, 0xa76dc8?, 0x1?, 0x0?, 0xc0000071e0?)
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr   /opt/hostedtoolcache/go/1.21.4/x64/src/runtime/proc.go:398 +0xce fp=0xc000055f70 sp=0xc000055f50 pc=0x44be4e
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr runtime.goparkunlock(...)
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr   /opt/hostedtoolcache/go/1.21.4/x64/src/runtime/proc.go:404
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr runtime.(*scavengerState).park(0xddb960)
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr   /opt/hostedtoolcache/go/1.21.4/x64/src/runtime/mgcscavenge.go:425 +0x49 fp=0xc000055fa0 sp=0xc000055f70 pc=0x435629
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr runtime.bgscavenge(0x0?)
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr   /opt/hostedtoolcache/go/1.21.4/x64/src/runtime/mgcscavenge.go:653 +0x3c fp=0xc000055fc8 sp=0xc000055fa0 pc=0x435bbc
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr runtime.gcenable.func2()
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr   /opt/hostedtoolcache/go/1.21.4/x64/src/runtime/mgc.go:201 +0x25 fp=0xc000055fe0 sp=0xc000055fc8 pc=0x42cec5
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr runtime.goexit()
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr   /opt/hostedtoolcache/go/1.21.4/x64/src/runtime/asm_amd64.s:1650 +0x1 fp=0xc000055fe8 sp=0xc000055fe0 pc=0x47aa01
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr created by runtime.gcenable in goroutine 1
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr   /opt/hostedtoolcache/go/1.21.4/x64/src/runtime/mgc.go:201 +0xa5
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr 
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr goroutine 5 [finalizer wait]:
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr runtime.gopark(0x9c1d00?, 0x10044cf01?, 0x0?, 0x0?, 0x454005?)
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr   /opt/hostedtoolcache/go/1.21.4/x64/src/runtime/proc.go:398 +0xce fp=0xc000054628 sp=0xc000054608 pc=0x44be4e
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr runtime.runfinq()
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr   /opt/hostedtoolcache/go/1.21.4/x64/src/runtime/mfinal.go:193 +0x107 fp=0xc0000547e0 sp=0xc000054628 pc=0x42bfa7
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr runtime.goexit()
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr   /opt/hostedtoolcache/go/1.21.4/x64/src/runtime/asm_amd64.s:1650 +0x1 fp=0xc0000547e8 sp=0xc0000547e0 pc=0x47aa01
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr created by runtime.createfing in goroutine 1
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr   /opt/hostedtoolcache/go/1.21.4/x64/src/runtime/mfinal.go:163 +0x3d
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr 
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr goroutine 11 [select]:
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr runtime.gopark(0xc000129f00?, 0x2?, 0x0?, 0x0?, 0xc000129ecc?)
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr   /opt/hostedtoolcache/go/1.21.4/x64/src/runtime/proc.go:398 +0xce fp=0xc000129d78 sp=0xc000129d58 pc=0x44be4e
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr runtime.selectgo(0xc000129f00, 0xc000129ec8, 0xc000129ee8?, 0x0, 0x95f7a0?, 0x1)
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr   /opt/hostedtoolcache/go/1.21.4/x64/src/runtime/select.go:327 +0x725 fp=0xc000129e98 sp=0xc000129d78 pc=0x45b8a5
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr google.golang.org/grpc/internal/transport.(*controlBuffer).get(0xc0000885f0, 0x1)
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr   /home/runner/go/pkg/mod/google.golang.org/grpc@v1.59.0/internal/transport/controlbuf.go:418 +0x113 fp=0xc000129f30 sp=0xc000129e98 pc=0x768893
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr google.golang.org/grpc/internal/transport.(*loopyWriter).run(0xc000116070)
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr   /home/runner/go/pkg/mod/google.golang.org/grpc@v1.59.0/internal/transport/controlbuf.go:552 +0x86 fp=0xc000129f90 sp=0xc000129f30 pc=0x768fc6
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr google.golang.org/grpc/internal/transport.NewServerTransport.func2()
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr   /home/runner/go/pkg/mod/google.golang.org/grpc@v1.59.0/internal/transport/http2_server.go:336 +0xd5 fp=0xc000129fe0 sp=0xc000129f90 pc=0x77f815
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr runtime.goexit()
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr   /opt/hostedtoolcache/go/1.21.4/x64/src/runtime/asm_amd64.s:1650 +0x1 fp=0xc000129fe8 sp=0xc000129fe0 pc=0x47aa01
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr created by google.golang.org/grpc/internal/transport.NewServerTransport in goroutine 10
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr   /home/runner/go/pkg/mod/google.golang.org/grpc@v1.59.0/internal/transport/http2_server.go:333 +0x1acc
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr 
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr goroutine 12 [select]:
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr runtime.gopark(0xc000057f70?, 0x4?, 0x0?, 0xc1?, 0xc000057ec0?)
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr   /opt/hostedtoolcache/go/1.21.4/x64/src/runtime/proc.go:398 +0xce fp=0xc000057d28 sp=0xc000057d08 pc=0x44be4e
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr runtime.selectgo(0xc000057f70, 0xc000057eb8, 0x0?, 0x0, 0x0?, 0x1)
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr   /opt/hostedtoolcache/go/1.21.4/x64/src/runtime/select.go:327 +0x725 fp=0xc000057e48 sp=0xc000057d28 pc=0x45b8a5
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr google.golang.org/grpc/internal/transport.(*http2Server).keepalive(0xc0001a11e0)
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr   /home/runner/go/pkg/mod/google.golang.org/grpc@v1.59.0/internal/transport/http2_server.go:1152 +0x225 fp=0xc000057fc8 sp=0xc000057e48 pc=0x786ac5
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr google.golang.org/grpc/internal/transport.NewServerTransport.func4()
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr   /home/runner/go/pkg/mod/google.golang.org/grpc@v1.59.0/internal/transport/http2_server.go:339 +0x25 fp=0xc000057fe0 sp=0xc000057fc8 pc=0x77f705
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr runtime.goexit()
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr   /opt/hostedtoolcache/go/1.21.4/x64/src/runtime/asm_amd64.s:1650 +0x1 fp=0xc000057fe8 sp=0xc000057fe0 pc=0x47aa01
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr created by google.golang.org/grpc/internal/transport.NewServerTransport in goroutine 10
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr   /home/runner/go/pkg/mod/google.golang.org/grpc@v1.59.0/internal/transport/http2_server.go:339 +0x1b0e
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr 
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr goroutine 13 [IO wait]:
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr runtime.gopark(0x100000000?, 0xb?, 0x0?, 0x0?, 0x6?)
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr   /opt/hostedtoolcache/go/1.21.4/x64/src/runtime/proc.go:398 +0xce fp=0xc00006aaa0 sp=0xc00006aa80 pc=0x44be4e
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr runtime.netpollblock(0x4c4dd8?, 0x416e86?, 0x0?)
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr   /opt/hostedtoolcache/go/1.21.4/x64/src/runtime/netpoll.go:564 +0xf7 fp=0xc00006aad8 sp=0xc00006aaa0 pc=0x4448d7
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr internal/poll.runtime_pollWait(0x7f69fc110db8, 0x72)
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr   /opt/hostedtoolcache/go/1.21.4/x64/src/runtime/netpoll.go:343 +0x85 fp=0xc00006aaf8 sp=0xc00006aad8 pc=0x475925
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr internal/poll.(*pollDesc).wait(0xc0000ec800?, 0xc0001dc000?, 0x0)
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr   /opt/hostedtoolcache/go/1.21.4/x64/src/internal/poll/fd_poll_runtime.go:84 +0x27 fp=0xc00006ab20 sp=0xc00006aaf8 pc=0x4dfb47
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr internal/poll.(*pollDesc).waitRead(...)
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr   /opt/hostedtoolcache/go/1.21.4/x64/src/internal/poll/fd_poll_runtime.go:89
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr internal/poll.(*FD).Read(0xc0000ec800, {0xc0001dc000, 0x8000, 0x8000})
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr   /opt/hostedtoolcache/go/1.21.4/x64/src/internal/poll/fd_unix.go:164 +0x27a fp=0xc00006abb8 sp=0xc00006ab20 pc=0x4e0e3a
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr net.(*netFD).Read(0xc0000ec800, {0xc0001dc000?, 0x1060100000000?, 0x8?})
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr   /opt/hostedtoolcache/go/1.21.4/x64/src/net/fd_posix.go:55 +0x25 fp=0xc00006ac00 sp=0xc00006abb8 pc=0x63eae5
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr net.(*conn).Read(0xc000058310, {0xc0001dc000?, 0xc00006ac90?, 0x3?})
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr   /opt/hostedtoolcache/go/1.21.4/x64/src/net/net.go:179 +0x45 fp=0xc00006ac48 sp=0xc00006ac00 pc=0x64f1e5
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr net.(*TCPConn).Read(0x0?, {0xc0001dc000?, 0xc00006aca0?, 0x469d2d?})
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr   <autogenerated>:1 +0x25 fp=0xc00006ac78 sp=0xc00006ac48 pc=0x661985
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr bufio.(*Reader).Read(0xc000026ba0, {0xc0001da120, 0x9, 0xc1572d6105517ab1?})
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr   /opt/hostedtoolcache/go/1.21.4/x64/src/bufio/bufio.go:244 +0x197 fp=0xc00006acb0 sp=0xc00006ac78 pc=0x5b9f97
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr io.ReadAtLeast({0xa7b640, 0xc000026ba0}, {0xc0001da120, 0x9, 0x9}, 0x9)
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr   /opt/hostedtoolcache/go/1.21.4/x64/src/io/io.go:335 +0x90 fp=0xc00006acf8 sp=0xc00006acb0 pc=0x4befd0
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr io.ReadFull(...)
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr   /opt/hostedtoolcache/go/1.21.4/x64/src/io/io.go:354
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr golang.org/x/net/http2.readFrameHeader({0xc0001da120, 0x9, 0xc000218000?}, {0xa7b640?, 0xc000026ba0?})
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr   /home/runner/go/pkg/mod/golang.org/x/net@v0.17.0/http2/frame.go:237 +0x65 fp=0xc00006ad48 sp=0xc00006acf8 pc=0x755305
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr golang.org/x/net/http2.(*Framer).ReadFrame(0xc0001da0e0)
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr   /home/runner/go/pkg/mod/golang.org/x/net@v0.17.0/http2/frame.go:498 +0x85 fp=0xc00006adf0 sp=0xc00006ad48 pc=0x755a45
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr google.golang.org/grpc/internal/transport.(*http2Server).HandleStreams(0xc0001a11e0, 0x1?)
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr   /home/runner/go/pkg/mod/google.golang.org/grpc@v1.59.0/internal/transport/http2_server.go:636 +0x145 fp=0xc00006af00 sp=0xc00006adf0 pc=0x782965
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr google.golang.org/grpc.(*Server).serveStreams(0xc0001a61e0, {0xa81b38?, 0xc0001a11e0})
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr   /home/runner/go/pkg/mod/google.golang.org/grpc@v1.59.0/server.go:979 +0x1c2 fp=0xc00006af80 sp=0xc00006af00 pc=0x7f0462
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr google.golang.org/grpc.(*Server).handleRawConn.func1()
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr   /home/runner/go/pkg/mod/google.golang.org/grpc@v1.59.0/server.go:920 +0x45 fp=0xc00006afe0 sp=0xc00006af80 pc=0x7efcc5
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr runtime.goexit()
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr   /opt/hostedtoolcache/go/1.21.4/x64/src/runtime/asm_amd64.s:1650 +0x1 fp=0xc00006afe8 sp=0xc00006afe0 pc=0x47aa01
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr created by google.golang.org/grpc.(*Server).handleRawConn in goroutine 10
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr   /home/runner/go/pkg/mod/google.golang.org/grpc@v1.59.0/server.go:919 +0x185
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr 
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr rax    0x0
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr rbx    0xe4ffc0
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr rcx    0x18
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr rdx    0x1966912
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr rdi    0x1
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr rsi    0x7fff1a552a30
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr rbp    0xe6ffc0
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr rsp    0x7fff1a552a10
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr r8     0x7fff1a5ec080
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr r9     0xe7d7
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr r10    0x7fff1a5ec090
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr r11    0x1966912
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr r12    0xe8ffc0
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr r13    0xeaffc0
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr r14    0xe0ffc0
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr r15    0x0
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr rip    0x86854d
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr rflags 0x10202
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr cs     0x33
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr fs     0x0
11:52AM DBG GRPC(llama-2-7b-chat.Q4_K_M.gguf-127.0.0.1:45427): stderr gs     0x0
[192.168.136.2]:57080 500 - POST /v1/chat/completions

If you need more information, I'm glad to help.

sozercan commented 10 months ago

@victor-rds thanks for the report! Can you check whether the same output appears inside a container? Something like docker run ubuntu:22.04 grep -e "flags" /proc/cpuinfo | head -1

Another thing you can check is whether the same issue exists when using the LocalAI binary directly, from https://github.com/mudler/LocalAI/releases/tag/v2.0.0

victor-rds commented 10 months ago

The container shows avx:

docker run --rm -it ubuntu:22.04 /bin/grep -e "flags" /proc/cpuinfo | head -1
Unable to find image 'ubuntu:22.04' locally
22.04: Pulling from library/ubuntu
5e8117c0bd28: Already exists
Digest: sha256:8eab65df33a6de2844c9aefd19efe8ddb87b7df5e9185a4ab73af936225685bb
Status: Downloaded newer image for ubuntu:22.04
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx
fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology
nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid
sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave "avx" f16c rdrand lahf_lm cpuid_fault epb pti ssbd ibrs ibpb
stibp tpr_shadow vnmi flexpriority ept vpid fsgsbase smep erms xsaveopt dtherm ida arat pln pts md_clear flush_l1d

About the second request: the binaries are compiled against a version of GLIBC that's incompatible with my host, which runs an old version of Ubuntu. I haven't had time to fix that yet; I will try and send the results.
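
A GLIBC mismatch like the one described above can be checked before downloading a binary. The following is a minimal sketch, assuming a Linux host with glibc; the helper names and the `required` version passed to `meets_requirement` are hypothetical placeholders, since the thread does not state which GLIBC version the LocalAI release binaries need.

```python
import os
from typing import Optional, Tuple


def parse_glibc_version(text: str) -> Optional[Tuple[int, int]]:
    """Parse a string such as 'glibc 2.31' into (2, 31); None if it does not match."""
    parts = text.split()
    if len(parts) != 2 or parts[0] != "glibc":
        return None
    try:
        major, minor = parts[1].split(".")[:2]
        return int(major), int(minor)
    except ValueError:
        # Covers both non-numeric parts and a version with no minor component.
        return None


def host_glibc_version() -> Optional[Tuple[int, int]]:
    """Return the host's glibc version, or None when it cannot be determined."""
    try:
        return parse_glibc_version(os.confstr("CS_GNU_LIBC_VERSION"))
    except (AttributeError, ValueError, OSError):
        # Non-glibc or non-POSIX platforms do not expose this name.
        return None


def meets_requirement(host: Optional[Tuple[int, int]],
                      required: Tuple[int, int]) -> bool:
    """True if the host version is known and at least the required version."""
    return host is not None and host >= required
```

Calling `meets_requirement(host_glibc_version(), (2, 35))` would report whether the host is new enough for a binary needing 2.35; that number is only an example and should be replaced with whatever version the dynamic loader's error message names.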

that1guy commented 10 months ago

The flags are identical on my server and within Docker, and the avx flag is present.

Running grep -e "flags" /proc/cpuinfo | head -1 in Unraid console gives me:

flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx lahf_lm epb pti ssbd ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid xsaveopt dtherm ida arat pln pts md_clear flush_l1d

Docker command docker run ubuntu:22.04 grep -e "flags" /proc/cpuinfo | head -1 gives me:

Unable to find image 'ubuntu:22.04' locally
22.04: Pulling from library/ubuntu
5e8117c0bd28: Pulling fs layer
5e8117c0bd28: Verifying Checksum
5e8117c0bd28: Download complete
5e8117c0bd28: Pull complete
Digest: sha256:8eab65df33a6de2844c9aefd19efe8ddb87b7df5e9185a4ab73af936225685bb
Status: Downloaded newer image for ubuntu:22.04
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx lahf_lm epb pti ssbd ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid xsaveopt dtherm ida arat pln pts md_clear flush_l1d
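
Long flag lines like the ones above are hard to compare by eye. The sketch below parses cpuinfo text and reports which SIMD-related flags are present. The list of flags it looks for is an assumption chosen for illustration, not a documented LocalAI requirement. Both flag lines pasted in this thread list avx but not avx2; whether that is related to the crash is not established here.

```python
from typing import Dict, Iterable, Set

# Flags to look for. This selection is an assumption for illustration only;
# the thread does not say which extensions the backend actually needs.
SIMD_FLAGS = ("avx", "avx2", "avx512f", "fma", "f16c")


def parse_flags(cpuinfo_text: str) -> Set[str]:
    """Return the flags from the first 'flags' line of /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            # Strip stray quotes in case a flag was highlighted when pasted.
            return {flag.strip('"') for flag in value.split()}
    return set()


def simd_report(cpuinfo_text: str,
                wanted: Iterable[str] = SIMD_FLAGS) -> Dict[str, bool]:
    """Map each wanted flag name to whether it appears in the flags line."""
    flags = parse_flags(cpuinfo_text)
    return {name: name in flags for name in wanted}


def report_for_host(path: str = "/proc/cpuinfo") -> Dict[str, bool]:
    """Build the report for the current Linux host (or container)."""
    with open(path) as handle:
        return simd_report(handle.read())
```

Running `report_for_host()` on the host and again inside the container gives two small dictionaries that can be compared directly instead of diffing the full flag lines.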

that1guy commented 10 months ago

FYI - Opened a ticket in the LocalAI repo after fiddling with CPU flags and rebuilds. Still no luck. https://github.com/mudler/LocalAI/issues/1453

sozercan commented 10 months ago

Since you were able to repro with LocalAI locally, this is not an aikit-specific issue. I'll close this for now. If this is addressed in a future version of LocalAI, aikit will automatically get the update.