mudler / LocalAI

:robot: The free, Open Source alternative to OpenAI, Claude and others. Self-hosted and local-first. Drop-in replacement for OpenAI, running on consumer-grade hardware. No GPU required. Runs gguf, transformers, diffusers and many more model architectures. Features: Generate Text, Audio, Video, Images, Voice Cloning, Distributed inference
https://localai.io
MIT License

local-ai failed loading ggml-gpt4all-j model #181

Closed · yaakov-berkovitch closed this issue 11 months ago

yaakov-berkovitch commented 1 year ago

Hi, I'm running local-ai in Kubernetes and downloaded the model ggml-gpt4all-j the same way as explained here, but I get this error:

 ┌────────────────────────────────────────────────────┐ 
 │                   Fiber v2.44.0                    │ 
 │               http://127.0.0.1:8080                │ 
 │       (bound on host 0.0.0.0 and port 8080)        │ 
 │                                                    │ 
 │ Handlers ............ 12  Processes ........... 1  │ 
 │ Prefork ....... Disabled  PID .............. 2975  │ 
 └────────────────────────────────────────────────────┘ 

llama.cpp: loading model from /models/ggml-gpt4all-j.bin
error loading model: unexpectedly reached end of file
llama_init_from_file: failed to load model

I tried both the latest and the v1.6.1 quay.io/go-skynet/local-ai images and got the same error.

Any ideas?

fHachenberg commented 1 year ago

Corrupt file perhaps?

yaakov-berkovitch commented 1 year ago

@fHachenberg I checked the file's MD5 after the download and it was correct. So you mean it could already be corrupted at the source, https://gpt4all.io/models/ggml-gpt4all-j.bin ?
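
For reference, roughly what I ran to check it (a sketch; the value printed by md5sum is compared against the checksum published alongside the model, not one invented here):

# re-download with resume support, then recompute the MD5
wget -c https://gpt4all.io/models/ggml-gpt4all-j.bin -O /models/ggml-gpt4all-j.bin
md5sum /models/ggml-gpt4all-j.bin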

mudler commented 1 year ago

Can you run it in debug mode (--debug or DEBUG=true) and collect the logs again?
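
For example (a sketch; the image tag and volume paths are illustrative, not prescriptive):

# running the binary directly
./local-ai --debug --models-path ./models
# or, in a container, set the environment variable instead
docker run -e DEBUG=true -p 8080:8080 -v $PWD/models:/models quay.io/go-skynet/local-ai:latest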

kaitallaoua commented 1 year ago

I have the same issue. I'm running an Ubuntu (22.04) VM with the latest image (v1.13.0) in Docker (23.0.6) with Compose, no Kubernetes, following this guide. Here is my log:

chatbot-ui-api-1      |  ┌───────────────────────────────────────────────────┐ 
chatbot-ui-api-1      |  │                   Fiber v2.45.0                   │ 
chatbot-ui-api-1      |  │               http://127.0.0.1:8080               │ 
chatbot-ui-api-1      |  │       (bound on host 0.0.0.0 and port 8080)       │ 
chatbot-ui-api-1      |  │                                                   │ 
chatbot-ui-api-1      |  │ Handlers ............ 21  Processes ........... 1 │ 
chatbot-ui-api-1      |  │ Prefork ....... Disabled  PID .............. 1240 │ 
chatbot-ui-api-1      |  └───────────────────────────────────────────────────┘ 
chatbot-ui-api-1      | 
chatbot-ui-api-1      | [172.26.0.1]:54190  200  -  GET      /v1/models
chatbot-ui-api-1      | 12:47AM DBG Request received: {"model":"ggml-gpt4all-j","file":"","language":"","response_format":"","size":"","prompt":null,"instruction":"","input":null,"stop":null,"messages":[{"role":"user","content":"How are you?"}],"stream":false,"echo":false,"top_p":0,"top_k":0,"temperature":0.9,"max_tokens":0,"n":0,"batch":0,"f16":false,"ignore_eos":false,"repeat_penalty":0,"n_keep":0,"mirostat_eta":0,"mirostat_tau":0,"mirostat":0,"seed":0,"mode":0,"step":0}
chatbot-ui-api-1      | 12:47AM DBG Parameter Config: &{OpenAIRequest:{Model:ggml-gpt4all-j File: Language: ResponseFormat: Size: Prompt:<nil> Instruction: Input:<nil> Stop:<nil> Messages:[] Stream:false Echo:false TopP:0.7 TopK:80 Temperature:0.9 Maxtokens:512 N:0 Batch:0 F16:false IgnoreEOS:false RepeatPenalty:0 Keep:0 MirostatETA:0 MirostatTAU:0 Mirostat:0 Seed:0 Mode:0 Step:0} Name: StopWords:[] Cutstrings:[] TrimSpace:[] ContextSize:512 F16:false Threads:4 Debug:true Roles:map[] Embeddings:false Backend: TemplateConfig:{Completion: Chat: Edit:} MirostatETA:0 MirostatTAU:0 Mirostat:0 NGPULayers:0 ImageGenerationAssets: PromptStrings:[] InputStrings:[] InputToken:[]}
chatbot-ui-api-1      | 12:47AM DBG Template found, input modified to: The prompt below is a question to answer, a task to complete, or a conversation to respond to; decide which and write an appropriate response.
chatbot-ui-api-1      | ### Prompt:
chatbot-ui-api-1      | How are you?
chatbot-ui-api-1      | ### Response:
chatbot-ui-api-1      | 
chatbot-ui-api-1      | 12:47AM DBG Loading model 'ggml-gpt4all-j' greedly
chatbot-ui-api-1      | 12:47AM DBG [llama] Attempting to load
chatbot-ui-api-1      | 12:47AM DBG Loading model llama from ggml-gpt4all-j
chatbot-ui-api-1      | 12:47AM DBG Loading model in memory from file: /models/ggml-gpt4all-j
chatbot-ui-api-1      | llama.cpp: loading model from /models/ggml-gpt4all-j
chatbot-ui-api-1      | failed unexpectedly reached end of file12:48AM DBG [llama] Fails: failed loading model
chatbot-ui-api-1      | 12:48AM DBG [gpt4all-llama] Attempting to load
chatbot-ui-api-1      | 12:48AM DBG Loading model gpt4all-llama from ggml-gpt4all-j
chatbot-ui-api-1      | 12:48AM DBG Loading model in memory from file: /models/ggml-gpt4all-j
chatbot-ui-api-1      | SIGILL: illegal instruction
chatbot-ui-api-1      | PC=0xafaf64 m=0 sigcode=2
chatbot-ui-api-1      | signal arrived during cgo execution
chatbot-ui-api-1      | instruction bytes: 0xc4 0xc3 0x7d 0x39 0x86 0xd8 0x13 0x0 0x0 0x1 0x49 0x89 0x86 0xb8 0x14 0x0
chatbot-ui-api-1      | 
chatbot-ui-api-1      | goroutine 21 [syscall]:
chatbot-ui-api-1      | runtime.cgocall(0x9e8010, 0xc000168268)
chatbot-ui-api-1      |         /usr/local/go/src/runtime/cgocall.go:157 +0x5c fp=0xc000168240 sp=0xc000168208 pc=0x44a59c
chatbot-ui-api-1      | github.com/nomic-ai/gpt4all/gpt4all-bindings/golang._Cfunc_load_gptjllama_model(0x2ddea80, 0x4)
chatbot-ui-api-1      |         _cgo_gotypes.go:137 +0x4d fp=0xc000168268 sp=0xc000168240 pc=0x5889cd
chatbot-ui-api-1      | github.com/nomic-ai/gpt4all/gpt4all-bindings/golang.New({0xc0000d6558, 0x16}, {0xc0001c5830, 0x2, 0x1?})
chatbot-ui-api-1      |         /build/gpt4all/gpt4all-bindings/golang/gpt4all.go:35 +0x145 fp=0xc0001682c0 sp=0xc000168268 pc=0x588ce5
chatbot-ui-api-1      | github.com/go-skynet/LocalAI/pkg/model.gpt4allLM.func1({0xc0000d6558?, 0xc4edff?})
chatbot-ui-api-1      |         /build/pkg/model/initializers.go:110 +0x2a fp=0xc0001682f8 sp=0xc0001682c0 pc=0x608aaa
chatbot-ui-api-1      | github.com/go-skynet/LocalAI/pkg/model.(*ModelLoader).LoadModel(0xc0001609f0, {0xc0001ced40, 0xe}, 0xc0000af080)
chatbot-ui-api-1      |         /build/pkg/model/loader.go:127 +0x1fe fp=0xc0001683f0 sp=0xc0001682f8 pc=0x60aabe
chatbot-ui-api-1      | github.com/go-skynet/LocalAI/pkg/model.(*ModelLoader).BackendLoader(0xc0001609f0, {0xc3e0fd, 0xd}, {0xc0001ced40, 0xe}, {0xc0000aa608, 0x1, 0x1}, 0x4)
chatbot-ui-api-1      |         /build/pkg/model/initializers.go:150 +0x7d2 fp=0xc0001684b8 sp=0xc0001683f0 pc=0x6094b2
chatbot-ui-api-1      | github.com/go-skynet/LocalAI/pkg/model.(*ModelLoader).GreedyLoader(0xc0001609f0, {0xc0001ced40, 0xe}, {0xc0000aa608, 0x1, 0x1}, 0x0?)
chatbot-ui-api-1      |         /build/pkg/model/initializers.go:184 +0x3a5 fp=0xc000168600 sp=0xc0001684b8 pc=0x609ac5
chatbot-ui-api-1      | github.com/go-skynet/LocalAI/api.ModelInference({_, _}, _, {{{0xc0001ced40, 0xe}, {0x0, 0x0}, {0x0, 0x0}, {0x0, ...}, ...}, ...}, ...)
chatbot-ui-api-1      |         /build/api/prediction.go:218 +0x145 fp=0xc0001688b0 sp=0xc000168600 pc=0x944a45
chatbot-ui-api-1      | github.com/go-skynet/LocalAI/api.ComputeChoices({0xc0002860c0, 0xb6}, 0xc0000f6dc0, 0xc000282280, 0xc0001c52f0?, 0xc7a300, 0x4?)
chatbot-ui-api-1      |         /build/api/prediction.go:517 +0x138 fp=0xc000169060 sp=0xc0001688b0 pc=0x948118
chatbot-ui-api-1      | github.com/go-skynet/LocalAI/api.chatEndpoint.func2(0xc0000deb00)
chatbot-ui-api-1      |         /build/api/openai.go:361 +0x8ec fp=0xc000169220 sp=0xc000169060 pc=0x93fd0c
chatbot-ui-api-1      | github.com/gofiber/fiber/v2.(*App).next(0xc0000e3200, 0xc0000deb00)
chatbot-ui-api-1      |         /go/pkg/mod/github.com/gofiber/fiber/v2@v2.45.0/router.go:144 +0x1bf fp=0xc0001692c8 sp=0xc000169220 pc=0x8c4e5f
chatbot-ui-api-1      | github.com/gofiber/fiber/v2.(*Ctx).Next(0xc0001f2330?)
chatbot-ui-api-1      |         /go/pkg/mod/github.com/gofiber/fiber/v2@v2.45.0/ctx.go:913 +0x53 fp=0xc0001692e8 sp=0xc0001692c8 pc=0x8b0433
chatbot-ui-api-1      | github.com/gofiber/fiber/v2/middleware/cors.New.func1(0xc0000deb00)
chatbot-ui-api-1      |         /go/pkg/mod/github.com/gofiber/fiber/v2@v2.45.0/middleware/cors/cors.go:162 +0x3da fp=0xc0001693f0 sp=0xc0001692e8 pc=0x8cac7a
chatbot-ui-api-1      | github.com/gofiber/fiber/v2.(*Ctx).Next(0x0?)
chatbot-ui-api-1      |         /go/pkg/mod/github.com/gofiber/fiber/v2@v2.45.0/ctx.go:910 +0x43 fp=0xc000169410 sp=0xc0001693f0 pc=0x8b0423
chatbot-ui-api-1      | github.com/gofiber/fiber/v2/middleware/recover.New.func1(0x1?)
chatbot-ui-api-1      |         /go/pkg/mod/github.com/gofiber/fiber/v2@v2.45.0/middleware/recover/recover.go:43 +0xcb fp=0xc000169488 sp=0xc000169410 pc=0x8d18ab
chatbot-ui-api-1      | github.com/gofiber/fiber/v2.(*Ctx).Next(0xc000160a80?)
chatbot-ui-api-1      |         /go/pkg/mod/github.com/gofiber/fiber/v2@v2.45.0/ctx.go:910 +0x43 fp=0xc0001694a8 sp=0xc000169488 pc=0x8b0423
chatbot-ui-api-1      | github.com/gofiber/fiber/v2/middleware/logger.New.func3(0xc0000deb00)
chatbot-ui-api-1      |         /go/pkg/mod/github.com/gofiber/fiber/v2@v2.45.0/middleware/logger/logger.go:121 +0x395 fp=0xc000169b30 sp=0xc0001694a8 pc=0x8cc4f5
chatbot-ui-api-1      | github.com/gofiber/fiber/v2.(*App).next(0xc0000e3200, 0xc0000deb00)
chatbot-ui-api-1      |         /go/pkg/mod/github.com/gofiber/fiber/v2@v2.45.0/router.go:144 +0x1bf fp=0xc000169bd8 sp=0xc000169b30 pc=0x8c4e5f
chatbot-ui-api-1      | github.com/gofiber/fiber/v2.(*App).handler(0xc0000e3200, 0x4cf3b7?)
chatbot-ui-api-1      |         /go/pkg/mod/github.com/gofiber/fiber/v2@v2.45.0/router.go:171 +0x87 fp=0xc000169c38 sp=0xc000169bd8 pc=0x8c50a7
chatbot-ui-api-1      | github.com/gofiber/fiber/v2.(*App).handler-fm(0xc0001f2000?)
chatbot-ui-api-1      |         <autogenerated>:1 +0x2c fp=0xc000169c58 sp=0xc000169c38 pc=0x8ca2cc
chatbot-ui-api-1      | github.com/valyala/fasthttp.(*Server).serveConn(0xc0001b0400, {0xd0b240?, 0xc0000aa538})
chatbot-ui-api-1      |         /go/pkg/mod/github.com/valyala/fasthttp@v1.47.0/server.go:2365 +0x11d3 fp=0xc000169ec8 sp=0xc000169c58 pc=0x84b053
chatbot-ui-api-1      | github.com/valyala/fasthttp.(*Server).serveConn-fm({0xd0b240?, 0xc0000aa538?})
chatbot-ui-api-1      |         <autogenerated>:1 +0x39 fp=0xc000169ef0 sp=0xc000169ec8 pc=0x85a919
chatbot-ui-api-1      | github.com/valyala/fasthttp.(*workerPool).workerFunc(0xc0000bd900, 0xc0000aee40)
chatbot-ui-api-1      |         /go/pkg/mod/github.com/valyala/fasthttp@v1.47.0/workerpool.go:224 +0xa9 fp=0xc000169fa0 sp=0xc000169ef0 pc=0x856b49
chatbot-ui-api-1      | github.com/valyala/fasthttp.(*workerPool).getCh.func1()
chatbot-ui-api-1      |         /go/pkg/mod/github.com/valyala/fasthttp@v1.47.0/workerpool.go:196 +0x38 fp=0xc000169fe0 sp=0xc000169fa0 pc=0x8568b8
chatbot-ui-api-1      | runtime.goexit()
chatbot-ui-api-1      |         /usr/local/go/src/runtime/asm_amd64.s:1598 +0x1 fp=0xc000169fe8 sp=0xc000169fe0 pc=0x4ad1a1
chatbot-ui-api-1      | created by github.com/valyala/fasthttp.(*workerPool).getCh
chatbot-ui-api-1      |         /go/pkg/mod/github.com/valyala/fasthttp@v1.47.0/workerpool.go:195 +0x1b0
chatbot-ui-api-1      | 
chatbot-ui-api-1      | goroutine 1 [IO wait]:
chatbot-ui-api-1      | runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
chatbot-ui-api-1      |         /usr/local/go/src/runtime/proc.go:381 +0xd6 fp=0xc00018f3f8 sp=0xc00018f3d8 pc=0x47e396
chatbot-ui-api-1      | runtime.netpollblock(0x7fa47ec7e8a8?, 0x449c2f?, 0x0?)
chatbot-ui-api-1      |         /usr/local/go/src/runtime/netpoll.go:527 +0xf7 fp=0xc00018f430 sp=0xc00018f3f8 pc=0x476cf7
chatbot-ui-api-1      | internal/poll.runtime_pollWait(0x7fa456101b98, 0x72)
chatbot-ui-api-1      |         /usr/local/go/src/runtime/netpoll.go:306 +0x89 fp=0xc00018f450 sp=0xc00018f430 pc=0x4a7a49
chatbot-ui-api-1      | internal/poll.(*pollDesc).wait(0xc0000fed00?, 0x4?, 0x0)
chatbot-ui-api-1      |         /usr/local/go/src/internal/poll/fd_poll_runtime.go:84 +0x32 fp=0xc00018f478 sp=0xc00018f450 pc=0x51e7b2
chatbot-ui-api-1      | internal/poll.(*pollDesc).waitRead(...)
chatbot-ui-api-1      |         /usr/local/go/src/internal/poll/fd_poll_runtime.go:89
chatbot-ui-api-1      | internal/poll.(*FD).Accept(0xc0000fed00)
chatbot-ui-api-1      |         /usr/local/go/src/internal/poll/fd_unix.go:614 +0x2bd fp=0xc00018f520 sp=0xc00018f478 pc=0x5240bd
chatbot-ui-api-1      | net.(*netFD).accept(0xc0000fed00)
chatbot-ui-api-1      |         /usr/local/go/src/net/fd_unix.go:172 +0x35 fp=0xc00018f5d8 sp=0xc00018f520 pc=0x5a9855
chatbot-ui-api-1      | net.(*TCPListener).accept(0xc0000a8840)
chatbot-ui-api-1      |         /usr/local/go/src/net/tcpsock_posix.go:148 +0x25 fp=0xc00018f600 sp=0xc00018f5d8 pc=0x5bfc05
chatbot-ui-api-1      | net.(*TCPListener).Accept(0xc0000a8840)
chatbot-ui-api-1      |         /usr/local/go/src/net/tcpsock.go:297 +0x3d fp=0xc00018f630 sp=0xc00018f600 pc=0x5becfd
chatbot-ui-api-1      | github.com/valyala/fasthttp.acceptConn(0xc0001b0400, {0xd08860, 0xc0000a8840}, 0xc00018f828)
chatbot-ui-api-1      |         /go/pkg/mod/github.com/valyala/fasthttp@v1.47.0/server.go:1930 +0x62 fp=0xc00018f710 sp=0xc00018f630 pc=0x849522
chatbot-ui-api-1      | github.com/valyala/fasthttp.(*Server).Serve(0xc0001b0400, {0xd08860?, 0xc0000a8840})
chatbot-ui-api-1      |         /go/pkg/mod/github.com/valyala/fasthttp@v1.47.0/server.go:1823 +0x4f4 fp=0xc00018f858 sp=0xc00018f710 pc=0x848b34
chatbot-ui-api-1      | github.com/gofiber/fiber/v2.(*App).Listen(0xc0000e3200, {0xc34934?, 0x7?})
chatbot-ui-api-1      |         /go/pkg/mod/github.com/gofiber/fiber/v2@v2.45.0/listen.go:82 +0x110 fp=0xc00018f8b8 sp=0xc00018f858 pc=0x8bbf50
chatbot-ui-api-1      | main.main.func1(0xc00018fbc8?)
chatbot-ui-api-1      |         /build/main.go:97 +0x345 fp=0xc00018f9b8 sp=0xc00018f8b8 pc=0x9761c5
chatbot-ui-api-1      | github.com/urfave/cli/v2.(*Command).Run(0xc0001b8160, 0xc0000e4900, {0xc0000ae000, 0x2, 0x2})
chatbot-ui-api-1      |         /go/pkg/mod/github.com/urfave/cli/v2@v2.25.3/command.go:274 +0x9eb fp=0xc00018fc58 sp=0xc00018f9b8 pc=0x9640cb
chatbot-ui-api-1      | github.com/urfave/cli/v2.(*App).RunContext(0xc0001b4000, {0xd08bc8?, 0xc0000a0000}, {0xc0000ae000, 0x2, 0x2})
chatbot-ui-api-1      |         /go/pkg/mod/github.com/urfave/cli/v2@v2.25.3/app.go:332 +0x616 fp=0xc00018fcc8 sp=0xc00018fc58 pc=0x960ed6
chatbot-ui-api-1      | github.com/urfave/cli/v2.(*App).Run(...)
chatbot-ui-api-1      |         /go/pkg/mod/github.com/urfave/cli/v2@v2.25.3/app.go:309
chatbot-ui-api-1      | main.main()
chatbot-ui-api-1      |         /build/main.go:101 +0xbae fp=0xc00018ff80 sp=0xc00018fcc8 pc=0x975dae
chatbot-ui-api-1      | runtime.main()
chatbot-ui-api-1      |         /usr/local/go/src/runtime/proc.go:250 +0x207 fp=0xc00018ffe0 sp=0xc00018ff80 pc=0x47df67
chatbot-ui-api-1      | runtime.goexit()
chatbot-ui-api-1      |         /usr/local/go/src/runtime/asm_amd64.s:1598 +0x1 fp=0xc00018ffe8 sp=0xc00018ffe0 pc=0x4ad1a1
chatbot-ui-api-1      | 
chatbot-ui-api-1      | goroutine 2 [force gc (idle)]:
chatbot-ui-api-1      | runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
chatbot-ui-api-1      |         /usr/local/go/src/runtime/proc.go:381 +0xd6 fp=0xc00003efb0 sp=0xc00003ef90 pc=0x47e396
chatbot-ui-api-1      | runtime.goparkunlock(...)
chatbot-ui-api-1      |         /usr/local/go/src/runtime/proc.go:387
chatbot-ui-api-1      | runtime.forcegchelper()
chatbot-ui-api-1      |         /usr/local/go/src/runtime/proc.go:305 +0xb0 fp=0xc00003efe0 sp=0xc00003efb0 pc=0x47e1d0
chatbot-ui-api-1      | runtime.goexit()
chatbot-ui-api-1      |         /usr/local/go/src/runtime/asm_amd64.s:1598 +0x1 fp=0xc00003efe8 sp=0xc00003efe0 pc=0x4ad1a1
chatbot-ui-api-1      | created by runtime.init.6
chatbot-ui-api-1      |         /usr/local/go/src/runtime/proc.go:293 +0x25
chatbot-ui-api-1      | 
chatbot-ui-api-1      | goroutine 3 [GC sweep wait]:
chatbot-ui-api-1      | runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
chatbot-ui-api-1      |         /usr/local/go/src/runtime/proc.go:381 +0xd6 fp=0xc00003f780 sp=0xc00003f760 pc=0x47e396
chatbot-ui-api-1      | runtime.goparkunlock(...)
chatbot-ui-api-1      |         /usr/local/go/src/runtime/proc.go:387
chatbot-ui-api-1      | runtime.bgsweep(0x0?)
chatbot-ui-api-1      |         /usr/local/go/src/runtime/mgcsweep.go:278 +0x8e fp=0xc00003f7c8 sp=0xc00003f780 pc=0x46a5ae
chatbot-ui-api-1      | runtime.gcenable.func1()
chatbot-ui-api-1      |         /usr/local/go/src/runtime/mgc.go:178 +0x26 fp=0xc00003f7e0 sp=0xc00003f7c8 pc=0x45f866
chatbot-ui-api-1      | runtime.goexit()
chatbot-ui-api-1      |         /usr/local/go/src/runtime/asm_amd64.s:1598 +0x1 fp=0xc00003f7e8 sp=0xc00003f7e0 pc=0x4ad1a1
chatbot-ui-api-1      | created by runtime.gcenable
chatbot-ui-api-1      |         /usr/local/go/src/runtime/mgc.go:178 +0x6b
chatbot-ui-api-1      | 
chatbot-ui-api-1      | goroutine 4 [GC scavenge wait]:
chatbot-ui-api-1      | runtime.gopark(0xc000066000?, 0xd00e88?, 0x1?, 0x0?, 0x0?)
chatbot-ui-api-1      |         /usr/local/go/src/runtime/proc.go:381 +0xd6 fp=0xc00003ff70 sp=0xc00003ff50 pc=0x47e396
chatbot-ui-api-1      | runtime.goparkunlock(...)
chatbot-ui-api-1      |         /usr/local/go/src/runtime/proc.go:387
chatbot-ui-api-1      | runtime.(*scavengerState).park(0x112cc00)
chatbot-ui-api-1      |         /usr/local/go/src/runtime/mgcscavenge.go:400 +0x53 fp=0xc00003ffa0 sp=0xc00003ff70 pc=0x4684d3
chatbot-ui-api-1      | runtime.bgscavenge(0x0?)
chatbot-ui-api-1      |         /usr/local/go/src/runtime/mgcscavenge.go:628 +0x45 fp=0xc00003ffc8 sp=0xc00003ffa0 pc=0x468aa5
chatbot-ui-api-1      | runtime.gcenable.func2()
chatbot-ui-api-1      |         /usr/local/go/src/runtime/mgc.go:179 +0x26 fp=0xc00003ffe0 sp=0xc00003ffc8 pc=0x45f806
chatbot-ui-api-1      | runtime.goexit()
chatbot-ui-api-1      |         /usr/local/go/src/runtime/asm_amd64.s:1598 +0x1 fp=0xc00003ffe8 sp=0xc00003ffe0 pc=0x4ad1a1
chatbot-ui-api-1      | created by runtime.gcenable
chatbot-ui-api-1      |         /usr/local/go/src/runtime/mgc.go:179 +0xaa
chatbot-ui-api-1      | 
chatbot-ui-api-1      | goroutine 18 [finalizer wait]:
chatbot-ui-api-1      | runtime.gopark(0x1a0?, 0x112d8e0?, 0xe0?, 0x24?, 0xc00003e770?)
chatbot-ui-api-1      |         /usr/local/go/src/runtime/proc.go:381 +0xd6 fp=0xc00003e628 sp=0xc00003e608 pc=0x47e396
chatbot-ui-api-1      | runtime.runfinq()
chatbot-ui-api-1      |         /usr/local/go/src/runtime/mfinal.go:193 +0x107 fp=0xc00003e7e0 sp=0xc00003e628 pc=0x45e8a7
chatbot-ui-api-1      | runtime.goexit()
chatbot-ui-api-1      |         /usr/local/go/src/runtime/asm_amd64.s:1598 +0x1 fp=0xc00003e7e8 sp=0xc00003e7e0 pc=0x4ad1a1
chatbot-ui-api-1      | created by runtime.createfing
chatbot-ui-api-1      |         /usr/local/go/src/runtime/mfinal.go:163 +0x45
chatbot-ui-api-1      | 
chatbot-ui-api-1      | goroutine 19 [select]:
chatbot-ui-api-1      | runtime.gopark(0xc00003a720?, 0x2?, 0x0?, 0x0?, 0xc00003a67c?)
chatbot-ui-api-1      |         /usr/local/go/src/runtime/proc.go:381 +0xd6 fp=0xc000051cd0 sp=0xc000051cb0 pc=0x47e396
chatbot-ui-api-1      | runtime.selectgo(0xc000051f20, 0xc00003a678, 0x0?, 0x0, 0x0?, 0x1)
chatbot-ui-api-1      |         /usr/local/go/src/runtime/select.go:327 +0x7be fp=0xc000051e10 sp=0xc000051cd0 pc=0x48db7e
chatbot-ui-api-1      | github.com/go-skynet/LocalAI/api.(*galleryApplier).start.func1()
chatbot-ui-api-1      |         /build/api/gallery.go:57 +0xf7 fp=0xc000051fe0 sp=0xc000051e10 pc=0x93d6b7
chatbot-ui-api-1      | runtime.goexit()
chatbot-ui-api-1      |         /usr/local/go/src/runtime/asm_amd64.s:1598 +0x1 fp=0xc000051fe8 sp=0xc000051fe0 pc=0x4ad1a1
chatbot-ui-api-1      | created by github.com/go-skynet/LocalAI/api.(*galleryApplier).start
chatbot-ui-api-1      |         /build/api/gallery.go:55 +0xaa
chatbot-ui-api-1      | 
chatbot-ui-api-1      | goroutine 20 [sleep]:
chatbot-ui-api-1      | runtime.gopark(0x399d5370efb?, 0xc00003af88?, 0xa5?, 0xd8?, 0xc0000bd930?)
chatbot-ui-api-1      |         /usr/local/go/src/runtime/proc.go:381 +0xd6 fp=0xc00003af58 sp=0xc00003af38 pc=0x47e396
chatbot-ui-api-1      | time.Sleep(0x2540be400)
chatbot-ui-api-1      |         /usr/local/go/src/runtime/time.go:195 +0x135 fp=0xc00003af98 sp=0xc00003af58 pc=0x4aa015
chatbot-ui-api-1      | github.com/valyala/fasthttp.(*workerPool).Start.func2()
chatbot-ui-api-1      |         /go/pkg/mod/github.com/valyala/fasthttp@v1.47.0/workerpool.go:67 +0x56 fp=0xc00003afe0 sp=0xc00003af98 pc=0x856016
chatbot-ui-api-1      | runtime.goexit()
chatbot-ui-api-1      |         /usr/local/go/src/runtime/asm_amd64.s:1598 +0x1 fp=0xc00003afe8 sp=0xc00003afe0 pc=0x4ad1a1
chatbot-ui-api-1      | created by github.com/valyala/fasthttp.(*workerPool).Start
chatbot-ui-api-1      |         /go/pkg/mod/github.com/valyala/fasthttp@v1.47.0/workerpool.go:59 +0xdd
chatbot-ui-api-1      | 
chatbot-ui-api-1      | goroutine 6 [sleep]:
chatbot-ui-api-1      | runtime.gopark(0x39b9e71853d?, 0xb88a60?, 0xa0?, 0x25?, 0xc112240cfa32755e?)
chatbot-ui-api-1      |         /usr/local/go/src/runtime/proc.go:381 +0xd6 fp=0xc000052f88 sp=0xc000052f68 pc=0x47e396
chatbot-ui-api-1      | time.Sleep(0x3b9aca00)
chatbot-ui-api-1      |         /usr/local/go/src/runtime/time.go:195 +0x135 fp=0xc000052fc8 sp=0xc000052f88 pc=0x4aa015
chatbot-ui-api-1      | github.com/valyala/fasthttp.updateServerDate.func1()
chatbot-ui-api-1      |         /go/pkg/mod/github.com/valyala/fasthttp@v1.47.0/header.go:2247 +0x1e fp=0xc000052fe0 sp=0xc000052fc8 pc=0x856f9e
chatbot-ui-api-1      | runtime.goexit()
chatbot-ui-api-1      |         /usr/local/go/src/runtime/asm_amd64.s:1598 +0x1 fp=0xc000052fe8 sp=0xc000052fe0 pc=0x4ad1a1
chatbot-ui-api-1      | created by github.com/valyala/fasthttp.updateServerDate
chatbot-ui-api-1      |         /go/pkg/mod/github.com/valyala/fasthttp@v1.47.0/header.go:2245 +0x25
chatbot-ui-api-1      | 
chatbot-ui-api-1      | rax    0x2de30f8
chatbot-ui-api-1      | rbx    0xffffffff
chatbot-ui-api-1      | rcx    0x20
chatbot-ui-api-1      | rdx    0x4c46d8c
chatbot-ui-api-1      | rdi    0x2de3168
chatbot-ui-api-1      | rsi    0x4000000
chatbot-ui-api-1      | rbp    0x7ffd5010f4a0
chatbot-ui-api-1      | rsp    0x7ffd5010dfe0
chatbot-ui-api-1      | r8     0x2de1c10
chatbot-ui-api-1      | r9     0x7fa47ee85be0
chatbot-ui-api-1      | r10    0x4000000
chatbot-ui-api-1      | r11    0x0
chatbot-ui-api-1      | r12    0x7ffd5010f520
chatbot-ui-api-1      | r13    0x2de1bf0
chatbot-ui-api-1      | r14    0x2de1c10
chatbot-ui-api-1      | r15    0x16
chatbot-ui-api-1      | rip    0xafaf64
chatbot-ui-api-1      | rflags 0x10246
chatbot-ui-api-1      | cs     0x33
chatbot-ui-api-1      | fs     0x0
chatbot-ui-api-1      | gs     0x0
chatbot-ui-api-1 exited with code 2
tobievii commented 1 year ago

Same here

2023-05-20 13:21:40
2023-05-20 13:21:40  ┌───────────────────────────────────────────────────┐
2023-05-20 13:21:40  │                   Fiber v2.45.0                   │
2023-05-20 13:21:40  │               http://127.0.0.1:8080               │
2023-05-20 13:21:40  │       (bound on host 0.0.0.0 and port 8080)       │
2023-05-20 13:21:40  │                                                   │
2023-05-20 13:21:40  │ Handlers ............ 20  Processes ........... 1 │
2023-05-20 13:21:40  │ Prefork ....... Disabled  PID ................. 1 │
2023-05-20 13:21:40  └───────────────────────────────────────────────────┘
2023-05-20 13:21:40
2023-05-20 13:21:49 bert_load_from_file: loading model from '/models/bert' - please wait ...
2023-05-20 13:21:49 bert_load_from_file: n_vocab = 30522
2023-05-20 13:21:49 bert_load_from_file: n_max_tokens = 512
2023-05-20 13:21:49 bert_load_from_file: n_embd = 384
2023-05-20 13:21:49 bert_load_from_file: n_intermediate = 1536
2023-05-20 13:21:49 bert_load_from_file: n_head = 12
2023-05-20 13:21:49 bert_load_from_file: n_layer = 6
2023-05-20 13:21:49 bert_load_from_file: f16 = 2
2023-05-20 13:21:49 bert_load_from_file: ggml ctx size = 13.57 MB
2023-05-20 13:21:49 llama.cpp: loading model from /models/ggml-gpt4all-j
2023-05-20 13:21:50 failed unexpectedly reached end of filellama.cpp: loading model from /models/ggml-gpt4all-j
2023-05-20 13:21:50 bert_load_from_file: ............ done
2023-05-20 13:21:50 bert_load_from_file: model size = 13.55 MB / num tensors = 101
2023-05-20 13:21:50 bert_load_from_file: mem_per_token 452 KB, mem_per_input 248 MB
2023-05-20 13:21:50 loaded
2023-05-20 13:21:50 error loading model: unexpectedly reached end of file
2023-05-20 13:21:50 gptjllama_init_from_file: failed to load model
2023-05-20 13:21:50 LLAMA ERROR: failed to load model from /models/ggml-gpt4all-j
2023-05-20 13:21:50 mpt_model_load: loading model from '/models/ggml-gpt4all-j' - please wait ...
2023-05-20 13:21:50 mpt_model_load: invalid model file '/models/ggml-gpt4all-j' (bad magic)
2023-05-20 13:21:50 gptj_model_load: loading model from '/models/ggml-gpt4all-j' - please wait ...
2023-05-20 13:21:50 gptj_model_load: n_vocab = 50400
2023-05-20 13:21:50 gptj_model_load: n_ctx   = 2048
2023-05-20 13:21:50 gptj_model_load: n_embd  = 4096
2023-05-20 13:21:50 gptj_model_load: n_head  = 16
2023-05-20 13:21:50 gptj_model_load: n_layer = 28
2023-05-20 13:21:50 gptj_model_load: n_rot   = 64
2023-05-20 13:21:50 gptj_model_load: f16     = 2
2023-05-20 13:21:50 gptj_model_load: ggml ctx size = 5401.45 MB
2023-05-20 13:21:50 gptj_model_load: kv self size  = 896.00 MB
2023-05-20 13:21:51 gptj_model_load: ..... done
2023-05-20 13:21:51 gptj_model_load: model size = 613.52 MB / num tensors = 44

zdeneksvarc commented 1 year ago

Still no solution?

Aisuko commented 1 year ago

Hi guys, could you try the models in the model gallery? https://github.com/go-skynet/model-gallery

kaitallaoua commented 1 year ago

> Hi guys, could you try the models in the model gallery? https://github.com/go-skynet/model-gallery

I tried GPT4All-J following the model gallery guide and still have the same issue with the premade image. I also tried the latest avx Linux binary and it did nothing, no output. Given the SIGILL: illegal instruction error, I'm guessing my CPU is lacking support one way or another? I thought having AVX support was enough; is it missing something else? (See the sketch after the lscpu output below.) This is what I have:

Architecture:            x86_64
  CPU op-mode(s):        32-bit, 64-bit
  Address sizes:         46 bits physical, 48 bits virtual
  Byte Order:            Little Endian
CPU(s):                  32
  On-line CPU(s) list:   0-31
Vendor ID:               GenuineIntel
  Model name:            Intel(R) Xeon(R) CPU E5-2640 v2 @ 2.00GHz
    CPU family:          6
    Model:               62
    Thread(s) per core:  1
    Core(s) per socket:  16
    Socket(s):           2
    Stepping:            4
    BogoMIPS:            3999.99
    Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon rep_good nopl xtopology cpuid tsc_known_freq
                          pni pclmulqdq vmx ssse3 cx16 pdcm pcid sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm cpuid_fault pti ssbd ibrs ibpb stibp tpr_shadow vnmi flexpriority ept v
                         pid fsgsbase tsc_adjust smep erms xsaveopt arat umip md_clear arch_capabilities
Virtualization features: 
  Virtualization:        VT-x
  Hypervisor vendor:     KVM
  Virtualization type:   full
Caches (sum of all):     
  L1d:                   1 MiB (32 instances)
  L1i:                   1 MiB (32 instances)
  L2:                    128 MiB (32 instances)
  L3:                    32 MiB (2 instances)
NUMA:                    
  NUMA node(s):          1
  NUMA node0 CPU(s):     0-31
Vulnerabilities:         
  Itlb multihit:         Not affected
  L1tf:                  Mitigation; PTE Inversion; VMX flush not necessary, SMT disabled
  Mds:                   Mitigation; Clear CPU buffers; SMT Host state unknown
  Meltdown:              Mitigation; PTI
  Mmio stale data:       Unknown: No mitigations
  Retbleed:              Not affected
  Spec store bypass:     Mitigation; Speculative Store Bypass disabled via prctl and seccomp
  Spectre v1:            Mitigation; usercopy/swapgs barriers and __user pointer sanitization
  Spectre v2:            Mitigation; Retpolines, IBPB conditional, IBRS_FW, STIBP disabled, RSB filling, PBRSB-eIBRS Not affected
  Srbds:                 Not affected
  Tsx async abort:       Not affected
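
A quick way to see which SIMD extensions the kernel reports (a sketch; note that the flags above include avx and f16c but not avx2 or fma, which the prebuilt binaries are typically compiled to assume):

# print the SIMD-related CPU flags, deduplicated
grep -o 'avx[0-9a-z_]*\|fma\|f16c\|sse4_[12]' /proc/cpuinfo | sort -u
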
zdeneksvarc commented 1 year ago

> Hi guys, could you try the models in the model gallery? https://github.com/go-skynet/model-gallery

Sure, works well. Thank you for the curated collection of models 👍

> curl $LOCALAI/models/apply -H "Content-Type: application/json" -d '{ "url": "github:go-skynet/model-gallery/gpt4all-j.yaml", "name": "gpt4all-j" }'
> curl $LOCALAI/v1/chat/completions -H "Content-Type: application/json" -d '{ "model": "gpt4all-j",  "messages": [{"role": "user", "content": "How are you?"}], "temperature": 0.1 }' | jq

{
  "object": "chat.completion",
  "model": "gpt4all-j",
  "choices": [
    {
      "message": {
        "role": "assistant",
        "content": "I'm doing well, thank you. How about yourself? \n"
      }
    }
  ],
  "usage": {
    "prompt_tokens": 0,
    "completion_tokens": 0,
    "total_tokens": 0
  }
}
rjchicago commented 1 year ago

> Hi guys, could you try the models in the model gallery? https://github.com/go-skynet/model-gallery
>
> Sure, works well. Thank you for the curated collection of models 👍

This work-around worked for me as well.

contentfree commented 1 year ago

If I do the above (on Apple M1), here is what I get.

Running the /models/apply command above (after restarting the server) results in:

Client:

❯ curl $LOCALAI/models/apply -H "Content-Type: application/json" -d '{ "url": "github:go-skynet/model-gallery/gpt4all-j.yaml", "name": "gpt4all-j" }'
{"uuid":"f9cee4ae-0608-11ee-89fc-beb7270c46f7","status":"http://localhost:8080/models/jobs/f9cee4ae-0608-11ee-89fc-beb7270c46f7"}%

Server:

[127.0.0.1]:50261  200  -  POST     /models/apply
10:30AM DBG Checking "ggml-gpt4all-j" exists and matches SHA

Running the next command:

Client:

❯ curl -s $LOCALAI/v1/chat/completions -H "Content-Type: application/json" -d '{ "model": "gpt4all-j",  "messages": [{"role": "user", "content": "How are you?"}], "temperature": 0.1 }' | jq
{
  "error": {
    "code": 500,
    "message": "could not load model - all backends returned error: 11 errors occurred:\n\t* failed loading model\n\t* failed loading model\n\t* failed loading model\n\t* failed loading model\n\t* failed loading model\n\t* failed loading model\n\t* failed loading model\n\t* failed loading model\n\t* failed loading model\n\t* failed loading model\n\t* failed loading model\n\n",
    "type": ""
  }
}

Server:

10:25AM DBG Request received: {"model":"gpt4all-j","file":"","language":"","response_format":"","size":"","prompt":null,"instruction":"","input":null,"stop":null,"messages":[{"role":"user","content":"How are you?"}],"stream":false,"echo":false,"top_p":0,"top_k":0,"temperature":0.1,"max_tokens":0,"n":0,"batch":0,"f16":false,"ignore_eos":false,"repeat_penalty":0,"n_keep":0,"mirostat_eta":0,"mirostat_tau":0,"mirostat":0,"frequency_penalty":0,"tfz":0,"seed":0,"mode":0,"step":0}
10:25AM DBG Parameter Config: &{OpenAIRequest:{Model:gpt4all-j File: Language: ResponseFormat: Size: Prompt:<nil> Instruction: Input:<nil> Stop:<nil> Messages:[] Stream:false Echo:false TopP:0.7 TopK:80 Temperature:0.1 Maxtokens:512 N:0 Batch:0 F16:false IgnoreEOS:false RepeatPenalty:0 Keep:0 MirostatETA:0 MirostatTAU:0 Mirostat:0 FrequencyPenalty:0 TFZ:0 Seed:0 Mode:0 Step:0} Name: StopWords:[] Cutstrings:[] TrimSpace:[] ContextSize:512 F16:false Threads:4 Debug:true Roles:map[] Embeddings:false Backend: TemplateConfig:{Completion: Chat: Edit:} MirostatETA:0 MirostatTAU:0 Mirostat:0 NGPULayers:0 MMap:false MMlock:false TensorSplit: MainGPU: ImageGenerationAssets: PromptCachePath: PromptCacheAll:false PromptCacheRO:false PromptStrings:[] InputStrings:[] InputToken:[]}
10:25AM DBG Loading model 'gpt4all-j' greedly
10:25AM DBG [llama] Attempting to load
10:25AM DBG Loading model llama from gpt4all-j
10:25AM DBG Loading model in memory from file: models/gpt4all-j
error loading model: failed to open models/gpt4all-j: No such file or directory
llama_init_from_file: failed to load model
10:25AM DBG [llama] Fails: failed loading model
10:25AM DBG [gpt4all] Attempting to load
10:25AM DBG Loading model gpt4all from gpt4all-j
10:25AM DBG Loading model in memory from file: models/gpt4all-j
load_gpt4all_model: error 'No such file or directory'
10:25AM DBG [gpt4all] Fails: failed loading model
10:25AM DBG [gptneox] Attempting to load
10:25AM DBG Loading model gptneox from gpt4all-j
10:25AM DBG Loading model in memory from file: models/gpt4all-j
gpt_neox_model_load: loading model from 'models/gpt4all-j' - please wait ...
gpt_neox_model_load: failed to open 'models/gpt4all-j'
gpt_neox_bootstrap: failed to load model from 'models/gpt4all-j'
10:25AM DBG [gptneox] Fails: failed loading model
10:25AM DBG [bert-embeddings] Attempting to load
10:25AM DBG Loading model bert-embeddings from gpt4all-j
10:25AM DBG Loading model in memory from file: models/gpt4all-j
bert_load_from_file: loading model from 'models/gpt4all-j' - please wait ...
bert_load_from_file: failed to open 'models/gpt4all-j'
bert_bootstrap: failed to load model from 'models/gpt4all-j'
10:25AM DBG [bert-embeddings] Fails: failed loading model
10:25AM DBG [gptj] Attempting to load
10:25AM DBG Loading model gptj from gpt4all-j
10:25AM DBG Loading model in memory from file: models/gpt4all-j
gptj_model_load: loading model from 'models/gpt4all-j' - please wait ...
gptj_model_load: failed to open 'models/gpt4all-j'
gptj_bootstrap: failed to load model from 'models/gpt4all-j'
10:25AM DBG [gptj] Fails: failed loading model
10:25AM DBG [gpt2] Attempting to load
10:25AM DBG Loading model gpt2 from gpt4all-j
10:25AM DBG Loading model in memory from file: models/gpt4all-j
gpt2_model_load: loading model from 'models/gpt4all-j'
gpt2_model_load: failed to open 'models/gpt4all-j'
gpt2_bootstrap: failed to load model from 'models/gpt4all-j'
10:25AM DBG [gpt2] Fails: failed loading model
10:25AM DBG [dolly] Attempting to load
10:25AM DBG Loading model dolly from gpt4all-j
10:25AM DBG Loading model in memory from file: models/gpt4all-j
dollyv2_model_load: loading model from 'models/gpt4all-j' - please wait ...
dollyv2_model_load: failed to open 'models/gpt4all-j'
dolly_bootstrap: failed to load model from 'models/gpt4all-j'
10:25AM DBG [dolly] Fails: failed loading model
10:25AM DBG [falcon] Attempting to load
10:25AM DBG Loading model falcon from gpt4all-j
10:25AM DBG Loading model in memory from file: models/gpt4all-j
falcon_model_load: loading model from 'models/gpt4all-j' - please wait ...
falcon_model_load: failed to open 'models/gpt4all-j'
falcon_bootstrap: failed to load model from 'models/gpt4all-j'
10:25AM DBG [falcon] Fails: failed loading model
10:25AM DBG [mpt] Attempting to load
10:25AM DBG Loading model mpt from gpt4all-j
10:25AM DBG Loading model in memory from file: models/gpt4all-j
mpt_model_load: loading model from 'models/gpt4all-j' - please wait ...
mpt_model_load: failed to open 'models/gpt4all-j'
mpt_bootstrap: failed to load model from 'models/gpt4all-j'
10:25AM DBG [mpt] Fails: failed loading model
10:25AM DBG [replit] Attempting to load
10:25AM DBG Loading model replit from gpt4all-j
10:25AM DBG Loading model in memory from file: models/gpt4all-j
replit_model_load: loading model from 'models/gpt4all-j' - please wait ...
replit_model_load: failed to open 'models/gpt4all-j'
replit_bootstrap: failed to load model from 'models/gpt4all-j'
10:25AM DBG [replit] Fails: failed loading model
10:25AM DBG [starcoder] Attempting to load
10:25AM DBG Loading model starcoder from gpt4all-j
10:25AM DBG Loading model in memory from file: models/gpt4all-j
starcoder_model_load: loading model from 'models/gpt4all-j'
starcoder_model_load: failed to open 'models/gpt4all-j'
starcoder_bootstrap: failed to load model from 'models/gpt4all-j'
10:25AM DBG [starcoder] Fails: failed loading model
[127.0.0.1]:50174  500  -  POST     /v1/chat/completions

This error ("error loading model: failed to open models/gpt4all-j: No such file or directory") is accurate: the model file I actually have is named ggml-gpt4all-j.
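
A possible workaround sketch (hypothetical; it assumes the gallery job saved the file as ggml-gpt4all-j while the loader resolves the request name gpt4all-j to a file path, and the models directory is illustrative):

# point the name the loader looks for at the file that actually exists
ln -s ggml-gpt4all-j models/gpt4all-j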

Narcolapser commented 1 year ago

> Hi guys, could you try the models in the model gallery? https://github.com/go-skynet/model-gallery
>
> This work-around worked for me as well.

It doesn't seem to be working for me:

curl $LOCALAI/models/apply -H "Content-Type: application/json" -d '{ "url": "github:go-skynet/model-gallery/gpt4all-j.yaml", "name": "gpt4all-j" }'

{"uuid":"b73bf500-07ef-11ee-b26e-0242ac1a0002","status":"http://localhost:5006/models/jobs/b73bf500-07ef-11ee-b26e-0242ac1a0002"}

curl http://localhost:5006/models/jobs/b73bf500-07ef-11ee-b26e-0242ac1a0002

{"error":null,"processed":false,"message":"processing"}

I have otherwise followed the getting-started-with-GPT4All-J guide to the end: the first curl command, the one that lists the models, worked perfectly, while the second one caused the error in this issue. I restarted the container, ran the models curl command again to make sure everything was working, then ran:

export LOCALAI=http://localhost:8080
curl $LOCALAI/models/apply -H "Content-Type: application/json" -d '{ "url": "github:go-skynet/model-gallery/gpt4all-j.yaml", "name": "gpt4all-j" }'
> {"uuid":"9a348d50-07f3-11ee-a158-0242ac1a0002","status":"http://localhost:8080/models/jobs/9a348d50-07f3-11ee-a158-0242ac1a0002"}

curl http://localhost:8080/models/jobs/9a348d50-07f3-11ee-a158-0242ac1a0002
> {"error":null,"processed":true,"message":"completed"}

curl $LOCALAI/v1/chat/completions -H "Content-Type: application/json" -d '{ "model": "gpt4all-j",  "messages": [{"role": "user", "content": "How are you?"}], "temperature": 0.1 }' | jq
>  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
>                                 Dload  Upload   Total   Spent    Left  Speed
> 100   104    0     0  100   104      0   6100 --:--:-- --:--:-- --:--:--  6500
> curl: (52) Empty reply from server

Looking in the logs I now get this error:

SIGILL: illegal instruction
PC=0xb57885 m=3 sigcode=2
signal arrived during cgo execution
instruction bytes: 0xc4 0xe3 0x7d 0x39 0x84 0x24 0x18 0x2 0x0 0x0 0x1 0x48 0xc7 0x84 0x24 0xf8

Did I miss a step or something?

cstuart1310 commented 1 year ago

> It doesn't seem to be working for me: [...]
>
> Looking in the logs I now get this error:
>
> SIGILL: illegal instruction
> PC=0xb57885 m=3 sigcode=2
> signal arrived during cgo execution
> instruction bytes: 0xc4 0xe3 0x7d 0x39 0x84 0x24 0x18 0x2 0x0 0x0 0x1 0x48 0xc7 0x84 0x24 0xf8
>
> Did I miss a step or something?

It's because some instruction set used by the prebuilt binaries isn't supported by your CPU. I don't fully understand it either, but AVX is the key word. Try building locally with the optional instruction sets disabled, although it does make it considerably slower:

CMAKE_ARGS="-DLLAMA_F16C=OFF -DLLAMA_AVX512=OFF -DLLAMA_AVX2=OFF -DLLAMA_FMA=OFF" make build

See: https://github.com/go-skynet/LocalAI/issues/288#issuecomment-1580305158
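
A slightly fuller sketch of that local build (assumed flow; the --models-path flag name is taken from the project's CLI and may differ across versions):

git clone https://github.com/go-skynet/LocalAI
cd LocalAI
# disable the optional instruction sets that trigger SIGILL on older CPUs
CMAKE_ARGS="-DLLAMA_F16C=OFF -DLLAMA_AVX512=OFF -DLLAMA_AVX2=OFF -DLLAMA_FMA=OFF" make build
./local-ai --models-path ./models --debug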

antoniopresto commented 1 year ago

In my case the problem was just the name of the model. Removing the .bin extension solved it: ggml-gpt4all-j.bin -> ggml-gpt4all-j 😅
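
In shell terms (a sketch; the models path is illustrative):

mv models/ggml-gpt4all-j.bin models/ggml-gpt4all-j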

1-Felix commented 1 year ago

> In my case the problem was just the name of the model. Removing the .bin extension solved it: ggml-gpt4all-j.bin -> ggml-gpt4all-j 😅

This worked for me too. I'm on Windows, running the Docker container.

localai-bot commented 1 year ago

:warning::warning::warning::warning::warning:

Hi! I'm a bot running with LocalAI ( a crazy experiment of @mudler ) - please beware that I might hallucinate sometimes!

but... I can also be funny or helpful :smile_cat: and I can provide generally good tips on where to look in the documentation or in the code, based on what you wrote in the issue.

Don't engage in conversation with me, I don't support (yet) replying!

:warning::warning::warning::warning::warning:

The error message suggests that there might be an issue with the model file or the way it is loaded by LocalAI. It could also be related to the version of the model or the image used. To troubleshoot this issue, you can try the following steps:

  1. Check that the model file was correctly downloaded and stored in the /models directory. You can verify this by listing the directory contents with the kubectl exec command (see the sketch after this list).
  2. Ensure that the image used to run LocalAI is up-to-date and compatible with the model file. You can check the version of the image using the docker image command or the kubectl describe command for the pod.
  3. Check if the environment variables are correctly set in the YAML file. Ensure that the PRELOAD_MODELS variable is properly formatted and contains the correct URL to the model file.
  4. Try using a different model file or version of the image to see if the issue persists.
  5. If the issue still occurs, you can try filing an issue on the LocalAI GitHub repository or contacting the maintainers for further assistance.
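
For step 1, something like this (a sketch; the deployment name and namespace are illustrative):

kubectl exec -n localai deploy/local-ai -- ls -lh /models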


Aisuko commented 11 months ago

I am going to close this issue. Feel free to reopen it if you still have the same problem.