keldenl / gpt-llama.cpp

A llama.cpp drop-in replacement for OpenAI's GPT endpoints, allowing GPT-powered apps to run off local llama.cpp models instead of OpenAI.

Problems on Linux #12

atisharma opened this issue 1 year ago (status: Open)

atisharma commented 1 year ago

The first problem is that port 443 is usually reserved, so I edited index.js to use 8080.
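A rough sketch of that kind of edit, assuming index.js sets up an Express server with a hardcoded port (the actual file may look different):

// index.js (hypothetical sketch): bind to an unprivileged port instead of 443
import express from 'express';

const app = express();
const PORT = 8080; // was 443, which needs elevated privileges to bind on Linux

app.listen(PORT, () => {
  console.log(`Server is listening on localhost:${PORT}`);
});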

The next problem is that it crashes on the first request:

/src/gpt-llama.cpp > npm start

> gpt-llama.cpp@0.1.9 start
> node index.js

Server is listening on:
  - localhost:8080
  - 192.168.1.176:8080 (for other devices on the same network)
node:internal/errors:478
    ErrorCaptureStackTrace(err);
    ^

Error: spawn ENOTDIR
    at ChildProcess.spawn (node:internal/child_process:420:11)
    at spawn (node:child_process:733:9)
    at file:///src/gpt-llama.cpp/routes/chatRoutes.js:155:25
    at Layer.handle [as handle_request] (/src/gpt-llama.cpp/node_modules/express/lib/router/layer.js:95:5)
    at next (/src/gpt-llama.cpp/node_modules/express/lib/router/route.js:144:13)
    at Route.dispatch (/src/gpt-llama.cpp/node_modules/express/lib/router/route.js:114:3)
    at Layer.handle [as handle_request] (/src/gpt-llama.cpp/node_modules/express/lib/router/layer.js:95:5)
    at /src/gpt-llama.cpp/node_modules/express/lib/router/index.js:284:15
    at Function.process_params (/src/gpt-llama.cpp/node_modules/express/lib/router/index.js:346:12)
    at next (/src/gpt-llama.cpp/node_modules/express/lib/router/index.js:280:10) {
  errno: -20,
  code: 'ENOTDIR',
  syscall: 'spawn'
}
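For reference, spawn ENOTDIR generally means that a component of the path passed to child_process.spawn is a regular file rather than a directory, for instance when a command path is built by joining something onto a file. A minimal reproduction outside of gpt-llama.cpp, assuming /etc/hostname is a regular file as on most Linux systems:

import { spawn } from 'node:child_process';

// A path whose parent component is a regular file (here /etc/hostname)
// makes spawn throw synchronously with ENOTDIR (errno -20 on Linux),
// matching the crash in the stack trace above.
try {
  spawn('/etc/hostname/llama', ['--help']);
} catch (err) {
  console.error(err.code); // ENOTDIR
}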
keldenl commented 1 year ago

Did you properly set up the path to your model in your auth header? How are you sending your request?

atisharma commented 1 year ago

Yes, I believe so. I am testing with curl.

curl --location --request POST 'http://localhost:8080/v1/chat/completions' \
  --header 'Authorization: Bearer /sol/llm/LLaMA/7B/ggml-model-q4_0.bin' \
  --header 'Content-Type: application/json' \
  --data-raw '{
   "model": "gpt-3.5-turbo",
   "messages": [
      {
         "role": "system",
         "content": "You are ChatGPT, a helpful assistant developed by OpenAI."
      },
      {
         "role": "user",
         "content": "How are you doing today?"
      }
   ]
}'
curl: (52) Empty reply from server

and

$ ls /sol/llm/LLaMA/7B/ggml-model-q4_0.bin
/sol/llm/LLaMA/7B/ggml-model-q4_0.bin
cryptocake commented 1 year ago

Regarding port 443 being reserved (ports below 1024 require elevated privileges on Linux), you could do the following to let Node bind to them:

sudo setcap cap_net_bind_service=+ep `readlink -f \`which node\``

and then run npm start

keldenl commented 1 year ago

@atisharma can you try the new ./test-installation.sh script to help validate whether you're giving it the right path?

th-neu commented 1 year ago

#16 should fix the whole port issue. I have similar changes in a dev branch and they work fine.

atisharma commented 1 year ago

Still getting curl: (52) Empty reply from server. Your test script also hardcodes port 443, by the way.

keldenl commented 1 year ago

Good point @atisharma, let me update the script to be more flexible. And @th-neu, I'll take a quick look at your PR.

For now, @atisharma, what do you get if you change the script to your port? Does the curl command return the proper value?

atisharma commented 1 year ago

No, the empty reply was with 8080 set in both index.js and the test script.

keldenl commented 1 year ago

I just updated the test script and added support for setting the port – PORT=8080 npm run start should do the trick. Are you still running into the same issue?
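The server-side change presumably mirrors the earlier index.js sketch but reads the port from the environment, roughly like this (hypothetical, not necessarily the repo's actual code):

// Fall back to 443 unless PORT is set, e.g. via `PORT=8080 npm run start`
const PORT = process.env.PORT || 443;

with PORT then passed to app.listen() as before.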

keldenl commented 1 year ago

@atisharma are you still running into the same issue?