jchalex opened this issue 1 year ago
I think the process is hanging during SSR with tRPC while it tries to start the llama.cpp server. What environment variables are you using?
Can you also try setting NODE_ENV=development? It should enable some debug logging:
NODE_ENV=development HOST=0.0.0.0 pnpm run dev
But there is no log output either.
It seems that setWSState(ClientWSState.READY) is only invoked in Generate.tsx, so I added logging inside api.llama.subscription.useSubscription, in both onData and onError, but nothing is logged after starting the server and opening the page.
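Roughly, the logging I added looks like this (illustrative sketch only; the exact import path and subscription input in Generate.tsx may differ):

```ts
// Illustrative sketch of the logging described above, not the exact Generate.tsx code.
// Assumes `api` is the project's tRPC React client; the subscription input is a guess.
import { api } from "~/utils/api";

export function SubscriptionDebug() {
  api.llama.subscription.useSubscription(undefined, {
    onStarted: () => console.log("[llama] subscription started"),
    onData: (data) => console.log("[llama] onData:", data),
    onError: (err) => console.error("[llama] onError:", err),
  });

  return null;
}
```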
Can you try pulling the latest version and running that? (You will need to run pnpm install and pnpm build again.)
Can you also try running
./bin/main -l 3001 -m <path-to-ggml-model.bin>
and see whether the llama.cpp TCP server starts?
With the latest code I can now access the web UI, but after submitting a question I get the following errors:
...
[llama-tcp] start(): error: connect ECONNREFUSED ::1:61841 , retrying in 1s ( 2 /50)
..... done
llama_model_load: model size = 4017.27 MB / num tensors = 291
system_info: n_threads = 16 / 16 | AVX = 1 | AVX2 = 1 | AVX512 = 1 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 0 | SSE3 = 1 | VSX = 0 |
Listening on 127.0.0.1:61841
[llama-tcp] _createConnection(): error: connect ECONNREFUSED ::1:61841
[llama-tcp] start(): error: connect ECONNREFUSED ::1:61841 , retrying in 1s ( 3 /50)
[17:29:24 UTC] INFO: incoming request
reqId: "req-g"
req: {
"method": "GET",
"url": "/_next/webpack-hmr",
"hostname": "192.168.1.101:3000",
"remoteAddress": "192.168.1.103",
"remotePort": 31540
}
[17:29:24 UTC] INFO: request completed
reqId: "req-g"
res: {
"statusCode": 404
}
responseTime: 18.824049949645996
[llama-tcp] _createConnection(): error: connect ECONNREFUSED ::1:61841
[llama-tcp] start(): error: connect ECONNREFUSED ::1:61841 , retrying in 1s ( 4 /50)
[llama-tcp] _createConnection(): error: connect ECONNREFUSED ::1:61841
[llama-tcp] start(): error: connect ECONNREFUSED ::1:61841 , retrying in 1s ( 5 /50)
[llama-tcp] _createConnection(): error: connect ECONNREFUSED ::1:61841
[llama-tcp] start(): error: connect ECONNREFUSED ::1:61841 , retrying in 1s ( 6 /50)
[llama-tcp] _createConnection(): error: connect ECONNREFUSED ::1:61841
[llama-tcp] start(): error: connect ECONNREFUSED ::1:61841 , retrying in 1s ( 7 /50)
[llama-tcp] _createConnection(): error: connect ECONNREFUSED ::1:61841
[llama-tcp] start(): error: connect ECONNREFUSED ::1:61841 , retrying in 1s ( 8 /50)
[llama-tcp] _createConnection(): error: connect ECONNREFUSED ::1:61841
[llama-tcp] start(): error: connect ECONNREFUSED ::1:61841 , retrying in 1s ( 9 /50)
[17:29:29 UTC] INFO: incoming request
reqId: "req-h"
req: {
"method": "GET",
"url": "/_next/webpack-hmr",
"hostname": "192.168.1.101:3000",
"remoteAddress": "192.168.1.103",
"remotePort": 31592
}
...
The above output is from starting with the start command:
HOST=0.0.0.0 pnpm start
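One thing I notice in the log: the server reports Listening on 127.0.0.1:61841, but every connection attempt fails with ECONNREFUSED ::1:61841, i.e. IPv6 localhost. As a quick sanity check, something like this (hypothetical snippet, not the playground's client code) can confirm whether the port is reachable over IPv4:

```ts
// Hypothetical IPv4 reachability check against the llama.cpp tcp_server.
// Port taken from the log above; with LLAMA_SERVER_PORT=auto it changes on every run.
import net from "node:net";

const socket = net.createConnection({ host: "127.0.0.1", port: 61841, family: 4 }, () => {
  console.log("connected over IPv4");
  socket.end();
});

socket.on("error", (err) => console.error("connection failed:", err.message));
```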
If I run
./bin/main -l 3001 -m <path-to-ggml-model.bin>
manually, llama.cpp starts:
...
system_info: n_threads = 16 / 16 | AVX = 1 | AVX2 = 1 | AVX512 = 1 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 0 | SSE3 = 1 | VSX = 0 |
Listening on 127.0.0.1:3001
Then I change .env to:
# The web-server's HOST and PORT
HOST=127.0.0.1
PORT=3000
# Disable this if you want to run llama.cpp#tcp_server on your own
# Uses the binary path set below
USE_BUILT_IN_LLAMA_SERVER=false
# Binary location for llama.cpp#tcp_server
# Auto will automatically pull and build the latest version
# Requires build-essential (or equivalent) and make
# This does nothing if USE_BUILT_IN_LLAMA_SERVER is disabled
LLAMA_TCP_BIN=auto
# If USE_BUILT_IN_LLAMA_SERVER is disabled, enter the llama.cpp#tcp_server tcp details here
# Otherwise, this app will automatically start a llama.cpp#tcp_server server
# If port is set to auto, it will listen on a random open port
LLAMA_SERVER_HOST=127.0.0.1
LLAMA_SERVER_PORT=3001
# The path to a model's .bin file
LLAMA_MODEL_PATH=/data/github.com/ggerganov/llama.cpp/models/7B/ggml-model-q4_0.bin
and start with either
HOST=0.0.0.0 pnpm start
or
HOST=0.0.0.0 pnpm run dev
I get the same error from both:
# HOST=0.0.0.0 pnpm start
> llama-playground@1.0.0 start /data/github.com/ItzDerock/llama-playground
> node --enable-source-maps ./dist/index.js
❌ Invalid environment variables: { LLAMA_SERVER_PORT: [ 'Invalid input' ] }
/data/github.com/ItzDerock/llama-playground/src/env.mjs:90
}
^
Error: Invalid environment variables
at Object.<anonymous> (/data/github.com/ItzDerock/llama-playground/src/env.mjs:90:1)
at Module._compile (node:internal/modules/cjs/loader:1226:14)
at Module._extensions..js (node:internal/modules/cjs/loader:1280:10)
at Module.load (node:internal/modules/cjs/loader:1089:32)
at Module._load (node:internal/modules/cjs/loader:930:12)
at Function.executeUserEntryPoint [as runMain] (node:internal/modules/run_main:81:12)
at node:internal/main/run_main_module:23:47
# HOST=0.0.0.0 pnpm run dev
> llama-playground@1.0.0 dev /data/github.com/ItzDerock/llama-playground
> npm-run-all build:server dev:run
> llama-playground@1.0.0 build:server /data/github.com/ItzDerock/llama-playground
> tsup
CLI Building entry: src/server/index.ts
CLI Using tsconfig: tsconfig.json
CLI tsup v6.7.0
CLI Using tsup config: /data/github.com/ItzDerock/llama-playground/tsup.config.ts
CLI Target: es2017
CLI Cleaning output folder
CJS Build start
CJS dist/index.js 17.75 KB
CJS dist/index.js.map 39.69 KB
CJS ⚡️ Build success in 155ms
> llama-playground@1.0.0 dev:run /data/github.com/ItzDerock/llama-playground
> cross-env DEBUG=true REACT_EDITOR=code NODE_ENV=development RECOIL_DUPLICATE_ATOM_KEY_CHECKING_ENABLED=false node --enable-source-maps dist
❌ Invalid environment variables: { LLAMA_SERVER_PORT: [ 'Invalid input' ] }
/data/github.com/ItzDerock/llama-playground/src/env.mjs:90
}
^
Error: Invalid environment variables
at Object.<anonymous> (/data/github.com/ItzDerock/llama-playground/src/env.mjs:90:1)
at Module._compile (node:internal/modules/cjs/loader:1226:14)
at Module._extensions..js (node:internal/modules/cjs/loader:1280:10)
at Module.load (node:internal/modules/cjs/loader:1089:32)
at Module._load (node:internal/modules/cjs/loader:930:12)
at Function.executeUserEntryPoint [as runMain] (node:internal/modules/run_main:81:12)
at node:internal/main/run_main_module:23:47
Can you try using the following env:
# The web-server's HOST and PORT
HOST=127.0.0.1
PORT=3000
# Disable this if you want to run llama.cpp#tcp_server on your own
# Uses the binary path set below
USE_BUILT_IN_LLAMA_SERVER=true
# Binary location for llama.cpp#tcp_server
# Auto will automatically pull and build the latest version
# Requires build-essential (or equivalent) and make
# This does nothing if USE_BUILT_IN_LLAMA_SERVER is disabled
LLAMA_TCP_BIN=auto
# If USE_BUILT_IN_LLAMA_SERVER is disabled, enter the llama.cpp#tcp_server tcp details here
# Otherwise, this app will automatically start a llama.cpp#tcp_server server
# If port is set to auto, it will listen on a random open port
LLAMA_SERVER_HOST=127.0.0.1
LLAMA_SERVER_PORT=auto
# The path to a model's .bin file
LLAMA_MODEL_PATH=/data/github.com/ggerganov/llama.cpp/models/7B/ggml-model-q4_0.bin
I also just pushed an update to main that should fix your Invalid input error.
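For reference, a validation along these lines accepts either a numeric port or the literal auto (illustrative zod sketch only, not necessarily the exact schema in src/env.mjs):

```ts
// Illustrative sketch only; the real schema in src/env.mjs may differ.
import { z } from "zod";

const llamaServerPort = z.union([
  z.literal("auto"),
  z.coerce.number().int().min(1).max(65535),
]);

console.log(llamaServerPort.safeParse("auto"));     // { success: true, data: "auto" }
console.log(llamaServerPort.safeParse("3001"));     // { success: true, data: 3001 }
console.log(llamaServerPort.safeParse("notaport")); // { success: false, ... }
```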
Just need to wait 20+ min.