withcatai / node-llama-cpp

Run AI models locally on your machine with node.js bindings for llama.cpp. Force a JSON schema on the model output on the generation level
https://withcatai.github.io/node-llama-cpp/
MIT License

Fail to run in docker image #160

Closed igorjos closed 4 months ago

igorjos commented 5 months ago

Issue description

Non-descriptive error when running in a Docker image

Expected Behavior

The Node.js bindings for llama.cpp should work when dockerized (node:18-alpine image).

Actual Behavior

The Node.js example provided in the repo's MD file does not work when put in a Docker container. When the same code is executed locally everything works, but when it runs through Docker it fails.

Docker file:

FROM node:18-alpine as install
LABEL state=llamainstall

WORKDIR /build

# Install dependencies
COPY . .
RUN npm i --omit=dev

FROM node:18-alpine as release
WORKDIR /app

# Copy the app
COPY --from=install /build .

RUN rm -rf /build

ARG model_path

ENV HOST=0.0.0.0
ENV PORT=9889
ENV NODE_ENV=production
ENV MODEL_PATH=${model_path}
ENV DATABASE_URL=''

# Expose the app port
EXPOSE ${PORT}

# Start the app
CMD npm start

Error returned

npm ERR! path /app
npm ERR! command failed
npm ERR! signal SIGSEGV
npm ERR! command sh -c node test.js

The model file is mounted into the Docker container correctly and has the proper access permissions, but the app won't run.

If I disable the llama script and just run everything else as is, it works properly.

The place where it fails:

new LlamaModel({
    modelPath: __dirname
});

Steps to reproduce

My Environment

Dependency               Version
Operating System
CPU                      Apple M1 and Intel Core i5 9th gen
Node.js version          18.0
Typescript version       not used
node-llama-cpp version   current/latest

Additional Context

No response

Relevant Features Used

Are you willing to resolve this issue by submitting a Pull Request?

No, I don’t have the time and I’m okay to wait for the community / maintainers to resolve this issue.

mrddter commented 5 months ago

Just out of curiosity, have you tried using the Docker version of node:18 (non-alpine)?

giladgd commented 5 months ago

@igorjos From the details you provided, I can easily spot this issue: modelPath: __dirname - you're trying to load a folder instead of providing a path to a specific model file.
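
For example, a minimal sketch of pointing modelPath at an actual model file with the v2 API (the models folder and the .gguf file name below are just placeholders):

import {fileURLToPath} from "url";
import path from "path";
import {LlamaModel} from "node-llama-cpp";

const __dirname = path.dirname(fileURLToPath(import.meta.url));

const model = new LlamaModel({
    // modelPath must point to a specific .gguf model file, not a directory
    modelPath: path.join(__dirname, "models", "model.gguf") // placeholder file name
});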

Also, please provide the specific node-llama-cpp version you are using.

wiwiwuwuwa commented 5 months ago

@igorjos, I resolved this issue by using an Ubuntu distro instead of Alpine. Perhaps this is related to the prebuilt binaries.

igorjos commented 5 months ago

Hi @giladgd, I'm using v2.8.6. The path is correct within the Docker container and is accessible; I tried simple operations on it (e.g. checking that the file exists, opening the file) and they worked fine, but llama fails to open it.

@mrddter I did try the full node image without alpine as well; it didn't work. Same result with node:20 and node:22.

@wiwiwuwuwa I didn't try a full Ubuntu image, but that's a big overhead in this case: instead of a ~200MB image plus the model loaded from an external path, I would have a 10GB+ image. But I agree it is one of the possible solutions.

Additionally, I tried increasing the memory and swap limits for the Docker container to 24GB RAM and 200GB swap, but that didn't help either.

I can't find anything in the logs that points to the issue; I just get the error mentioned in my original comment.

My assumption would be that it requires certain C libraries to be installed in order to run, as per @wiwiwuwuwa's comment, given that it works with the Ubuntu image.

giladgd commented 4 months ago

@igorjos I found the issue and included the fix in #175; I'll release it in the next beta version.

From my investigation, there's no benefit to using Alpine Linux together with a GPU; it only makes it more complicated to get everything working with GPU drivers, due to linker differences and other differences in this distro. Also, since running an LLM is not a lightweight task, a more robust image with optimizations targeted at long-running workloads rather than only at startup time gives better overall performance, so it would be a better choice.

I recommend using the default node image without -alpine, for example node:20 or node:lts.
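
For illustration, a minimal sketch of the same multi-stage Dockerfile rebased on the default Debian-based node image (the port is carried over from the original; the app-specific ARG/ENV lines are omitted as placeholders):

FROM node:20 as install
WORKDIR /build

# Install dependencies
COPY . .
RUN npm i --omit=dev

FROM node:20 as release
WORKDIR /app

# Copy the app
COPY --from=install /build .

ENV HOST=0.0.0.0
ENV PORT=9889
ENV NODE_ENV=production

# Expose the app port
EXPOSE ${PORT}

# Start the app
CMD npm start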

When version 3.0 is officially released, I'll include more in-depth explanations and examples of how to run node-llama-cpp in a docker image.

github-actions[bot] commented 4 months ago

🎉 This issue has been resolved in version 3.0.0-beta.13 🎉

The release is available on:

Your semantic-release bot 📦🚀