SciSharp / LLamaSharp

A C#/.NET library to run LLM (🦙LLaMA/LLaVA) on your local device efficiently.
https://scisharp.github.io/LLamaSharp
MIT License

Issue running LLama.Web in linux container - OSX M1 #342

Open noli44 opened 1 year ago

noli44 commented 1 year ago

I am running in osx on an M1 and have been able to run the Web project without any issues. The problem arises when it's run within a container.

I see the following failure both when I use the bundled build runtimes and when I build llama.cpp myself:

LLama.Native.NativeApi.llama_max_devices()
LLama.Abstractions.TensorSplitsCollection..ctor() in IModelParams.cs
LLama.Web.Common.ModelOptions..ctor() in ModelOptions.cs
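For context on the frames above: llama_max_devices() is a P/Invoke into the native libllama.so, so a failure at this point usually means the native library was missing or could not be loaded inside the container. A quick diagnostic is to run ldd against the library and look for "not found" lines; the sketch below demonstrates the technique against /bin/sh (inside the container you would point it at the copied libllama.so instead, e.g. /app/runtimes/build/llama.cpp/libllama.so):

```shell
# List a binary's shared-library dependencies; any "not found" line
# names a missing package. /bin/sh is used here only to show the
# command -- substitute the path to libllama.so in the container.
ldd /bin/sh
```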

Dockerfile in LLama.Web project

FROM ubuntu:22.04 AS build-llama

RUN apt-get update \
    && apt-get upgrade -y \
    && apt-get install -y build-essential git wget cmake

RUN git clone https://github.com/ggerganov/llama.cpp.git

WORKDIR /llama.cpp/build

RUN cmake .. -DLLAMA_STATIC=Off -DBUILD_SHARED_LIBS=On && make

# https://hub.docker.com/_/microsoft-dotnet
FROM mcr.microsoft.com/dotnet/sdk:7.0 AS build
WORKDIR /source

# copy csproj and restore as distinct layers
COPY LLama.Web/LLama.Web.csproj ./LLama.Web/
COPY LLama/LLamaSharp.csproj ./LLama/
RUN dotnet restore LLama.Web/LLama.Web.csproj

# copy everything else and build app
COPY . .
WORKDIR /source/LLama.Web
RUN dotnet build LLama.Web.csproj -c Release -o /app/build

FROM build AS publish
RUN dotnet publish LLama.Web.csproj -c Release -o /app --no-restore

# final stage/image
FROM mcr.microsoft.com/dotnet/aspnet:7.0
WORKDIR /app
COPY --from=publish /app ./
COPY --from=build-llama /llama.cpp/build ./runtimes/build/llama.cpp
ENTRYPOINT ["dotnet", "LLama.Web.dll"]
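Two things in this Dockerfile may be worth checking (both hedged guesses, not confirmed causes). First, the final stage copies libllama.so to ./runtimes/build/llama.cpp, which is not a location the .NET native-library loader probes by default; placing it next to LLama.Web.dll (or under runtimes/linux-x64/native/) avoids needing an explicit library path. Second, building llama.cpp from master can produce a library whose ABI no longer matches the LLamaSharp release, which shows up as failures in calls such as llama_max_devices; the commit placeholder below would need to be filled in from your LLamaSharp version's release notes. A sketch of an adjusted build and final stage:

```dockerfile
FROM ubuntu:22.04 AS build-llama
RUN apt-get update && apt-get install -y build-essential git cmake
RUN git clone https://github.com/ggerganov/llama.cpp.git
WORKDIR /llama.cpp
# Placeholder: check out the llama.cpp commit your LLamaSharp release
# was built against (see the LLamaSharp release notes).
# RUN git checkout <commit-pinned-by-your-LLamaSharp-release>
WORKDIR /llama.cpp/build
RUN cmake .. -DLLAMA_STATIC=Off -DBUILD_SHARED_LIBS=On && make

FROM mcr.microsoft.com/dotnet/aspnet:7.0
WORKDIR /app
COPY --from=publish /app ./
# Copy the shared library next to the app binary, a default probing
# location for DllImport. The build output path may vary by llama.cpp
# version (sometimes build/bin/libllama.so).
COPY --from=build-llama /llama.cpp/build/libllama.so ./libllama.so
ENTRYPOINT ["dotnet", "LLama.Web.dll"]
```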

compose.yml in root directory. I modified the code/config to look for models in the mounted local volume.

services:
  web:
    container_name: llamasharp-web
    image: llamasharp/web
    # platform: linux/amd64
    volumes:
      - ./LLMModels:/app/LLMModels
    build:
      context: .
      dockerfile: LLama.Web/Dockerfile
    restart: always
    environment: 
      ASPNETCORE_ENVIRONMENT: Development
    ports:
      - "8002:80"
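Since the host is Apple silicon, it may also help to make the target platform explicit rather than toggling the commented line, so the build and run stages always agree. A minimal fragment (same service as above):

```yaml
services:
  web:
    # Build and run natively on Apple silicon; switch to linux/amd64
    # only when deliberately testing the emulated path.
    platform: linux/arm64
```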

Any help is appreciated.

SignalRT commented 1 year ago

If you don't mind, I would like to understand your use case.

You have an ARM Mac and want to run a Docker container with the linux/amd64 platform, so I suppose you installed Docker Desktop for Mac, Apple silicon version (https://docs.docker.com/desktop/install/mac-install/), and I suppose (I have never tried it) that Docker will require Rosetta 2 because you are not using linux/arm64.

Are you expecting to run the application on linux/amd64 with CPU only?

Are you doing this just as a test, or will you deploy the final container on other platforms and expect it to execute on a GPU?

noli44 commented 1 year ago

Hi,

Sorry, that compose file is a little misleading. I was attempting to run on amd64 to see whether I could isolate the issue to arm64, but I hit the problem regardless.

It's commented out, so it would be using the default arm64 image.

At the moment I want to be able to run locally in a container (arm64), but future state will likely be linux/amd64 on cloud infrastructure.
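For what it's worth, a quick way to confirm which architecture a container actually ends up running under (run inside the container, e.g. via `docker exec llamasharp-web uname -m`, using the container name from the compose file above):

```shell
# A native arm64 container prints aarch64, while linux/amd64 under
# Rosetta/QEMU emulation prints x86_64.
uname -m
```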

Thanks

SignalRT commented 12 months ago

I will try to run it in a container. I will give you feedback once I get some conclusions.