mudler / LocalAI

:robot: The free, Open Source alternative to OpenAI, Claude and others. Self-hosted and local-first. Drop-in replacement for OpenAI, running on consumer-grade hardware. No GPU required. Runs gguf, transformers, diffusers and many more model architectures. Features: Generate Text, Audio, Video, Images, Voice Cloning, Distributed inference
https://localai.io
MIT License
23.21k stars · 1.76k forks

Mac os native build not working #1560

Open glebnaz opened 8 months ago

glebnaz commented 8 months ago

LocalAI version: v2.4.1

Environment, CPU architecture, OS, and Version: MBP 14 M1 PRO

Describe the bug Both `make build` and `make BUILD_TYPE=metal build` fail.

To Reproduce Run `make BUILD_TYPE=metal build` locally (as in the instructions).

Expected behavior A successful build; instead, the build fails with an error.

Logs

CGO_LDFLAGS=" -lcblas -framework Accelerate -framework Foundation -framework Metal -framework MetalKit -framework MetalPerformanceShaders" C_INCLUDE_PATH=/Users/glebnaz/Documents/#main/workspace/LocalAI/sources/go-ggml-transformers LIBRARY_PATH=/Users/glebnaz/Documents/#main/workspace/LocalAI/sources/go-ggml-transformers \
    go build -ldflags " -X "github.com/go-skynet/LocalAI/internal.Version=v2.4.1-2-g62a02cd" -X "github.com/go-skynet/LocalAI/internal.Commit=62a02cd1feb7cf8a75bcbe253fb95f204f022c1f"" -tags "" -o backend-assets/grpc/falcon-ggml ./backend/go/llm/falcon-ggml/
backend/go/llm/transformers/dolly.go:11:2: falcon.go: malformed #cgo argument: -I/Users/glebnaz/Documents/#main/workspace/LocalAI/sources/go-ggml-transformers/ggml.cpp/include/
make: *** [backend-assets/grpc/falcon-ggml] Error 1

Additional context

themeaningofmeaning commented 8 months ago

Interesting, I'm getting the same error across multiple build attempts.

LocalAI version: v2.5.1

Environment, CPU architecture, OS, and Version: MBP 16 M1 PRO

I've tried using Docker to build this after cloning the repo, and it fails with "unable to locate package conda", even though the build instructions do specify conda as a required dependency. I also ran a variety of `make prepare` build attempts, and it throws an error about golang not existing, even though the mentioned gpt4all-bindings/golang directory does in fact exist and contains the necessary dependencies. Strange.

  1. Docker log:
    
    1.254 E: Unable to locate package conda
    ------
    Dockerfile:60
    --------------------
    59 |     
    60 | >>> RUN curl https://repo.anaconda.com/pkgs/misc/gpgkeys/anaconda.asc | gpg --dearmor > conda.gpg && \
    61 | >>>     install -o root -g root -m 644 conda.gpg /usr/share/keyrings/conda-archive-keyring.gpg && \
    62 | >>>     gpg --keyring /usr/share/keyrings/conda-archive-keyring.gpg --no-default-keyring --fingerprint 34161F5BF5EB1D4BFBBB8F0A8AEB4F8B29D82806 && \
    63 | >>>     echo "deb [arch=amd64 signed-by=/usr/share/keyrings/conda-archive-keyring.gpg] https://repo.anaconda.com/pkgs/misc/debrepo/conda stable main" > /etc/apt/sources.list.d/conda.list && \
    64 | >>>     echo "deb [arch=amd64 signed-by=/usr/share/keyrings/conda-archive-keyring.gpg] https://repo.anaconda.com/pkgs/misc/debrepo/conda stable main" | tee -a /etc/apt/sources.list.d/conda.list && \
    65 | >>>     apt-get update && \
    66 | >>>     apt-get install -y conda && apt-get clean
    67 |     
    --------------------
    ERROR: failed to solve: process "/bin/sh -c curl https://repo.anaconda.com/pkgs/misc/gpgkeys/anaconda.asc | gpg --dearmor > conda.gpg &&     install -o root -g root -m 644 conda.gpg /usr/share/keyrings/conda-archive-keyring.gpg &&     gpg --keyring /usr/share/keyrings/conda-archive-keyring.gpg --no-default-keyring --fingerprint 34161F5BF5EB1D4BFBBB8F0A8AEB4F8B29D82806 &&     echo \"deb [arch=amd64 signed-by=/usr/share/keyrings/conda-archive-keyring.gpg] https://repo.anaconda.com/pkgs/misc/debrepo/conda stable main\" > /etc/apt/sources.list.d/conda.list &&     echo \"deb [arch=amd64 signed-by=/usr/share/keyrings/conda-archive-keyring.gpg] https://repo.anaconda.com/pkgs/misc/debrepo/conda stable main\" | tee -a /etc/apt/sources.list.d/conda.list &&     apt-get update &&     apt-get install -y conda && apt-get clean" did not complete successfully: exit code: 100

View build details: docker-desktop://dashboard/build/desktop-linux/desktop-linux/zasirx96yr7cszlyhxyn4ddi4


2. `make prepare` log:

Switched to a new branch 'build'
touch get-sources
go mod edit -replace github.com/nomic-ai/gpt4all/gpt4all-bindings/golang=/Users/meaning/Desktop/uploading d1/LocalAI/sources/gpt4all/gpt4all-bindings/golang
go: open d1/LocalAI/sources/gpt4all/gpt4all-bindings/golang: no such file or directory
make: *** [replace] Error 1

muellest commented 8 months ago

@themeaningofmeaning: Unfortunately I get the same error =(. Any news on this?

LocalAI version: v.2.3.1 v.2.4.0 v2.5.1

Environment, CPU architecture, OS, and Version: MBP 14 M1 PRO

Logs

failed to solve: executor failed running [/bin/sh -c curl https://repo.anaconda.com/pkgs/misc/gpgkeys/anaconda.asc | gpg --dearmor > conda.gpg && install -o root -g root -m 644 conda.gpg /usr/share/keyrings/conda-archive-keyring.gpg && gpg --keyring /usr/share/keyrings/conda-archive-keyring.gpg --no-default-keyring --fingerprint 34161F5BF5EB1D4BFBBB8F0A8AEB4F8B29D82806 && echo "deb [arch=amd64 signed-by=/usr/share/keyrings/conda-archive-keyring.gpg] https://repo.anaconda.com/pkgs/misc/debrepo/conda stable main" > /etc/apt/sources.list.d/conda.list && echo "deb [arch=amd64 signed-by=/usr/share/keyrings/conda-archive-keyring.gpg] https://repo.anaconda.com/pkgs/misc/debrepo/conda stable main" | tee -a /etc/apt/sources.list.d/conda.list && apt-get update && apt-get install -y conda]: exit code: 100

glebnaz commented 8 months ago

@themeaningofmeaning I have the same error with Docker. Should we wait for help from the author?

themeaningofmeaning commented 8 months ago

@glebnaz @muellest - I haven't been able to figure out a solution, as it appears to be an issue with the release and Apple Silicon. I haven't tried installing conda or Miniconda before doing a Docker build; that would be one thing to test, but I think we need to wait to hear from the author. I've tried every build in the book and it's not working on the M1.

muellest commented 8 months ago

Perhaps this is an alternative to using Docker: https://localai.io/basics/build/ (not tested yet).

gaoyifan commented 8 months ago

The following patch allows me to compile on the Apple Silicon platform. However, the error messages I encountered during compilation are different from @glebnaz's, so please take this as a reference only.

diff --git a/Makefile b/Makefile
index 6afc644..4414eaa 100644
--- a/Makefile
+++ b/Makefile
@@ -112,7 +112,7 @@ ifeq ($(BUILD_TYPE),hipblas)
 endif

 ifeq ($(BUILD_TYPE),metal)
-   CGO_LDFLAGS+=-framework Foundation -framework Metal -framework MetalKit -framework MetalPerformanceShaders
+   CGO_LDFLAGS+=-framework Foundation -framework Metal -framework MetalKit -framework MetalPerformanceShaders -framework CoreML
    export LLAMA_METAL=1
    export WHISPER_METAL=1
 endif
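
If it helps, the patch above can be applied and the build retried like this (assuming the diff is saved as `coreml.patch` in the repo root; the filename is arbitrary):

```sh
# Apply gaoyifan's Makefile diff and rebuild with Metal:
git apply coreml.patch
make BUILD_TYPE=metal build
```
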
glebnaz commented 8 months ago

@gaoyifan For me your patch doesn't help.

go mod edit -replace github.com/nomic-ai/gpt4all/gpt4all-bindings/golang=/Users/glebnaz/Documents/#main/workspace/localai/sources/gpt4all/gpt4all-bindings/golang
go mod edit -replace github.com/go-skynet/go-ggml-transformers.cpp=/Users/glebnaz/Documents/#main/workspace/localai/sources/go-ggml-transformers
go mod edit -replace github.com/donomii/go-rwkv.cpp=/Users/glebnaz/Documents/#main/workspace/localai/sources/go-rwkv
go mod edit -replace github.com/ggerganov/whisper.cpp=/Users/glebnaz/Documents/#main/workspace/localai/sources/whisper.cpp
go mod edit -replace github.com/ggerganov/whisper.cpp/bindings/go=/Users/glebnaz/Documents/#main/workspace/localai/sources/whisper.cpp/bindings/go
go mod edit -replace github.com/go-skynet/go-bert.cpp=/Users/glebnaz/Documents/#main/workspace/localai/sources/go-bert
go mod edit -replace github.com/mudler/go-stable-diffusion=/Users/glebnaz/Documents/#main/workspace/localai/sources/go-stable-diffusion
go mod edit -replace github.com/M0Rf30/go-tiny-dream=/Users/glebnaz/Documents/#main/workspace/localai/sources/go-tiny-dream
go mod edit -replace github.com/mudler/go-piper=/Users/glebnaz/Documents/#main/workspace/localai/sources/go-piper
go mod download
touch prepare-sources
touch prepare
CGO_LDFLAGS=" -lcblas -framework Accelerate -framework Foundation -framework Metal -framework MetalKit -framework MetalPerformanceShaders -framework CoreML" C_INCLUDE_PATH=/Users/glebnaz/Documents/#main/workspace/localai/sources/go-ggml-transformers LIBRARY_PATH=/Users/glebnaz/Documents/#main/workspace/localai/sources/go-ggml-transformers \
    go build -ldflags " -X "github.com/go-skynet/LocalAI/internal.Version=v2.4.1-2-g62a02cd" -X "github.com/go-skynet/LocalAI/internal.Commit=62a02cd1feb7cf8a75bcbe253fb95f204f022c1f"" -tags "" -o backend-assets/grpc/falcon-ggml ./backend/go/llm/falcon-ggml/
backend/go/llm/transformers/dolly.go:11:2: falcon.go: malformed #cgo argument: -I/Users/glebnaz/Documents/#main/workspace/localai/sources/go-ggml-transformers/ggml.cpp/include/
make: *** [backend-assets/grpc/falcon-ggml] Error 1

but thank you for your comment

muellest commented 7 months ago

Anything new on this topic? - Are there any identified workarounds or solutions?

@mudler , could you please confirm whether this can be recognized as a confirmed bug?

glebnaz commented 7 months ago

@mudler do you have time to participate to solution?

mudler commented 7 months ago

I don't have a Mac (yet) to double check, but the CI doesn't show the same symptoms. Actually the falcon-ggml backend is a candidate for deprecation (see: #1126 for more context).

Maybe @dave-gray101 can chime in here; I think you fixed macOS builds lately, right?

However, in any case I think you could work around that by selectively choosing which backends to build at build time. For instance, to build only llama.cpp, you can run:

make GRPC_BACKENDS=backend-assets/grpc/llama-cpp build

see also: https://localai.io/basics/build/#build-only-a-single-backend
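
For reference, this backend selection can also be combined with the Metal flags used earlier in this thread; a sketch only, since the exact backend list and asset paths may differ by version (see the linked docs):

```sh
# Build only the llama.cpp backend with Metal acceleration.
# GRPC_BACKENDS takes a space-separated list of backend targets,
# so several backends can be selected in one invocation:
make BUILD_TYPE=metal GRPC_BACKENDS="backend-assets/grpc/llama-cpp" build
```
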

muellest commented 7 months ago

@dave-gray101 Can you support/help us?

Cyb3rDudu commented 7 months ago

I am reproducibly able to build 2.7 with BUILD_TYPE=metal BUILD_GRPC_FOR_BACKEND_LLAMA=true on a Mac M2 Max (12 cores) on Sonoma with 32 GB RAM. However, it's a memory-heavy compilation and takes a while. I was not able to build it by linking against installed dependencies.

EDIT:

make BUILD_TYPE=metal BUILD_GRPC_FOR_BACKEND_LLAMA=true build  5209.78s user 20787.18s system 526% cpu 1:22:15.54 total

Cyb3rDudu commented 7 months ago

I was able to compile today with the tts, stablediffusion, and tinydream backends. It builds, but it's a bit of a mess. I made a PR for one issue, but I'm not sure where to implement the remaining fixes.

qdrddr commented 4 months ago

Please consider adding Core ML model package format support to utilize the Apple Silicon Neural Engine + GPU.

List of Core ML package format models

https://github.com/likedan/Awesome-CoreML-Models

qdrddr commented 4 months ago

Here is some additional info about running LLMs locally on Apple Silicon. Core ML is a framework that can redistribute workload across the CPU, GPU & Neural Engine (ANE). The ANE is available on all modern Apple devices: iPhones & Macs (A14 or newer and M1 or newer). Ideally, we want to run LLMs on the ANE only, as it has optimizations for running ML tasks compared to the GPU. Apple claims "deploying your Transformer models on Apple devices with an A14 or newer and M1 or newer chip to achieve up to 10 times faster and 14 times lower peak memory consumption compared to baseline implementations".

  1. To utilize Core ML, you first need to convert a model from TensorFlow or PyTorch to the Core ML model package format using coremltools (or simply use existing models already in Core ML package format).
  2. Second, you must use that converted package with an implementation designed for Apple devices. Here is the Apple Xcode reference PyTorch implementation.

https://machinelearning.apple.com/research/neural-engine-transformers

qdrddr commented 4 months ago

You might also be interested in another implementation, Swift Transformers.

There is also work in progress on a Core ML implementation for whisper.cpp (https://github.com/ggerganov/whisper.cpp/discussions/548), which sees roughly 3x performance improvements for some models. An example of a Core ML application: https://github.com/huggingface/swift-chat

benmarte commented 4 months ago

I'm getting a different error when trying to follow the Build on Mac instructions; it seems like the gpt4all-backend repo changed, and it's giving me this error:

make BUILD_TYPE=metal BUILD_GRPC_FOR_BACKEND_LLAMA=true build
git clone --recurse-submodules https://github.com/go-skynet/go-llama.cpp sources/go-llama.cpp
Cloning into 'sources/go-llama.cpp'...
remote: Enumerating objects: 775, done.
remote: Counting objects: 100% (443/443), done.
remote: Compressing objects: 100% (173/173), done.
remote: Total 775 (delta 314), reused 339 (delta 258), pack-reused 332
Receiving objects: 100% (775/775), 238.97 KiB | 3.02 MiB/s, done.
Resolving deltas: 100% (451/451), done.
Submodule 'llama.cpp' (https://github.com/ggerganov/llama.cpp) registered for path 'llama.cpp'
Cloning into '/Users/benmarte/Documents/github/LocalAI/sources/go-llama.cpp/llama.cpp'...
remote: Enumerating objects: 23832, done.        
remote: Counting objects: 100% (8015/8015), done.        
remote: Compressing objects: 100% (301/301), done.        
remote: Total 23832 (delta 7845), reused 7743 (delta 7714), pack-reused 15817        
Receiving objects: 100% (23832/23832), 43.51 MiB | 18.81 MiB/s, done.
Resolving deltas: 100% (16921/16921), done.
Submodule path 'llama.cpp': checked out 'ac43576124a75c2de6e333ac31a3444ff9eb9458'
cd sources/go-llama.cpp && git checkout -b build 2b57a8ae43e4699d3dc5d1496a1ccd42922993be && git submodule update --init --recursive --depth 1
Switched to a new branch 'build'
git clone --recurse-submodules https://github.com/nomic-ai/gpt4all sources/gpt4all
Cloning into 'sources/gpt4all'...
remote: Enumerating objects: 14638, done.
remote: Counting objects: 100% (4322/4322), done.
remote: Compressing objects: 100% (314/314), done.
remote: Total 14638 (delta 4122), reused 4044 (delta 4008), pack-reused 10316
Receiving objects: 100% (14638/14638), 14.57 MiB | 18.04 MiB/s, done.
Resolving deltas: 100% (9960/9960), done.
Submodule 'llama.cpp-mainline' (https://github.com/nomic-ai/llama.cpp.git) registered for path 'gpt4all-backend/llama.cpp-mainline'
Cloning into '/Users/benmarte/Documents/github/LocalAI/sources/gpt4all/gpt4all-backend/llama.cpp-mainline'...
remote: Enumerating objects: 15299, done.        
remote: Counting objects: 100% (4764/4764), done.        
remote: Compressing objects: 100% (157/157), done.        
remote: Total 15299 (delta 4687), reused 4607 (delta 4607), pack-reused 10535        
Receiving objects: 100% (15299/15299), 20.29 MiB | 18.50 MiB/s, done.
Resolving deltas: 100% (10663/10663), done.
Submodule path 'gpt4all-backend/llama.cpp-mainline': checked out 'a3f03b7e793ee611c4918235d4532ee535a9530d'
Submodule 'kompute' (https://github.com/nomic-ai/kompute.git) registered for path 'gpt4all-backend/llama.cpp-mainline/kompute'
Cloning into '/Users/benmarte/Documents/github/LocalAI/sources/gpt4all/gpt4all-backend/llama.cpp-mainline/kompute'...
remote: Enumerating objects: 9083, done.        
remote: Counting objects: 100% (218/218), done.        
remote: Compressing objects: 100% (130/130), done.        
remote: Total 9083 (delta 96), reused 168 (delta 78), pack-reused 8865        
Receiving objects: 100% (9083/9083), 17.57 MiB | 17.10 MiB/s, done.
Resolving deltas: 100% (5716/5716), done.
Submodule path 'gpt4all-backend/llama.cpp-mainline/kompute': checked out 'd1e3b0953cf66acc94b2e29693e221427b2c1f3f'
cd sources/gpt4all && git checkout -b build 27a8b020c36b0df8f8b82a252d261cda47cf44b8 && git submodule update --init --recursive --depth 1
fatal: not a git repository: ../../.git/modules/llama.cpp-230511
fatal: could not reset submodule index
make: *** [sources/gpt4all] Error 128

This is my system report:

Model Name: MacBook Pro
Chip:   Apple M1 Max
Total Number of Cores:  10 (8 performance and 2 efficiency)
Memory: 64 GB
System Firmware Version:    10151.101.3
OS Loader Version:  10151.101.3