abgulati / LARS

An application for running LLMs locally on your device, with your documents, facilitating detailed citations in generated responses.
https://www.youtube.com/watch?v=Mam1i86n8sU&ab_channel=AbheekGulati
GNU Affero General Public License v3.0

Docker build fails #16

Closed ruze00 closed 1 month ago

ruze00 commented 1 month ago

When trying to create a Docker image in the dockerized (no GPU) folder, I get the following error:

 > [7/9] RUN cmake -B build     && cmake --build build --config Release -j 16:
0.142 -- The C compiler identification is GNU 12.2.0
0.182 -- The CXX compiler identification is GNU 12.2.0
0.187 -- Detecting C compiler ABI info
0.224 -- Detecting C compiler ABI info - done
0.229 -- Check for working C compiler: /usr/bin/cc - skipped
0.229 -- Detecting C compile features
0.229 -- Detecting C compile features - done
0.231 -- Detecting CXX compiler ABI info
0.271 -- Detecting CXX compiler ABI info - done
0.276 -- Check for working CXX compiler: /usr/bin/c++ - skipped
0.277 -- Detecting CXX compile features
0.277 -- Detecting CXX compile features - done
0.280 -- Found Git: /usr/bin/git (found version "2.39.2")
0.283 fatal: not a git repository (or any of the parent directories): .git
0.283 fatal: not a git repository (or any of the parent directories): .git
0.286 -- Performing Test CMAKE_HAVE_LIBC_PTHREAD
0.322 -- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Success
0.323 -- Found Threads: TRUE
0.442 -- Found OpenMP_C: -fopenmp (found version "4.5")
0.488 -- Found OpenMP_CXX: -fopenmp (found version "4.5")
0.489 -- Found OpenMP: TRUE (found version "4.5")
0.489 -- OpenMP found
0.489 -- Using ggml SGEMM
0.489 -- Warning: ccache not found - consider installing it for faster compilation or disable this warning with GGML_CCACHE=OFF
0.497 -- CMAKE_SYSTEM_PROCESSOR: aarch64
0.497 -- ARM detected
0.498 -- Performing Test COMPILER_SUPPORTS_FP16_FORMAT_I3E
0.512 -- Performing Test COMPILER_SUPPORTS_FP16_FORMAT_I3E - Failed
0.517 CMake Warning at common/CMakeLists.txt:30 (message):
0.517   Git repository not found; to enable automatic generation of build info,
0.517   make sure Git is installed and the project is a Git repository.
0.517
0.517
0.530 -- Configuring done
0.590 -- Generating done
0.596 -- Build files have been written to: /app/llama-cpp-b3334/build
0.619 [  0%] Generating build details from Git
0.624 CMake Error: Error processing file: /app/llama-cpp-b3334/common/cmake/build-info-gen-cpp.cmake
0.624 [  1%] Building C object ggml/src/CMakeFiles/ggml.dir/ggml.c.o
0.624 [  2%] Building C object ggml/src/CMakeFiles/ggml.dir/ggml-alloc.c.o
0.624 [  2%] Building C object ggml/src/CMakeFiles/ggml.dir/ggml-backend.c.o
0.625 gmake[2]: *** [common/CMakeFiles/build_info.dir/build.make:74: /app/llama-cpp-b3334/common/build-info.cpp] Error 1
0.625 gmake[1]: *** [CMakeFiles/Makefile2:1688: common/CMakeFiles/build_info.dir/all] Error 2
0.625 gmake[1]: *** Waiting for unfinished jobs....
0.625 [  3%] Building C object ggml/src/CMakeFiles/ggml.dir/ggml-quants.c.o
0.625 [  4%] Building C object examples/gguf-hash/CMakeFiles/xxhash.dir/deps/xxhash/xxhash.c.o
0.625 [  5%] Building C object examples/gguf-hash/CMakeFiles/sha256.dir/deps/sha256/sha256.c.o
0.626 [  5%] Building C object examples/gguf-hash/CMakeFiles/sha1.dir/deps/sha1/sha1.c.o
0.627 [  5%] Building CXX object ggml/src/CMakeFiles/ggml.dir/sgemm.cpp.o
0.715 In function ‘SHA1Update’,
0.715     inlined from ‘SHA1Final’ at /app/llama-cpp-b3334/examples/gguf-hash/deps/sha1/sha1.c:265:5:
0.715 /app/llama-cpp-b3334/examples/gguf-hash/deps/sha1/sha1.c:219:13: warning: ‘SHA1Transform’ reading 64 bytes from a region of size 0 [-Wstringop-overread]
0.715   219 |             SHA1Transform(context->state, &data[i]);
0.715       |             ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
0.715 /app/llama-cpp-b3334/examples/gguf-hash/deps/sha1/sha1.c:219:13: note: referencing argument 2 of type ‘const unsigned char[64]’
0.715 /app/llama-cpp-b3334/examples/gguf-hash/deps/sha1/sha1.c: In function ‘SHA1Final’:
0.715 /app/llama-cpp-b3334/examples/gguf-hash/deps/sha1/sha1.c:54:6: note: in a call to function ‘SHA1Transform’
0.715    54 | void SHA1Transform(
0.715       |      ^~~~~~~~~~~~~
0.715 In function ‘SHA1Update’,
0.715     inlined from ‘SHA1Final’ at /app/llama-cpp-b3334/examples/gguf-hash/deps/sha1/sha1.c:269:9:
0.715 /app/llama-cpp-b3334/examples/gguf-hash/deps/sha1/sha1.c:219:13: warning: ‘SHA1Transform’ reading 64 bytes from a region of size 0 [-Wstringop-overread]
0.715   219 |             SHA1Transform(context->state, &data[i]);
0.715       |             ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
0.715 /app/llama-cpp-b3334/examples/gguf-hash/deps/sha1/sha1.c:219:13: note: referencing argument 2 of type ‘const unsigned char[64]’
0.715 /app/llama-cpp-b3334/examples/gguf-hash/deps/sha1/sha1.c: In function ‘SHA1Final’:
0.715 /app/llama-cpp-b3334/examples/gguf-hash/deps/sha1/sha1.c:54:6: note: in a call to function ‘SHA1Transform’
0.715    54 | void SHA1Transform(
0.715       |      ^~~~~~~~~~~~~
0.738 [  5%] Built target sha1
0.748 [  5%] Built target sha256
1.825 [  5%] Built target xxhash
4.921 [  6%] Linking CXX shared library libggml.so
4.950 [  6%] Built target ggml
4.950 gmake: *** [Makefile:146: all] Error 2
------

 1 warning found (use --debug to expand):
 - LegacyKeyValueFormat: "ENV key=value" should be used instead of legacy "ENV key value" format (line 31)
dockerfile:27
--------------------
  26 |     WORKDIR /app/llama-cpp-b3334
  27 | >>> RUN cmake -B build \
  28 | >>>  && cmake --build build --config Release -j 16
  29 |
--------------------
ERROR: failed to solve: process "/bin/sh -c cmake -B build \t&& cmake --build build --config Release -j 16" did not complete successfully: exit code: 2
abgulati commented 1 month ago

@ruze00 The Docker files have just received a huge update: clone this repo again, then clone llama.cpp (the latest b3408 has been tested today with both container flavors) into the docker folder you're attempting to build from, and then re-run the docker build.
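
For reference, those steps look roughly like the following (the dockerized folder name is taken from the original report, and the image tag at the end is just an illustrative placeholder):

git clone https://github.com/abgulati/LARS.git

cd LARS/dockerized

git clone https://github.com/ggerganov/llama.cpp # b3408 tested per the comment above

docker build -t lars-dockerized . # hypothetical image tag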

Watch over the coming days for detailed updates to the README documentation too.

abgulati commented 1 month ago

Important update @ruze00:

While the Docker containers in this repo have received a huge update, to make things as easy as possible I'm also uploading pre-built images for both the CPU-only & NvCUDA-GPU flavors, so you needn't even build the container yourself!

Given this, from this point forth the recommended way to deploy LARS for regular use is via Docker; local setup will be recommended only for developers looking to contribute. Watch out for a significant update to the README documentation in the coming days.

You needn't wait until then though. Please download the image(s) directly from the links below (Google Drive, as they're far too large to host on GitHub, where we're restricted to a few GB at best):

CPU-only image (32.7GB): https://drive.google.com/file/d/1diTTxevnyq6qMOTh-6f_YN7UBNS55Dy0/view?usp=sharing

Nvidia CUDA-GPU image (50GB): https://drive.google.com/file/d/1s3o8QFAHFhuww9ekBmhZcLd33MIUCYl1/view?usp=sharing

As stated in the Docker section, simply download and install Docker Desktop onto your Mac.

Once Docker is set up and the LARS pre-built image downloaded, simply load it into Docker via the following terminal command:

docker load -i lars-cpu-only-b3408-pfr.tar

Ensure you create a storage volume so you can update the LARS containers to newer versions later without losing any app data, settings & models:

docker volume create lars_storage

Then simply run via the terminal:

# CPU-only container:

docker run -p 5000:5000 -p 8080:8080 -v lars_storage:/app/storage lars-cpu-only-b3408-pfr

# Nvidia-GPU container: 

docker run --gpus all -p 5000:5000 -p 8080:8080 -v lars_storage:/app/storage lars-nvcuda-gpu-b3408-pfr

Then navigate to http://localhost:5000 in a browser and, once the app loads with the expected first-run error, copy your LLM GGUF files into the storage volume, again via the terminal:

docker ps # note the container ID of the LARS container

docker cp <path_to_llm_gguf> <container_id_obtained_above>:/app/storage/models

# an example of how this looks: 

docker cp C:\web_app_storage\models\Google-Gemma2-9B-f16.gguf 85cbbfc016bb:/app/storage/models

While this may seem slightly involved, there's a lot of elegance & beauty to this approach:

  1. It's a one-time setup
  2. Beyond Docker itself, you don't need to install anything else!
  3. You get to use stuff like Azure OCR and other libs that don't work natively on Mac
  4. The pre-built images are already fully set up for use: they're "Post First-Run" (hence the "pfr" suffix), meaning all three embedding models have already been downloaded into the images, so switching embedding models is much faster and the app can be used entirely offline
  5. As we're attaching a storage volume, all your app data, uploaded models and app settings will persist even when updating your container to a newer LARS container version released down the road (see the sketch below)
  6. Best of all (for non-technical users), you never need to use the terminal again! On subsequent runs, just launch the Docker Desktop app and click play on the container:

[screenshot: the LARS container in Docker Desktop]
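
As an illustration of point 5, moving to a hypothetical newer image while keeping all your data could look roughly like this (the image name and tag here are placeholders, not a real release):

docker stop <current_lars_container_id> # stop the running container

docker load -i lars-cpu-only-<newer-tag>-pfr.tar # load the newer image

docker run -p 5000:5000 -p 8080:8080 -v lars_storage:/app/storage lars-cpu-only-<newer-tag>-pfr

Because the same lars_storage volume is re-attached, models, documents & settings carry over.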

🍻

medlor commented 1 month ago

You need to execute the following in the dockerized folder: $ git clone https://github.com/ggerganov/llama.cpp

I assume this command is missing in the dockerfile.

abgulati commented 1 month ago

Thanks @tobias

As detailed in my comment above, just use the pre-built images. In fact, here are the latest pre-built images, featuring the latest LARS update from yesterday: an 'Upload LLM' button & progress indicator have been added, so you needn't even use the 'docker cp' command via the terminal to move LLMs in anymore! Just download, load and run, and never use the terminal again! As long as you created the volume and attached it the first time, your data, chat history, documents, app settings, LLMs etc. will all persist even when moving to newer container versions of LARS!

CPU-only image: https://drive.google.com/file/d/1-GBGEwpWNLle9PqH0EptCIiP_mrmosyp/view?usp=drivesdk

NvCUDA GPU image: https://drive.google.com/file/d/1IlFcZSzD92yL93U_TH1s9P2CndX2rS_Q/view?usp=drivesdk

Also updated to the latest version of llama.cpp as of last night as indicated by the 'b3423' tag in the image name. And again, these are "post first-run" images as indicated by the 'pfr' suffix, meaning all embedding models have been downloaded so these can be run entirely offline right away.

And yes, I haven’t added the ‘git clone’ as the llama.cpp project moves so fast and changes between versions can cause the container to malfunction. For instance, the recent change from ‘server’ to ‘llama-server’. Thus, I prefer to link directly to the release that I’ve personally tested as working. As I said though, the documentation update is WIP!



abgulati commented 1 month ago

Maybe I can update the dockerfile to include a 'git clone' of the specific tested release though! I'll do this and push an update.
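
A minimal sketch of what that could look like in the dockerfile, assuming the same /app layout seen in the build log above (the tag and folder name are placeholders for whichever release gets tested):

RUN git clone --branch b3408 --depth 1 https://github.com/ggerganov/llama.cpp /app/llama-cpp-b3408

WORKDIR /app/llama-cpp-b3408

RUN cmake -B build \
 && cmake --build build --config Release -j 16

Cloning (rather than copying the sources in) keeps the .git metadata, so the "fatal: not a git repository" / build-info generation error from the original report wouldn't occur.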



abgulati commented 1 month ago

Detailed resolution steps and pre-built Docker images provided; closing since no further updates received.