Closed: ruze00 closed this issue 1 month ago.
@ruze00 The Docker files have just received a huge update: clone this repo again, then clone llama.cpp (the latest release, b3408, was tested today with both container flavors) into the docker folder you're attempting to build, and re-run the Docker build, roughly as sketched below.
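A rough sketch of that sequence (the folder name comes from comments later in this thread; the build command and image tag shown are illustrative, not exact script names):
git clone https://github.com/abgulati/LARS
cd LARS/dockerized   # CPU-only flavor; folder name per the comments below
git clone https://github.com/ggerganov/llama.cpp   # clone llama.cpp into the build folder
docker build -t lars-cpu-only .   # illustrative build command and tag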
Also watch for detailed updates to the README documentation over the coming days.
Important update @ruze00:
While the Docker containers in this repo have received a huge update, to make things as easy as possible I'm also uploading pre-built images for both the CPU-only and NvCUDA-GPU flavors, so you needn't even build the container yourself!
Given this, from this point forth, the recommended way to deploy LARS for regular use is via Docker and local setup will be recommended only for developers looking to contribute. Watch out for a significant update to the README documentation in the coming days.
You needn't wait until then, though. Please download the image(s) directly from the links below (hosted on Google Drive, as they're far too large for GitHub, where we're restricted to a few GB at best):
CPU-only image (32.7GB): https://drive.google.com/file/d/1diTTxevnyq6qMOTh-6f_YN7UBNS55Dy0/view?usp=sharing
Nvidia CUDA-GPU image (50GB): https://drive.google.com/file/d/1s3o8QFAHFhuww9ekBmhZcLd33MIUCYl1/view?usp=sharing
As stated in the Docker section, simply download and install Docker Desktop onto your Mac.
Once Docker is set up and the LARS pre-built image is downloaded, simply load it into Docker via the following terminal command:
docker load -i lars-cpu-only-b3408-pfr.tar
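To confirm the image loaded (a standard Docker check; use the GPU image's filename above if you downloaded that flavor):
docker images   # the lars-cpu-only-b3408-pfr image should appear in this list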
Ensure you create a storage volume so you can update the LARS containers to newer versions later without losing any app data, settings & models:
docker volume create lars_storage
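You can verify the volume exists and see where Docker keeps it:
docker volume inspect lars_storage   # prints the volume's mountpoint and metadata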
Then simply run via the terminal:
# CPU-only container:
docker run -p 5000:5000 -p 8080:8080 -v lars_storage:/app/storage lars-cpu-only-b3408-pfr
# Nvidia-GPU container:
docker run --gpus all -p 5000:5000 -p 8080:8080 -v lars_storage:/app/storage lars-nvcuda-gpu-b3408-pfr
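If you'd rather not keep a terminal attached, the standard Docker flags below work too (the container name 'lars' here is just an illustrative choice):
# CPU-only container, detached, named and auto-restarting:
docker run -d --name lars --restart unless-stopped -p 5000:5000 -p 8080:8080 -v lars_storage:/app/storage lars-cpu-only-b3408-pfr
docker logs -f lars   # follow the app logs when needed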
Then navigate to http://localhost:5000 in a browser. Once the app loads with the expected first-run error, you can copy your LLM GGUF files into the storage volume, again via the terminal:
docker ps # note the container ID of the LARS container
docker cp <path_to_llm_gguf> <container_id_obtained_above>:/app/storage/models
# an example of how this looks:
docker cp C:\web_app_storage\models\Google-Gemma2-9B-f16.gguf 85cbbfc016bb:/app/storage/models
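The example above uses a Windows path; on a Mac (per the Docker Desktop instructions earlier), the equivalent with an illustrative macOS path would be:
docker cp ~/models/Google-Gemma2-9B-f16.gguf 85cbbfc016bb:/app/storage/models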
While this may seem slightly involved, there's a lot of elegance and beauty to this approach!
🍻
You need to execute the following in the dockerized folder:
$ git clone https://github.com/ggerganov/llama.cpp
I assume this command is missing from the Dockerfile.
Thanks @tobias
As detailed in my comment above, just use the pre-built images. In fact, here are the latest pre-built images, featuring the latest LARS update from yesterday: an 'Upload LLM' button and an upload progress indicator have been added, so you needn't even use the 'docker cp' command via the terminal to move LLMs in anymore! Just download, load and run, and never use the terminal again! As long as you created the volume and attached it the first time, your data, chat history, documents, app settings, LLMs etc. will all persist even when moving to newer container versions of LARS (see the upgrade sketch below)!
CPU-only image: https://drive.google.com/file/d/1-GBGEwpWNLle9PqH0EptCIiP_mrmosyp/view?usp=drivesdk
NvCUDA GPU image: https://drive.google.com/file/d/1IlFcZSzD92yL93U_TH1s9P2CndX2rS_Q/view?usp=drivesdk
Also updated to the latest version of llama.cpp as of last night as indicated by the 'b3423' tag in the image name. And again, these are "post first-run" images as indicated by the 'pfr' suffix, meaning all embedding models have been downloaded so these can be run entirely offline right away.
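A rough sketch of that upgrade flow (the tar filename and image tag below are inferred from the naming convention above, so adjust to the actual download):
docker ps   # note the container ID of the running LARS container
docker stop <container_id> && docker rm <container_id>   # app data lives in the volume, not the container
docker load -i lars-cpu-only-b3423-pfr.tar   # load the newer image
docker run -p 5000:5000 -p 8080:8080 -v lars_storage:/app/storage lars-cpu-only-b3423-pfr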
And yes, I haven't added the 'git clone' because the llama.cpp project moves so fast that changes between versions can cause the container to malfunction, for instance the recent rename from 'server' to 'llama-server'. Thus, I prefer to link directly to the release that I've personally tested as working. As I said though, the documentation update is WIP!
Maybe I can update the Dockerfile to include a 'git clone' of the specific tested release, though! I'll do this and push an update.
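One way to pin that in the Dockerfile (a sketch; the b3423 tag is taken from the tested images above):
# clone the specific tested llama.cpp release instead of the moving master branch:
RUN git clone --branch b3423 --depth 1 https://github.com/ggerganov/llama.cpp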
Detailed resolution steps and pre-built Docker images provided; closing since no updates were received.
When trying to create a Docker image in the dockerized (no GPU) folder, I get the following error: