Open jasl opened 10 months ago
Off topic: GitHub and my local editor warn that many files don't end with a newline. Do you accept a style PR? Or run:

```shell
git ls-files -z | while IFS= read -rd '' f; do
  if file --mime-encoding "$f" | grep -qv binary; then
    tail -c1 < "$f" | read -r _ || echo >> "$f"
  fi
done
```

to add a trailing newline to all files.
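As a standalone sketch of how the newline check in that one-liner behaves (the temp file is just for illustration):

```shell
# `tail -c1 | read` exits non-zero when the last byte is not a newline,
# so `|| echo >> file` appends a newline only where it is missing.
tmp=$(mktemp)                               # throwaway demo file
printf 'no trailing newline' > "$tmp"
tail -c1 < "$tmp" | read -r _ || echo >> "$tmp"
wc -l < "$tmp"                              # counts 1 line now that the file ends in '\n'
rm -f "$tmp"
```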
@dusty-nv is there any way to auto-test all containers? For now I've only tested what I modified, and it looks good.
Thanks @jasl. In an effort to understand ENTRYPOINT vs CMD, I mapped out these common usages: (a) starting the end-user app (the server) with different args, and (b) running a different process altogether (the model downloader).
**CMD** style example:

```shell
./run.sh --workdir /opt/text-generation-webui $(./autotag text-generation-webui) \
  python3 server.py --listen --verbose --api \
    --model-dir=/data/models/text-generation-webui \
    --model=llama-2-13b-chat.ggmlv3.q4_0.bin \
    --loader=llamacpp \
    --n-gpu-layers=128 \
    --n_ctx=4096 \
    --n_batch=4096
```
**ENTRYPOINT** style example:

```shell
./run.sh $(./autotag text-generation-webui) \
  --model=llama-2-13b-chat.ggmlv3.q4_0.bin \
  --loader=llamacpp \
  --n-gpu-layers=128 \
  --n_ctx=4096 \
  --n_batch=4096
```
**CMD** style, download model:

```shell
./run.sh --workdir=/opt/text-generation-webui $(./autotag text-generation-webui) /bin/bash -c \
  'python3 download-model.py --output=/data/models/text-generation-webui TheBloke/Llama-2-7b-Chat-GPTQ'
```
**ENTRYPOINT** style, download model:

```shell
./run.sh --workdir=/opt/text-generation-webui --entrypoint 'python3 download-model.py' \
  $(./autotag text-generation-webui) \
  --output=/data/models/text-generation-webui TheBloke/Llama-2-7b-Chat-GPTQ
```
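For reference, the two styles roughly correspond to these Dockerfile directives. This is a hypothetical fragment, assuming `server.py` lives in `/opt/text-generation-webui`; it is not the repo's actual Dockerfile:

```dockerfile
# Hypothetical sketch -- not the actual Dockerfile in this repo.
WORKDIR /opt/text-generation-webui

# ENTRYPOINT style: args passed to `docker run <image> ...` are appended,
# so `--model=...` goes straight to server.py.
ENTRYPOINT ["python3", "server.py", "--listen"]

# CMD style (alternative): `docker run <image>` starts the server, but any
# command given to `docker run` replaces this line entirely, e.g.
#   docker run <image> python3 download-model.py ...
# CMD ["python3", "server.py", "--listen"]
```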
I agree that if the user is only adding args, ENTRYPOINT is nicer. What I don't prefer is that more complicated invocations are hard to do with it without another bash script that you mount in and then run. But in general I am leaning towards ENTRYPOINT for these. Can you confirm that my last example actually works?
@dusty-nv
It seems the Dockerfile is broken:

```
From https://github.com/haotian-liu/text-generation-webui
 * [new branch]      dev        -> llava2/dev
 * [new branch]      main       -> llava2/main
Auto-merging extensions/multimodal/pipelines/llava/pipelines.py
CONFLICT (content): Merge conflict in extensions/multimodal/pipelines/llava/pipelines.py
Auto-merging extensions/multimodal/pipelines/llava/llava.py
CONFLICT (content): Merge conflict in extensions/multimodal/pipelines/llava/llava.py
error: could not apply 9c085ee... Initial support for LLaVA-LLaMA-2.
hint: after resolving the conflicts, mark the corrected paths
hint: with 'git add <paths>' or 'git rm <paths>'
hint: and commit the result with 'git commit'
The command '/bin/sh -c cd text-generation-webui && git remote add llava2 https://github.com/haotian-liu/text-generation-webui && git fetch llava2 && git config user.email "dustinf@nvidia.com" && git config user.name "Dustin Franklin" && git cherry-pick 9c085ee' returned a non-zero code: 1
```
I did a workaround (just so it doesn't block me).
You can do this if you want to call `python3 download-model.py`:

```shell
./run.sh --workdir=/opt/text-generation-webui --entrypoint /usr/bin/env $(./autotag text-generation-webui) \
  python3 download-model.py --output=/data/models/text-generation-webui TheBloke/Llama-2-7b-Chat-GPTQ
```

The entrypoint should be `/usr/bin/env`; then you get the full set of environment variables (`PATH` for calling `python` directly, plus the other env vars that pip and such stuff need).
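A quick way to see why `/usr/bin/env` makes a good generic entrypoint: it resolves its first argument on `$PATH` and execs it with the current environment passed through unchanged. A plain-shell sketch, runnable outside any container:

```shell
# env finds `sh` via $PATH and runs it with the full environment,
# so the child process can see PATH and any other exported variables.
/usr/bin/env sh -c 'echo "PATH is ${PATH:+set}"'
# prints: PATH is set
```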
Here's the proof
Does this look good to you?
friendly ping @dusty-nv
I want to add an extra arg, `--api`, to the sd-webui container, but I found the current Dockerfile doesn't make that easy. With the entrypoint style, I can add extra args with:

```shell
./run.sh $(./autotag stable-diffusion-webui) --api
```

which is convenient for people. I also checked the other images and made the same change. In addition, I set `WORKDIR` to the app folder. What do you think?