Open jasl opened 10 months ago
Off topic: GitHub and my local editor warn that many files don't end with a newline. Do you accept a style PR? Or run:

```shell
git ls-files -z | while IFS= read -rd '' f; do
  if file --mime-encoding "$f" | grep -qv binary; then
    tail -c1 < "$f" | read -r _ || echo >> "$f"
  fi
done
```

to add a trailing newline to all files.
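As a standalone sketch of how the newline check in that one-liner behaves (the temp file is just for illustration):

```shell
# `tail -c1 | read` exits non-zero when the last byte is not a newline,
# so `|| echo >> file` appends a newline only where it is missing.
tmp=$(mktemp)                               # throwaway demo file
printf 'no trailing newline' > "$tmp"
tail -c1 < "$tmp" | read -r _ || echo >> "$tmp"
wc -l < "$tmp"                              # counts 1 line now that the file ends in '\n'
rm -f "$tmp"
```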
@dusty-nv is there any way to auto-test all containers? For now I've only tested what I modified, and it looks good.
Thanks @jasl. In an effort to understand ENTRYPOINT vs CMD, I mapped out these common usages: (a) starting the end-user app (the server) with different args, and (b) running a different process altogether (the model downloader).
**CMD** style example:

```shell
./run.sh --workdir /opt/text-generation-webui $(./autotag text-generation-webui) \
  python3 server.py --listen --verbose --api \
    --model-dir=/data/models/text-generation-webui \
    --model=llama-2-13b-chat.ggmlv3.q4_0.bin \
    --loader=llamacpp \
    --n-gpu-layers=128 \
    --n_ctx=4096 \
    --n_batch=4096
```
**ENTRYPOINT** style example:

```shell
./run.sh $(./autotag text-generation-webui) \
  --model=llama-2-13b-chat.ggmlv3.q4_0.bin \
  --loader=llamacpp \
  --n-gpu-layers=128 \
  --n_ctx=4096 \
  --n_batch=4096
```
**CMD** style, download model:

```shell
./run.sh --workdir=/opt/text-generation-webui $(./autotag text-generation-webui) /bin/bash -c \
  'python3 download-model.py --output=/data/models/text-generation-webui TheBloke/Llama-2-7b-Chat-GPTQ'
```
**ENTRYPOINT** style, download model:

```shell
./run.sh --workdir=/opt/text-generation-webui --entrypoint 'python3 download-model.py' \
  $(./autotag text-generation-webui) \
  --output=/data/models/text-generation-webui TheBloke/Llama-2-7b-Chat-GPTQ
```
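For reference, the two styles roughly correspond to these Dockerfile directives. This is a hypothetical fragment, assuming `server.py` lives in `/opt/text-generation-webui`; it is not the repo's actual Dockerfile:

```dockerfile
# Hypothetical sketch -- not the actual Dockerfile in this repo.
WORKDIR /opt/text-generation-webui

# ENTRYPOINT style: args passed to `docker run <image> ...` are appended,
# so `--model=...` goes straight to server.py.
ENTRYPOINT ["python3", "server.py", "--listen"]

# CMD style (alternative): `docker run <image>` starts the server, but any
# command given to `docker run` replaces this line entirely, e.g.
#   docker run <image> python3 download-model.py ...
# CMD ["python3", "server.py", "--listen"]
```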
I agree that if the user is only adding args, ENTRYPOINT is nicer. What I don't prefer is that more complicated invocations are hard to do with it without another bash script that you mount in and then run. But in general I am leaning towards ENTRYPOINT for these. Can you confirm that my last example actually works?
@dusty-nv
It seems the Dockerfile is broken:

```
From https://github.com/haotian-liu/text-generation-webui
 * [new branch]      dev        -> llava2/dev
 * [new branch]      main       -> llava2/main
Auto-merging extensions/multimodal/pipelines/llava/pipelines.py
CONFLICT (content): Merge conflict in extensions/multimodal/pipelines/llava/pipelines.py
Auto-merging extensions/multimodal/pipelines/llava/llava.py
CONFLICT (content): Merge conflict in extensions/multimodal/pipelines/llava/llava.py
error: could not apply 9c085ee... Initial support for LLaVA-LLaMA-2.
hint: after resolving the conflicts, mark the corrected paths
hint: with 'git add <paths>' or 'git rm <paths>'
hint: and commit the result with 'git commit'
The command '/bin/sh -c cd text-generation-webui && git remote add llava2 https://github.com/haotian-liu/text-generation-webui && git fetch llava2 && git config user.email "dustinf@nvidia.com" && git config user.name "Dustin Franklin" && git cherry-pick 9c085ee' returned a non-zero code: 1
```
I did a workaround (just so it doesn't block me).
You can do this if you want to call `python3 download-model.py`:

```shell
./run.sh --workdir=/opt/text-generation-webui --entrypoint /usr/bin/env $(./autotag text-generation-webui) \
  python3 download-model.py --output=/data/models/text-generation-webui TheBloke/Llama-2-7b-Chat-GPTQ
```

The entrypoint should be `/usr/bin/env`; then you get the full set of environment variables (`PATH` for calling `python` directly, plus the other env vars that pip and such stuff need).
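A quick way to see why `/usr/bin/env` makes a good generic entrypoint: it resolves its first argument on `$PATH` and execs it with the current environment passed through unchanged. A plain-shell sketch, runnable outside any container:

```shell
# env finds `sh` via $PATH and runs it with the full environment,
# so the child process can see PATH and any other exported variables.
/usr/bin/env sh -c 'echo "PATH is ${PATH:+set}"'
# prints: PATH is set
```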
Here's the proof
Does this look good to you?
friendly ping @dusty-nv
I want to add an extra arg, `--api`, to the sd-webui container, but I found the current Dockerfile doesn't make that easy. With the entrypoint style, I can add extra args with:

```shell
./run.sh $(./autotag stable-diffusion-webui) --api
```

which is convenient for people. I also checked the other images and made the same change. In addition, I set `WORKDIR` to the app folder. What do you think?