initial example setup

Taken from here.

Start the container:

srun -K \
--container-image=/netscratch/enroot/text-generation-inference_1.0.3.sqsh \
--container-mounts=/netscratch:/netscratch,/ds:/ds,/ds/models/llms/cache:/data,$HOME:$HOME \
--container-workdir=$HOME \
-p A100-40GB \
--mem 64GB \
--gpus 1 \
--export MODEL_ID=lmsys/vicuna-13b-v1.5 \
text-generation-launcher \
--port 5000

Startup is complete once the console shows:

INFO text_generation_router: router/src/main.rs:247: Connected

The server then listens on port 5000 of the compute node the job was scheduled on (serv-3329 in this example, so adjust the host name to your node):

the API is available at: http://serv-3329.kl.dfki.de:5000
the API documentation can be accessed at: http://serv-3329.kl.dfki.de:5000/docs/
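
Instead of watching the console, readiness can also be polled from a script. The following is a minimal sketch, assuming the server exposes the /health route that text-generation-inference lists in its API documentation. The function name wait_for_tgi and the default timeout and interval values are made up for this illustration.

```shell
# Poll the health endpoint until the server answers or the timeout expires.
# Usage: wait_for_tgi BASE_URL [TIMEOUT_SECONDS] [INTERVAL_SECONDS]
# (hypothetical helper, not part of text-generation-inference)
wait_for_tgi() {
    base_url=$1
    timeout=${2:-600}
    interval=${3:-5}
    waited=0
    # -f makes curl fail on HTTP errors; -s and -o /dev/null hide the output
    until curl -sf --max-time 5 -o /dev/null "$base_url/health"; do
        if [ "$waited" -ge "$timeout" ]; then
            echo "server at $base_url not ready after ${timeout}s" >&2
            return 1
        fi
        sleep "$interval"
        waited=$((waited + interval))
    done
    echo "server at $base_url is ready"
}
```

For example, `wait_for_tgi http://serv-3329.kl.dfki.de:5000 900 10` would check every 10 seconds for up to 15 minutes. Loading a 13B model can take a while, so a generous timeout is probably sensible.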

NOTE: The API documentation also works as a simple web UI!

Example generation command:

curl http://serv-3329.kl.dfki.de:5000/generate \
    -X POST \
    -d '{"inputs":"What is Deep Learning?","parameters":{"max_new_tokens":20}}' \
    -H 'Content-Type: application/json'
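
For repeated requests, the curl call above can be wrapped in small shell functions. This is only a sketch: the names TGI_URL, json_escape, build_payload and generate are hypothetical, and the escaping covers just backslashes and double quotes, so prompts containing newlines or other control characters would need a proper JSON tool.

```shell
# Base URL of the server; defaults to the host name used in the example above.
TGI_URL=${TGI_URL:-http://serv-3329.kl.dfki.de:5000}

# Escape backslashes and double quotes so the prompt fits inside a JSON string.
# Newlines and other control characters are NOT handled in this sketch.
json_escape() {
    printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}

# Build the request body for /generate.
# Usage: build_payload PROMPT [MAX_NEW_TOKENS]
build_payload() {
    printf '{"inputs":"%s","parameters":{"max_new_tokens":%d}}' \
        "$(json_escape "$1")" "${2:-20}"
}

# POST the prompt and print the raw JSON response.
# Usage: generate PROMPT [MAX_NEW_TOKENS]
generate() {
    curl -s "$TGI_URL/generate" \
        -X POST \
        -H 'Content-Type: application/json' \
        -d "$(build_payload "$1" "$2")"
}
```

Calling `generate 'What is Deep Learning?' 20` sends the same request as the curl command above and prints the server's JSON response unchanged.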

TODO:

[x] better UI, e.g.
- https://github.com/huggingface/chat-ui, or
- https://github.com/oobabooga/text-generation-webui

References:

https://github.com/huggingface/text-generation-inference
