DFKI-NLP / perseus-textgen

A repository for scripts to run awesomely large language models with text generation inference APIs and (chat) UIs
MIT License

initial example setup #1

Closed ArneBinder closed 10 months ago

ArneBinder commented 1 year ago

Taken from here.

Start the Container:

srun -K \
--container-image=/netscratch/enroot/text-generation-inference_1.0.3.sqsh \
--container-mounts=/netscratch:/netscratch,/ds:/ds,/ds/models/llms/cache:/data,$HOME:$HOME \
--container-workdir=$HOME \
-p A100-40GB \
--mem 64GB \
--gpus 1 \
--export MODEL_ID=lmsys/vicuna-13b-v1.5 \
text-generation-launcher \
--port 5000

Startup is complete once the console prints: INFO text_generation_router: router/src/main.rs:247: Connected

NOTE: The API documentation also works as a simple web UI!
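Instead of watching the console for the Connected line, readiness can also be checked from a script. The following is only a minimal sketch: it assumes the server answers on text-generation-inference's /health route at the port given to --port, and the function name wait_for_server, the attempt count, and the delay are hypothetical choices, not part of the original setup.

```shell
# Hypothetical helper: poll the server until it answers, or give up.
# Usage: wait_for_server BASE_URL [ATTEMPTS] [DELAY_SECONDS]
wait_for_server() {
  base_url=$1
  attempts=${2:-60}   # assumed default: 60 tries
  delay=${3:-5}       # assumed default: 5 seconds between tries
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # -f makes curl return non-zero on HTTP errors, -s silences progress output.
    if curl -sf -o /dev/null "$base_url/health"; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Example call (host and port taken from the commands in this issue):
# wait_for_server http://serv-3329.kl.dfki.de:5000 && echo "server is ready"
```

The function returns 0 as soon as one request succeeds and 1 if every attempt fails, so it can gate a follow-up command with &&.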

Example generation command:

curl http://serv-3329.kl.dfki.de:5000/generate \
    -X POST \
    -d '{"inputs":"What is Deep Learning?","parameters":{"max_new_tokens":20}}' \
    -H 'Content-Type: application/json'
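For repeated requests, the same call can be wrapped in small shell functions. This is a sketch under stated assumptions, not part of the repository's scripts: build_payload and generate are hypothetical names, and the escaping only covers backslashes and double quotes, so prompts with newlines or control characters would need a real JSON encoder.

```shell
# Hypothetical helper: build the JSON body for the /generate endpoint.
# Usage: build_payload PROMPT [MAX_NEW_TOKENS]
build_payload() {
  # Escape backslashes first, then double quotes, so the prompt stays a valid JSON string.
  escaped=$(printf '%s' "$1" | sed 's/\\/\\\\/g; s/"/\\"/g')
  printf '{"inputs":"%s","parameters":{"max_new_tokens":%d}}' "$escaped" "${2:-20}"
}

# Hypothetical helper: send a prompt to a running server.
# Usage: generate BASE_URL PROMPT [MAX_NEW_TOKENS]
generate() {
  curl -sS "$1/generate" \
    -X POST \
    -H 'Content-Type: application/json' \
    -d "$(build_payload "$2" "${3:-20}")"
}

# Example call (same host, prompt, and token limit as the curl command above):
# generate http://serv-3329.kl.dfki.de:5000 "What is Deep Learning?" 20
```

Keeping payload construction separate from the HTTP call makes it possible to inspect the request body without a running server.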

TODO:

References:

ArneBinder commented 10 months ago

Since the scripts are well tested and also contain some documentation, and since we have the frontend, I think we can close this now.