meshachaderele / ddsc-llm


Well done and next steps #1

Open KasperGroesLudvigsen opened 3 weeks ago

KasperGroesLudvigsen commented 3 weeks ago

Very nice work, Meshach! I guess the next step would be to set up an LLM API in a Docker container on the GPU server (e.g. with vLLM) so that we can replace your call to "gpt-3.5-turbo-instruct" with a call to a local model. What do you think?
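
Rough sketch of what the swap could look like, assuming we run vLLM's OpenAI-compatible server image on the GPU box. The model name below is just a placeholder until we pick one:

```python
# Sketch: point the existing OpenAI client at a local vLLM server instead of
# api.openai.com. Assumes the server was started on the GPU server with e.g.:
#   docker run --gpus all -p 8000:8000 vllm/vllm-openai:latest \
#       --model mistralai/Mistral-7B-Instruct-v0.2
# (the model name is a placeholder -- any HF model vLLM supports would work)
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

completion = client.completions.create(
    model="mistralai/Mistral-7B-Instruct-v0.2",  # must match the served model
    prompt="Write a one-line summary of this issue.",
    max_tokens=64,
)
print(completion.choices[0].text)
```

Since vLLM exposes an OpenAI-compatible endpoint, the rest of your code should need no changes beyond the `base_url` and the model name.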

KasperGroesLudvigsen commented 3 weeks ago

Also, it would be really cool to measure the energy consumption per token while we run inference.
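
One way to get a first number: wrap a batch of requests in codecarbon and divide the measured energy by the number of completion tokens. Rough sketch, assuming the codecarbon package and the local vLLM endpoint from above (model name again a placeholder):

```python
# Sketch: estimate energy per generated token with codecarbon, which samples
# CPU/GPU power in the background while the tracked code runs.
from codecarbon import EmissionsTracker
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")
prompts = ["Explain what vLLM does.", "What does DDSC stand for?"]  # placeholders

tracker = EmissionsTracker(measure_power_secs=1)
tracker.start()
total_tokens = 0
for prompt in prompts:
    resp = client.completions.create(
        model="mistralai/Mistral-7B-Instruct-v0.2",  # placeholder model
        prompt=prompt,
        max_tokens=128,
    )
    total_tokens += resp.usage.completion_tokens
tracker.stop()

kwh = tracker.final_emissions_data.energy_consumed  # total energy in kWh
print(f"{kwh * 1000 / total_tokens:.6f} Wh per generated token")
```

Note this measures the whole machine's tracked draw during the run, so we'd want a decently sized batch to amortize idle overhead.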

KasperGroesLudvigsen commented 3 weeks ago

Or maybe it's better to use offline batching with vLLM: https://docs.vllm.ai/en/v0.6.0/getting_started/quickstart.html
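
For reference, offline batching would look roughly like the quickstart's example, sketched here with a placeholder model name:

```python
# Sketch: offline batched inference with vLLM's Python API, following the
# quickstart linked above.
from vllm import LLM, SamplingParams

prompts = [
    "Hello, my name is",
    "The future of AI is",
]
sampling_params = SamplingParams(temperature=0.8, top_p=0.95, max_tokens=128)

llm = LLM(model="mistralai/Mistral-7B-Instruct-v0.2")  # placeholder model
outputs = llm.generate(prompts, sampling_params)

for output in outputs:
    print(output.prompt, "->", output.outputs[0].text)
```

This skips the server entirely: vLLM loads the model in-process and batches the prompts itself, which tends to give higher throughput than request-by-request API calls.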