We want to serve LLMs from LitGPT using LitServe; however, the current model initialization step leaks a lot of complexity into user code. We also couldn't find a generator function for streaming responses, so we had to implement the generate function on the user-code side.
A simple API that handles this kind of thing in a few lines of code would be really appreciated!
cc: @lantiga