Closed asselinpaul closed 1 year ago
I have been playing around with this and believe we don't need a GPU here. Dropping it should save money, since you are then only charged for GPU use when running the model, not when serving the app.
https://github.com/jxnl/fastllm/blob/5174e711f21c2342d9b39a363b106532e9e15f08/applications/vllm-struct/main.py#L120
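To make the cost argument concrete, here is a rough back-of-the-envelope sketch. Everything in it is an assumption for illustration: the function name `total_cost_cents`, the rates, and the hours are made up and are not taken from the repo or from any provider's pricing. It also assumes the serving container and the model container are billed separately, and that the model container needs a GPU either way.

```python
def total_cost_cents(serve_hours, inference_hours,
                     gpu_rate_cents, cpu_rate_cents, gpu_on_server):
    """Estimate the bill (in cents) for a serving container plus a model container.

    All parameters are hypothetical placeholders, not real pricing.
    """
    # The model container always runs on a GPU.
    model_cost = inference_hours * gpu_rate_cents
    # The serving container is billed at the GPU rate only if a GPU is attached to it.
    serve_rate = gpu_rate_cents if gpu_on_server else cpu_rate_cents
    return serve_hours * serve_rate + model_cost


# Made-up numbers: app served for 24 h, model busy for 2 h of that.
with_gpu = total_cost_cents(24, 2, gpu_rate_cents=200, cpu_rate_cents=10, gpu_on_server=True)
without_gpu = total_cost_cents(24, 2, gpu_rate_cents=200, cpu_rate_cents=10, gpu_on_server=False)
print(with_gpu, without_gpu, with_gpu - without_gpu)
```

The gap grows with how long the app sits idle while being served, which is the case this change targets.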