TabbyML / tabby

Self-hosted AI coding assistant
https://tabbyml.com

Ability to configure model params #915

Open · sundaraa-deshaw opened this issue 11 months ago

sundaraa-deshaw commented 11 months ago

Please describe the feature you want

We have a use case of running a fine-tuned model and would like to serve it from the Tabby server. Since Tabby only supports its embedded llama.cpp engine and does not support HTTP bindings to other endpoints (besides FastChat and Vertex AI), we would like to be able to configure some of the model parameters documented in https://github.com/ggerganov/llama.cpp/blob/019ba1dcd0c7775a5ac0f7442634a330eb0173cc/common/common.cpp#L1344
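
For reference, these are the llama.cpp flags we mean; the model path and values below are just placeholders:

```
# values are illustrative; flags as documented in the linked common.cpp
./main -m our-finetuned-model.gguf \
  --n-gpu-layers 35 \
  --temp 0.2 \
  --top-k 40 \
  --top-p 0.9
```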

For example, we would want to set a lower temperature, offload a higher number of GPU layers, and tune top_p and top_k for use with our custom fine-tuned model; a sketch of how this could look in Tabby's config follows.
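
Something along these lines in `~/.tabby/config.toml` is what we have in mind; the key names here are purely hypothetical and are not existing Tabby options:

```toml
# Hypothetical sketch only; these keys are made up for illustration and do not exist in Tabby today.
[model.completion]
model_path = "our-finetuned-model.gguf"  # placeholder path to our fine-tuned model
num_gpu_layers = 35                      # offload more layers to the GPU
temperature = 0.2                        # lower temperature for more deterministic completions
top_k = 40
top_p = 0.9
```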

Can we add this support please?


Please reply with a 👍 if you want this feature.

wsxiaoys commented 11 months ago

set a lower temperature, offload a higher number of GPU layers, and tune top_p and top_k for use with our custom fine-tuned model

Hey - could you please share the actual use case you aim to address by setting these parameters?