sundaraa-deshaw opened this issue 11 months ago
Configure a lower temperature, higher number of GPU layers, and top_p/top_k for use with our custom fine-tuned model
Hey - could you please share the actual use case you aim to achieve by setting these parameters?
Please describe the feature you want
We have a use case of running a fine-tuned model and would like to serve it from the Tabby server. Since Tabby embeds llama.cpp and does not support HTTP bindings to other endpoints (besides fastchat and vertex-ai), we would want to be able to configure some of the model parameters documented in https://github.com/ggerganov/llama.cpp/blob/019ba1dcd0c7775a5ac0f7442634a330eb0173cc/common/common.cpp#L1344
For example, we would want to set a lower temperature, a higher number of GPU layers, and top_p/top_k for use with our custom fine-tuned model; a rough sketch of the knobs we mean is below.
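For reference, these are the same sampling and offload knobs that llama.cpp already exposes through its bindings. A minimal sketch using llama-cpp-python (this is not Tabby's actual configuration surface; the model path and values are placeholders, just to illustrate the parameters we'd like to control):

```python
from llama_cpp import Llama

# Illustrative values only - model path and numbers are placeholders.
llm = Llama(
    model_path="./our-fine-tuned-model.gguf",
    n_gpu_layers=35,   # offload more layers to the GPU
    n_ctx=2048,
)

out = llm(
    "def fibonacci(n):",
    max_tokens=64,
    temperature=0.2,   # lower temperature for more deterministic completions
    top_p=0.9,
    top_k=40,
)
print(out["choices"][0]["text"])
```

Being able to pass the equivalent settings through Tabby's server configuration is what we are asking for here.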
Can we add this support please?
Please reply with a 👍 if you want this feature.