Gordonei opened this issue 6 months ago
@Gordonei that's not a difference between the server and model configs: `repeat_penalty` is specified on a per-request basis, and `penalize_newline` is not a supported option. That said, I do see the value of being able to override default parameters in cases where you don't have control of the client.
Penalizing newlines would have to be done separately, though it could be a model- or sampler-wide config.
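For reference, overriding `repeat_penalty` per request looks something like this today (a sketch against the OpenAI-compatible completions endpoint; the host, port, and model alias are placeholders):

```python
import requests

# Sketch: per-request sampling override against llama_cpp.server's
# OpenAI-compatible API. Host/port and the model alias are placeholders.
response = requests.post(
    "http://localhost:8000/v1/completions",
    json={
        "model": "my-model",       # hypothetical model alias
        "prompt": "Hello",
        "max_tokens": 16,
        "repeat_penalty": 1.1,     # per-request override
    },
    timeout=60,
)
print(response.json())
```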
Probably the most relevant case I can think of is the stop parameters, which differ on a per-model basis. It's nice to be able to hide those sorts of implementation details from clients of the API.
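To make the use case concrete, this is the shape of config I have in mind; the per-model `stop` field is the part that isn't accepted today, and the model paths, aliases, and stop sequences are illustrative:

```json
{
  "models": [
    {
      "model": "/models/mistral-7b.Q4_K_M.gguf",
      "model_alias": "mistral",
      "stop": ["</s>", "[INST]"]
    },
    {
      "model": "/models/llama-2-7b.Q4_K_M.gguf",
      "model_alias": "llama2",
      "stop": ["[INST]", "[/INST]"]
    }
  ]
}
```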
Would it just be a matter of adding the relevant fields to `ConfigFileSettings`, or would additional plumbing be needed? I ask because the answer would point me in the right direction for putting together a PR.
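For illustration only, I imagine the fields part of the change looking roughly like this (a sketch assuming the settings classes are pydantic models; the actual class layout and field names in the library may differ):

```python
# Sketch of the kind of change I mean; assumes the settings classes are
# pydantic models. The added fields are illustrative, not the real schema.
from typing import List, Optional

from pydantic import BaseModel, Field

class ModelSettings(BaseModel):
    model: str
    model_alias: Optional[str] = None
    # Proposed: server-side defaults, applied when a request omits them.
    stop: Optional[List[str]] = Field(
        default=None, description="Default stop sequences for this model"
    )
    repeat_penalty: Optional[float] = Field(
        default=None, description="Default repeat penalty for this model"
    )

class ConfigFileSettings(BaseModel):
    models: List[ModelSettings] = []
```

The part I'm unsure about is the plumbing: presumably these defaults would also need to be merged into each incoming request before sampling, which is why I ask whether adding the fields alone is enough.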
Thanks for a great project!
I might be misunderstanding, or missing some limitation that prevents certain fields from being specified in a standalone config file, but there seem to be many server parameters that aren't supported on a per-model basis.
Expected Behavior
To be able to set, via `ConfigFileSettings`, all of the parameters found in the server types.
Current Behavior
Only a subset of the configuration parameters is available in the per-model configuration.
Environment and Context
Running in Docker with `CONFIG_FILE=/config/config.json python3 -m llama_cpp.server`, with the config file mounted at `/config/config.json`.
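For reference, a minimal config of the shape that does work today (the model path and alias are placeholders):

```json
{
  "host": "0.0.0.0",
  "port": 8000,
  "models": [
    {
      "model": "/models/model.gguf",
      "model_alias": "default"
    }
  ]
}
```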
Failure Information (for bugs)
This is an explicit type error raised during config file validation.
Steps to Reproduce
Create a model config JSON and attempt to set any of the unsupported parameters, for example:
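A model entry along these lines triggers the failure; the path and alias are placeholders, and `repeat_penalty` (noted above as per-request only) is one of the fields the model config does not accept:

```json
{
  "models": [
    {
      "model": "/models/model.gguf",
      "model_alias": "default",
      "repeat_penalty": 1.1
    }
  ]
}
```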
Failure Logs
As expected, the config file validation fails on startup.