Closed lucyknada closed 8 months ago
For model loading on startup, you can set parameters in config.yml. Please look at config_sample.yml and adapt it to your needs.
Also please read the Configuration section of the README.
Basically, copy config_sample.yml to config.yml, comment out (or remove) what you don't need, replace what you do need, and start TabbyAPI.
In your case, I'd replace model_name with the folder of the model you want to load and model_dir with the directory where your models are kept.
These parameters select the model to load on startup. You can change the model later through the API, using the /v1/model/unload and /v1/model/load endpoints.
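For reference, here is a rough sketch of what switching models through those two endpoints could look like from a script. Only the endpoint paths come from the comment above; the base URL, the x-admin-key header, the use of POST for both calls, and the {"name": ...} request body are assumptions that should be checked against the API docs of your TabbyAPI version.

```python
import json
import urllib.request

# Assumptions (not confirmed in this thread): host/port, header name,
# HTTP method, and the shape of the load payload.
BASE_URL = "http://127.0.0.1:5000"
ADMIN_KEY = "your-admin-key"  # placeholder, replace with your own key


def build_request(path, payload=None):
    """Build a POST request for an admin endpoint without sending it."""
    data = json.dumps(payload).encode("utf-8") if payload is not None else None
    return urllib.request.Request(
        BASE_URL + path,
        data=data,
        method="POST",
        headers={"Content-Type": "application/json", "x-admin-key": ADMIN_KEY},
    )


def switch_model(model_name, opener=urllib.request.urlopen):
    """Unload the current model, then load `model_name`.

    `opener` is injectable so the request sequence can be checked
    without a running server.
    """
    requests = (
        build_request("/v1/model/unload"),
        build_request("/v1/model/load", {"name": model_name}),
    )
    for req in requests:
        with opener(req) as resp:
            resp.read()  # drain the response; errors raise from urlopen
```

With a server running, switch_model("MyModel") would send the unload request first and the load request second. This only works once the API has fully started, which is the limitation raised in the next reply.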
I'm aware of this, thanks, but if I want to have multiple shortcuts for different models, I can't do that; I need to edit the config each time (or send a web request to the API once it has fully started). Switching between 4 or so models becomes very cumbersome.
What you're describing isn't really model switching, but rather a way to quickly override the init settings without opening the config.yml file. Backends such as ooba and koboldcpp are argument-first, while tabby is a config-first program, since that's more flexible and much easier for headless setups.
Personally, I'm not a fan of argparse due to the amount of arguments that a user has to remember.
But, a small argparser for a case like yours is compelling. It won't contain everything because that's what config.yml is for.
I'll think about it more, and will keep this open.
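To make the idea concrete, a small argparser that only overrides a couple of startup values might look roughly like the sketch below. This is not the code from any commit in this thread: the flag names --model-name and --model-dir, the helper names, and the flat config dict are all hypothetical and chosen only to mirror the config.yml keys mentioned earlier.

```python
import argparse


def parse_overrides(argv=None):
    """Parse a deliberately small set of startup overrides.

    Flag names are hypothetical. Anything not listed here stays
    in config.yml.
    """
    parser = argparse.ArgumentParser(description="Startup overrides (sketch)")
    parser.add_argument("--model-name", help="Folder name of the model to load")
    parser.add_argument("--model-dir", help="Directory where models are kept")
    args = parser.parse_args(argv)
    # Keep only the flags the user actually passed.
    return {key: value for key, value in vars(args).items() if value is not None}


def apply_overrides(config, overrides):
    """Return a copy of the config with command-line overrides applied on top."""
    merged = dict(config)
    merged.update(overrides)
    return merged
```

The config file remains the source of truth and arguments only win for the keys that were explicitly passed, so a shortcut per model needs just one flag.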
Implemented in bb7a8e4614dad659de7833c23de68a3f25e29dd3
This may expand over time, but it fulfills the needs of this issue.
To add, this is only for startup. Args or config cannot hot reload a model while the API is running. You must use requests or a UI for that.
I know tabby is supposed to be mostly API-based, but would a PR that introduces a --model argument be appropriate / accepted? The reason is that many of us come from koboldcpp and ooba, which both let you create a shortcut with the model as its parameter; not many of us want to run tabby just to then send a web request to change the model.
I know there's a Gradio UI that, I assume, kind of mimics ooba in that sense, but that would still not allow loading a model immediately on click.
The total changes would really just be something like:
and replace:
https://github.com/theroyallab/tabbyAPI/blob/f5314fcdad13524379c5519cccfc496f35f4ba51/main.py#L508
with
Thanks!