When changing from the nitro.exe REST service to cortex-cpp.exe, some breaking changes are introduced.
Loading and unloading models:
before: /inferences/llamacpp/[un]loadmodel
after: /inferences/server/[un]loadmodel
Calling the old path now returns a 404.
This is a minor issue as long as it is known and documented.
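A minimal sketch of calling the renamed endpoint. The host/port (`localhost:3928`) and the `llama_model_path` payload field are assumptions carried over from nitro's API; adjust them to your cortex-cpp setup.

```python
import json
import urllib.request

# Assumed default host/port; adjust to your cortex-cpp instance.
BASE_URL = "http://localhost:3928"

def load_model_request(model_path: str) -> urllib.request.Request:
    """Build a POST request for the renamed cortex-cpp endpoint.

    Old path: /inferences/llamacpp/loadmodel  (now returns 404)
    New path: /inferences/server/loadmodel
    """
    # "llama_model_path" is the field name nitro used; assumed unchanged.
    payload = json.dumps({"llama_model_path": model_path}).encode("utf-8")
    return urllib.request.Request(
        BASE_URL + "/inferences/server/loadmodel",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = load_model_request("/models/example-model.gguf")
# Send with: urllib.request.urlopen(req)
```

The same builder works for unloading by swapping in `/inferences/server/unloadmodel`.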
Chat completion changed:
/v1/chat/completions
Together with the prompt, some parameters are sent, like temperature, max_tokens, etc.
before: the model id is not required
after: the model id is required
I don't understand why I need to send the model id when I have already loaded the model. Is it possible to load several models?
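A sketch of the request shape that cortex-cpp now expects, with the `model` field included. The host/port, model id, and sampling parameter values are placeholders, not values confirmed by the source.

```python
import json
import urllib.request

BASE_URL = "http://localhost:3928"  # assumed default; adjust as needed

def chat_request(prompt: str, model_id: str) -> urllib.request.Request:
    # cortex-cpp rejects requests without a model id, even when
    # only one model has been loaded via /inferences/server/loadmodel.
    payload = {
        "model": model_id,  # required after the switch to cortex-cpp
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.7,   # example sampling parameters
        "max_tokens": 256,
    }
    return urllib.request.Request(
        BASE_URL + "/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = chat_request("Hello", "example-model")
# Send with: urllib.request.urlopen(req)
```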
Steps to reproduce the behavior:
Send a POST request to the web API /v1/chat/completions without a model id.