As discussed in https://github.com/janhq/jan/issues/3745, the o1 models' endpoints enforce fixed values for several parameters. The current model.json files therefore need to be updated accordingly.
This PR corrects the payload transformation for these models and returns graceful error messages as soon as inference errors occur.
A few unsupported parameters have been removed to prevent confusion.
Fixes Issues
#3745
#3771
Changes made
The diff covers changes across multiple files. Here's a concise breakdown:
sse.ts (Helper for SSE requests):
Added a check for an overridden stream parameter in the request body. If stream is false in either the request body or the model parameters, the helper handles the request as non-streaming (see the sketch below).
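A minimal sketch of that check; the names (resolveStream, requestBody, modelParams) are illustrative, not the helper's actual identifiers:

```typescript
// Sketch only: resolve the effective `stream` flag, letting an override in
// the request body win over the model's default parameters.
type ModelParams = { stream?: boolean }

function resolveStream(
  requestBody: Record<string, unknown>,
  modelParams: ModelParams
): boolean {
  if (typeof requestBody.stream === 'boolean') return requestBody.stream
  if (typeof modelParams.stream === 'boolean') return modelParams.stream
  return true // default to streaming when neither side says otherwise
}

// When resolveStream(...) returns false, the helper would issue a plain
// request and emit the full response in one chunk instead of reading SSE events.
```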
models.json (Model Configuration File):
Removed stream, stop, and other unsupported parameters from the model configurations.
Updated max_tokens values to significantly higher numbers (32768 and 65536).
Pinned temperature and top_p to 1, matching the fixed endpoint values (an illustrative config entry is sketched below).
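For illustration, an o1-style entry could look roughly like this after the update; the field layout is assumed rather than copied from the shipped file, and pairing o1-mini with the 65536 limit is an assumption:

```json
{
  "id": "o1-mini",
  "parameters": {
    "max_tokens": 65536,
    "temperature": 1,
    "top_p": 1
  }
}
```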
index.ts (inference-openai-extension):
Simplified the transformPayload function by removing unused destructured variables temperature, top_p, and stop.
Kept stream: false for the specific models that only support non-streaming requests (sketched below).
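The general shape of such a transform, as a sketch rather than the actual diff (the interface and omission-by-destructuring are illustrative):

```typescript
interface RequestPayload {
  model: string
  messages: unknown[]
  [key: string]: unknown
}

// Sketch only: drop parameters the o1 endpoints pin server-side and force a
// non-streaming request, since these models reject `stream: true`.
function transformPayload(payload: RequestPayload): RequestPayload {
  const { temperature, top_p, stop, ...rest } = payload
  return { ...rest, stream: false }
}
```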
package.json (Model Extension):
Incremented the package version from 1.0.34 to 1.0.35.
inference-cortex-extension/src/index.ts:
Reordered queue operations in the onLoad method of the JanInferenceCortexExtension class:
The call to this.clean() remains, but the surrounding queue structure is slightly rearranged.
The call to executeOnMain is now added to the queue using this.queue.add().
The calls to this.healthz() and this.setDefaultEngine(systemInfo) are added to the queue afterward, so the steps run strictly in sequence (see the sketch below).
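A sketch of the resulting order, assuming a p-queue-style serial queue and stand-in method bodies (not the real class):

```typescript
import PQueue from 'p-queue'

// Illustrative only: shows the enqueue order, nothing more.
class CortexExtensionSketch {
  private queue = new PQueue({ concurrency: 1 })

  async onLoad(): Promise<void> {
    this.queue.add(() => this.clean())            // 1. clean up stale state
    this.queue.add(() => this.spawnCortex())      // 2. stand-in for the executeOnMain call
    this.queue.add(() => this.healthz())          // 3. wait until the server is healthy
    this.queue.add(() => this.setDefaultEngine()) // 4. then pick the default engine
  }

  private async clean(): Promise<void> {}
  private async spawnCortex(): Promise<void> {}
  private async healthz(): Promise<void> {}
  private async setDefaultEngine(): Promise<void> {}
}
```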
model-extension/src/index.ts:
Modified the return logic for an empty toImportModels array in the JanModelExtension class:
Instead of immediately returning fetchedModels, the logic now concatenates the legacyModels entries that have the vision_model setting onto fetchedModels, so legacy vision models are still included (sketched below).
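A sketch of the changed return path, with hypothetical types (the real Model shape carries more fields):

```typescript
interface Model {
  id: string
  settings?: { vision_model?: boolean }
}

// Sketch only: when nothing needs importing, merge legacy vision models
// into the result instead of returning fetchedModels alone.
function resolveModels(
  toImportModels: unknown[],
  fetchedModels: Model[],
  legacyModels: Model[]
): Model[] {
  if (toImportModels.length === 0) {
    const legacyVision = legacyModels.filter(
      (m) => m.settings?.vision_model !== undefined
    )
    return fetchedModels.concat(legacyVision)
  }
  return fetchedModels // the import path continues elsewhere in the extension
}
```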
In short, the changes cover model-configuration updates, payload-transformation logic for o1 model handling, and a package version bump.