As discussed in https://github.com/janhq/jan/issues/3745, the o1 models' endpoints enforce fixed values for several parameters. The current model.json files therefore need to be updated accordingly.
This PR corrects the payload transformation for these models and returns graceful error messages as soon as inference errors occur.
A few unsupported parameters have been removed to prevent confusion.
Fixes Issues
#3745
#3771
Changes made
The diff covers changes across multiple files. Here's a concise breakdown:
sse.ts (Helper for SSE requests):
Added a check for an overridden stream parameter in the request body. If stream is false in either the request body or the model parameters, the helper handles the request as non-streaming (see the sketch below).
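A minimal sketch of that check; the names (resolveStream, requestBody, modelParams) are illustrative, not the helper's actual identifiers:

```typescript
// Sketch only: resolve the effective `stream` flag, letting an override in
// the request body win over the model's default parameters.
type ModelParams = { stream?: boolean }

function resolveStream(
  requestBody: Record<string, unknown>,
  modelParams: ModelParams
): boolean {
  if (typeof requestBody.stream === 'boolean') return requestBody.stream
  if (typeof modelParams.stream === 'boolean') return modelParams.stream
  return true // default to streaming when neither side says otherwise
}

// When resolveStream(...) returns false, the helper would issue a plain
// request and emit the full response in one chunk instead of reading SSE events.
```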
models.json (Model Configuration File):
Removed stream, stop, and other unsupported parameters from the model configurations.
Updated max_tokens values to significantly higher numbers (32768 and 65536).
Pinned temperature and top_p to 1, matching the fixed endpoint values (an illustrative config entry is sketched below).
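For illustration, an o1-style entry could look roughly like this after the update; the field layout is assumed rather than copied from the shipped file, and pairing o1-mini with the 65536 limit is an assumption:

```json
{
  "id": "o1-mini",
  "parameters": {
    "max_tokens": 65536,
    "temperature": 1,
    "top_p": 1
  }
}
```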
index.ts (inference-openai-extension):
Simplified the transformPayload function by removing unused destructured variables temperature, top_p, and stop.
Kept stream: false for the specific models that only support non-streaming requests (sketched below).
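The general shape of such a transform, as a sketch rather than the actual diff (the interface and omission-by-destructuring are illustrative):

```typescript
interface RequestPayload {
  model: string
  messages: unknown[]
  [key: string]: unknown
}

// Sketch only: drop parameters the o1 endpoints pin server-side and force a
// non-streaming request, since these models reject `stream: true`.
function transformPayload(payload: RequestPayload): RequestPayload {
  const { temperature, top_p, stop, ...rest } = payload
  return { ...rest, stream: false }
}
```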
package.json (Model Extension):
Incremented the package version from 1.0.34 to 1.0.35.
inference-cortex-extension/src/index.ts:
Reordered queue operations in the onLoad method of the JanInferenceCortexExtension class:
The call to this.clean() remains, but the surrounding queue structure is slightly rearranged.
The call to executeOnMain is now added to the queue using this.queue.add().
The calls to this.healthz() and this.setDefaultEngine(systemInfo) are added to the queue afterward, so the steps run strictly in sequence (see the sketch below).
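A sketch of the resulting order, assuming a p-queue-style serial queue and stand-in method bodies (not the real class):

```typescript
import PQueue from 'p-queue'

// Illustrative only: shows the enqueue order, nothing more.
class CortexExtensionSketch {
  private queue = new PQueue({ concurrency: 1 })

  async onLoad(): Promise<void> {
    this.queue.add(() => this.clean())            // 1. clean up stale state
    this.queue.add(() => this.spawnCortex())      // 2. stand-in for the executeOnMain call
    this.queue.add(() => this.healthz())          // 3. wait until the server is healthy
    this.queue.add(() => this.setDefaultEngine()) // 4. then pick the default engine
  }

  private async clean(): Promise<void> {}
  private async spawnCortex(): Promise<void> {}
  private async healthz(): Promise<void> {}
  private async setDefaultEngine(): Promise<void> {}
}
```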
model-extension/src/index.ts:
Modified the return logic for an empty toImportModels array in the JanModelExtension class:
Instead of immediately returning fetchedModels, the logic now concatenates the legacyModels entries that have the vision_model setting onto fetchedModels, so legacy vision models are still included (sketched below).
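A sketch of the changed return path, with hypothetical types (the real Model shape carries more fields):

```typescript
interface Model {
  id: string
  settings?: { vision_model?: boolean }
}

// Sketch only: when nothing needs importing, merge legacy vision models
// into the result instead of returning fetchedModels alone.
function resolveModels(
  toImportModels: unknown[],
  fetchedModels: Model[],
  legacyModels: Model[]
): Model[] {
  if (toImportModels.length === 0) {
    const legacyVision = legacyModels.filter(
      (m) => m.settings?.vision_model !== undefined
    )
    return fetchedModels.concat(legacyVision)
  }
  return fetchedModels // the import path continues elsewhere in the extension
}
```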
In short, the changes cover model-configuration updates, payload-transformation logic for o1 model handling, and a package version bump.