architecture: Remote API Extension Revamp

dan-homebrew commented 1 month ago

Goal

Remote API extensions are modular (e.g. separate Github repo)
Remote API extensions can refresh their model list on-demand (preferably by calling their list API)
OR: users can specify the model list in Remote Extensions (e.g. comma separated list)
Remote API Extensions can fetch an updated model list (e.g. params that can be passed in)
Users can select model from list, once
Users should only see major models (i.e. not nightly)
We will not need model.yaml for each remote model
[Stretch] Users can add additional model names in the Remote API's Settings page (e.g. nightly)

Tasklist

[ ] Review all current issues
[ ] Propose Architecture Spec for Remote API Extension: https://github.com/orgs/janhq/discussions/3568
[ ] Document Remote API Extension API

Out-of-scope

Getting people to add Remote API Extensions
Refactor Remote API Extensions into separate repo/npm package (e.g. groq-extension)

Tasklist

Remote API Extensions

[ ] Nvidia NIM
- https://github.com/janhq/jan/issues/3157
[ ] OpenRouter
[ ] OpenAI
[ ] Groq

Existing Issues

[ ] #3452
[x] #3157
[ ] #3392
[ ] https://github.com/janhq/jan/issues/3452
[ ] https://github.com/janhq/jan/issues/3194
[ ] https://github.com/janhq/jan/issues/3370
[x] https://github.com/janhq/jan/issues/3374

0xSage commented 1 month ago

Dupe of https://github.com/janhq/jan/issues/3374

louis-jan commented 2 weeks ago

Separation of Concerns

How models list work?
- Remote extensions should work with autopopulating models, aka /models list.
- We could not build hundreds model.json files manually.
- The current extension framework is actually designed to handle this, it's just an implementation issue from extensions, which can be improved.
- There was a hacky UI implementation where we pre-populated models, then disabled all of them until the API key was set. That should be a part of the extension, not the Jan app.
- Extension builder still ships default available models. We don't close the door, we improve the example.
```
// Before
override async onLoad(): Promise<void> {
super.onLoad()
// Register Settings (API Key, Endpoints)
this.registerSettings(SETTINGS)

// Pre-populate models - persist model.json files
// MODELS are model.json files that come with the extension.
this.registerModels(MODELS)
}
```
// After override async onLoad(): Promise { super.onLoad() // Register Settings (API Key, Endpoints) this.registerSettings(SETTINGS)

// Fetch models from provider models endpoint - just a simple fetch // Default to /models get('/models') .then((models) => { // Model builder will construct model template (aka preset) // This operation builds Model DTOs that works with the app. this.registerModels(this.modelBuilder.build(models)) }) }

Remote Provider Extension

Draw.io

https://drive.google.com/file/d/1pl9WjCzKl519keva85aHqUhx2u0onVf4/view?usp=sharing

Supported parameters?
- Each provider works with different parameters, but they all share the same basic function with the current ones defined.
- We've already supported transformPayload and transformResponse to adapt to these cases.
- So users still see parameters consistent from model to model, but the magic happens behind the scenes, where the transformations are simplified under the hood.
```
/**
```
- transformPayload Example
- Tranform the payload before sending it to the inference endpoint.
- The new preview models such as o1-mini and o1-preview replaced max_tokens by max_completion_tokens parameter.
- Others do not. */ transformPayload = (payload: OpenAIPayloadType): OpenAIPayloadType => { // Transform the payload for preview models if (this.previewModels.includes(payload.model)) { const { max_tokens, ...params } = payload return { ...params, max_completion_tokens: max_tokens } } // Pass through for officialw models return payload }
Decoration?
- We've currently hard-coded many provider metadata from Jan, which could cause issues with future installed extensions.
- The decoration should be done from the Extension Manifest (package.json).
- https://code.visualstudio.com/api/references/extension-manifest
```
{
"name": "openai-extension",
"displayName": "OpenAI Extension Provider",
"icon": "https://openai.com/logo.png"
}
```
Just remove the hacky parts from Jan.
- Model Dropdown: It checks if the engine is nitro or others, filtering for local versus cloud sections. New local engines will be treated as remote engines (e.g. cortex.cpp). -> Filter by Extension type (class name or type, e.g. LocalOAIEngine vs RemoteOAIEngine).
- All models from the cloud provider are disabled by default if no API key is set. What if I use a self-hosted endpoint without API key restrictions? Models available or not should be determined from the extensions, when there are no credentials to meet the requirements, it will result in an empty section, indicating no available models. When users input the API-Key from extension settings page, it will fetch model list automatically and cache. Users can also refresh the models list from there (should not fetch so many times, we are building a local-first application)
- Application settings can be a bit confusing, with Model Providers and Core Extensions listed separately. Where do other extensions fit in?

Extension settings do not have a community or "others" section

Extensions installation is a straightforward process that requires minimal effort.
- There is no official way to install extensions from a GitHub repository URL. Users typically don't know how to package and install software from sources.
- There should be a shortcut from the settings page that allows users to input the URL, pop up the extension repository details, and then install from there.
It would be helpful to provide a list of community extensions, allowing users to easily find the right extension for their specific use case without having to search.

dan-homebrew commented 1 week ago

Idea from @norrybul: https://github.com/janhq/models/issues/23#issuecomment-2381136479

louis-jan commented 1 week ago

Idea from @norrybul: janhq/models#23 (comment)

Hi @dan-homebrew, that's what we initially thought we should do, but there are a couple of problems, so we've pushed back the Custom OAI Extension:

Limitations of UI support in extensions.
Model pre-population would establish a 1-1 mapping between model.json and extension settings. Once the Model Cache is complete, the extension no longer relies on model.json.
Why not use existing extensions? E.g. OpenAI, OpenRouter...
It's a good example of community extension?

janhq / jan

architecture: Remote API Extension Revamp #3505

Goal

Tasklist

Out-of-scope

Tasklist

Remote API Extensions

Existing Issues

Separation of Concerns