explainers-by-googlers / prompt-api

A proposal for a web API for prompting browser-provided language models

Register model #11

Open niutech opened 2 months ago

niutech commented 2 months ago

Let's allow developers to register a new LLM in a web browser via a web extension; the registered model could then be chosen in #8. The model would be in the TFLite FlatBuffers format, so that it is compatible with MediaPipe LLM Inference as a possible fallback for unsupported browsers (and compatible with Gemini Nano).

The method to register/add a custom model could be invoked by a web extension like this:

ai.registerModel({
    id: 'phi-3-mini',
    version: '3.0',
    file: 'chrome-extension://azipopnxdpcknwapfrtdedlnjjkmpnao/phi-3-mini.bin',
    loraFile: 'chrome-extension://azipopnxdpcknwapfrtdedlnjjkmpnao/phi-3-mini-lora.bin', // optional
    defaultTemperature: 0.5,
    defaultTopK: 3,
    maxTopK: 10
})
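
Since ai.registerModel would be a new entry point, an extension would presumably need to feature-detect it before calling; a minimal sketch:

// Guard the proposed API: browsers without this proposal won't expose it.
if (globalThis.ai?.registerModel) {
  await ai.registerModel({ /* the descriptor shown above */ });
} else {
  // No registration API available; the extension could instead run the
  // model itself via MediaPipe LLM Inference, as suggested above.
}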

Web apps could then list the available models like this:

const models = await ai.listModels(); // ['gemini-nano', 'phi-3-mini']

The model metadata could be accessed like this:

const modelInfo = await ai.textModelInfo('phi-3-mini'); // {id: 'phi-3-mini', version: '3.0', defaultTemperature: 0.5, defaultTopK: 3, maxTopK: 10}
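
Putting the pieces together, a web app might select and use a registered model roughly like this. This is only a sketch: the model option is hypothetical (how a model gets chosen is exactly what #8 is about), and ai.assistant.create / session.prompt follow the explainer's current session shape, which may change.

const models = await ai.listModels(); // e.g. ['gemini-nano', 'phi-3-mini']
// Prefer the extension-registered model when present.
const id = models.includes('phi-3-mini') ? 'phi-3-mini' : models[0];
const info = await ai.textModelInfo(id);
const session = await ai.assistant.create({
  model: id, // hypothetical option, pending #8
  temperature: info.defaultTemperature,
  topK: info.defaultTopK,
});
console.log(await session.prompt('Write a haiku about browsers.'));
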
captainbrosset commented 1 month ago

This is more or less what VS Code does. See https://code.visualstudio.com/api/extension-guides/language-model and https://code.visualstudio.com/api/references/vscode-api#lm

VS Code extension devs can use LLMs, and they do this by first choosing a model from a predefined list with selectChatModels. The LLMs are contributed by other extensions, although I don't think the docs explain how yet.
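
For reference, a minimal sketch of that flow using the vscode.lm API: select a contributed chat model, then stream a response. The vendor filter and prompt are illustrative, and token is the vscode.CancellationToken the caller (e.g. a chat participant handler) passes in.

const vscode = require('vscode');

async function promptModel(token) {
  // Models are contributed by other extensions; the list may be empty.
  const [model] = await vscode.lm.selectChatModels({ vendor: 'copilot' });
  if (!model) {
    return; // no matching model available
  }
  const messages = [vscode.LanguageModelChatMessage.User('Hello!')];
  const response = await model.sendRequest(messages, {}, token);
  for await (const fragment of response.text) {
    console.log(fragment);
  }
}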

KenjiBaheux commented 1 week ago

Thanks for the detailed proposal.

While enabling developers to register custom LLMs via web extensions offers interesting possibilities, we need to carefully consider the implications of strong identifiers. They can limit flexibility: we'd like to avoid an explosion of models (or versions of a given model) when what's already available could be sufficient, and to avoid being stuck with whatever was popular at a given time after better options have become available. They also hurt portability across browsers: I imagine that chrome-extension://[id] URLs only make sense in Chrome. Especially with large models, it seems important to minimize over-reliance on a specific version of a model, or on a specific "location" (an origin, or an extension ID) for the model.
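
To illustrate the concern, compare the two selection styles; selectModel and its option names below are invented for this sketch, not proposed API:

// Strong identifier: pins the app to one model, version, and location.
const pinned = await ai.textModelInfo('phi-3-mini');

// Capability-based: the browser picks any suitable model it has on hand
// (selectModel and these options are purely illustrative).
const flexible = await ai.selectModel({
  task: 'text-generation',
  languages: ['en'],
});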

See this related discussion: issue #5, which goes beyond the built-in AI APIs. We encourage you to engage there to contribute to a more future-proof solution.