explainers-by-googlers / prompt-api

A proposal for a web API for prompting browser-provided language models

Choose model #8

Open · niutech opened this issue 4 days ago

niutech commented 4 days ago

There could be more than one LLM in a web browser (built-in or added as a web extension). Let's show users the list of available LLMs (using their IDs) and allow them to optionally choose a model when creating a session.

For example:

const models = await ai.listModels(); // ['gemini-nano', 'phi-3-mini']
const session = await ai.createTextSession({
  model: models[1] // 'phi-3-mini'
});
const modelInfo = await ai.textModelInfo(models[1]); // {id: 'phi-3-mini', version: '3.0', defaultTemperature: 0.5, defaultTopK: 3, maxTopK: 10}

captainbrosset commented 4 days ago

"Let's show users the list of available LLMs"

I guess you meant developer here, not user, right?

niutech commented 4 days ago

Users of the API, i.e. developers.

christianliebel commented 4 days ago

I assume this could be problematic as it would create a fingerprinting vector, compromising user privacy. Additionally, this approach might lack forward compatibility, as models are likely to evolve and change over time. A more robust solution could be to expose metadata about each model, such as context window size, number of parameters, supported languages, and relevant capabilities (translation, etc.). This way, developers can make informed decisions based on the features and performance characteristics they need without directly exposing model IDs.
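
For illustration, a rough sketch of what requirement-based session creation could look like (the requirements option and its fields are hypothetical and not part of the current explainer):

// Hypothetical: describe the needed features instead of naming a model
const session = await ai.createTextSession({
  requirements: {
    minContextWindow: 4096,
    languages: ['en', 'de'],
    capabilities: ['translation']
  }
});
// The browser picks any available model that satisfies the requirements,
// or rejects if none does, without ever exposing a model ID to the page.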

niutech commented 4 days ago

@christianliebel How would exposing the model ID be more problematic in terms of fingerprinting, when the user-agent name and version are already available, as well as textModelInfo, from which the built-in LLM can easily be deduced (Google Chrome -> Gemini Nano)? I'm proposing to return even more detailed model metadata:

const modelInfo = await ai.textModelInfo('gemini-nano'); // {id: 'gemini-nano', version: '1.0', defaultTemperature: 0.8, defaultTopK: 3, maxTopK: 10}

This would allow web developers to choose the best-fitting local model depending on the use case (e.g. math, reasoning, poetry). Also, there should be a way to add custom models as web extensions (#11).
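
For illustration, a page could then pick a model based on that metadata (this uses the listModels()/textModelInfo() shape proposed above, which is hypothetical, with a deliberately naive selection heuristic):

const models = await ai.listModels();
const infos = await Promise.all(models.map((id) => ai.textModelInfo(id)));
// e.g. prefer the model with the lowest default temperature for math/reasoning prompts
const best = infos.reduce((a, b) => (b.defaultTemperature < a.defaultTemperature ? b : a));
const session = await ai.createTextSession({ model: best.id });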

christianliebel commented 4 days ago

The composition of available models (especially when you can register custom ones) could be pretty unique, similar to the list of installed fonts.
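
For illustration, a sketch of how a page could fold the model list into a fingerprint, much like enumerating installed fonts (again assuming the hypothetical listModels() above):

const models = await ai.listModels(); // e.g. ['gemini-nano', 'phi-3-mini', 'my-custom-model']
// A sorted, joined list is fairly stable per browser profile once custom models are installed
const data = new TextEncoder().encode(models.sort().join('|'));
const digest = await crypto.subtle.digest('SHA-256', data);
// Combine the digest with UA, fonts, canvas, etc. to build a stable identifier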

niutech commented 3 days ago

@christianliebel It's the same level of uniqueness as detecting which web extensions are installed, e.g. with extension-detector. Even ad blockers can be detected. I think the ability to choose among multiple local LLMs justifies a slightly bigger fingerprinting surface. If you care about privacy, you just won't install any additional LLMs.