Open kenzic opened 3 weeks ago
(Meta comment: Are you aware of the Prompt API proposal?)
(Also, did you see https://github.com/WICG/proposals/issues/147 and https://github.com/WICG/proposals/issues/163 (both accepted) for more concrete tasks like translation, language detection, summarization, writing, and rewriting?)
Yes, I saw those, and I have worked with the Prompt API, which is what inspired this proposal. I see this as an alternative approach to implementing AI APIs, one that makes them more open and flexible. This proposal isn’t about focusing on specific tasks like prompt generation or summarization. Instead, it’s about creating a bridge to a model runtime that gives access to models suited to each specific use case. I think building separate APIs for each task (a prompt-specific API, a translation-specific API, and so on) is the wrong direction. It’s better to have an API that lets the developer or user choose the model with the right capabilities for their needs.
(Great, was just wondering, since your proposal didn't mention these previous efforts.)
Thanks for the feedback. I updated the description to include "Other Considerations" which discusses this.
Introduction
The Browser AI API is a proposal for a new browser feature that makes AI models accessible directly on users' devices through the browser. By offering a simple API available on the window object, this approach would allow websites to leverage AI without sending data to the cloud, preserving privacy, reducing latency, and enabling offline functionality.
This is about empowering developers to integrate advanced AI into web apps, without needing heavy infrastructure, while giving users control over their data and the models they choose to run. Imagine a world where on-device AI enhances web apps in real time, with no data leaving the device and no reliance on external servers.
The API would let developers:
By running models directly on the user's hardware, we’re opening up new possibilities for AI-driven web apps while keeping things secure, private, and available offline.
Prototype
I created a prototype of this concept for review: https://github.com/kenzic/browser.ai
API Overview
The proposed API would be exposed on the window.ai object with the following high-level structure:
Permissions
Before using any models, websites must first query available models and request permission:
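The query-then-request flow described above could look roughly like the sketch below. The proposal's own listing is not reproduced in this text, so every name here (ModelInfo, AiPermissions, models(), request(), ensureModelAccess, createFakePermissions) is a hypothetical stand-in rather than the proposed API. In a browser the permissions object would presumably hang off window.ai; an in-memory fake is used here so the sketch runs on its own.

```typescript
// Hypothetical shapes: assumed for illustration, not taken from the proposal.
interface ModelInfo {
  name: string;
  enabled: boolean; // whether the user has already granted access
}

interface AiPermissions {
  models(): Promise<ModelInfo[]>;
  request(options: { model: string }): Promise<boolean>;
}

// Query the available models first, then ask for permission only if needed.
async function ensureModelAccess(
  permissions: AiPermissions,
  model: string
): Promise<boolean> {
  const available = await permissions.models();
  const match = available.find((m) => m.name === model);
  if (match === undefined) {
    return false; // the model is not available on this device
  }
  if (match.enabled) {
    return true; // permission was granted earlier
  }
  return permissions.request({ model });
}

// In-memory stand-in so the sketch runs outside a browser.
function createFakePermissions(names: string[]): AiPermissions {
  const granted = new Set<string>();
  return {
    async models() {
      return names.map((name) => ({ name, enabled: granted.has(name) }));
    },
    async request({ model }) {
      if (names.indexOf(model) === -1) {
        return false; // unknown model: nothing to grant
      }
      granted.add(model); // the fake always says yes; a browser would prompt the user
      return true;
    },
  };
}
```

Passing the permissions object in as a parameter, instead of reading a global, keeps the flow testable and avoids assuming exactly where the browser would expose it.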
Model Sessions
Once permission is granted, websites can create sessions to interact with models:
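As a rough illustration of the session idea, the sketch below opens a session and uses it for one chat turn and one embedding call. All names (ChatMessage, ModelSession, ModelConnector, connect, chat, embed, askAndEmbed, fakeConnector) and their argument shapes are assumptions made for this example, not the proposal's confirmed surface. The fake connector only echoes text and returns a toy vector.

```typescript
// Hypothetical shapes: assumed for illustration only.
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface ModelSession {
  chat(options: { messages: ChatMessage[] }): Promise<ChatMessage>;
  embed(options: { input: string }): Promise<number[]>;
}

interface ModelConnector {
  connect(options: { model: string }): Promise<ModelSession>;
}

// Open a session, send one user message, and embed the same text.
async function askAndEmbed(
  connector: ModelConnector,
  model: string,
  question: string
): Promise<{ answer: string; vector: number[] }> {
  const session = await connector.connect({ model });
  const reply = await session.chat({
    messages: [{ role: "user", content: question }],
  });
  const vector = await session.embed({ input: question });
  return { answer: reply.content, vector };
}

// Stand-in connector so the sketch runs without a real model runtime.
const fakeConnector: ModelConnector = {
  async connect({ model }) {
    return {
      async chat({ messages }) {
        const last = messages[messages.length - 1];
        const text = last ? last.content : "";
        return { role: "assistant", content: `[${model}] echo: ${text}` };
      },
      async embed({ input }) {
        // Toy embedding (text length, vowel count); not real model output.
        const vowels = input.replace(/[^aeiou]/gi, "").length;
        return [input.length, vowels];
      },
    };
  },
};
```

Keeping chat and embed on the same session object mirrors the proposal's point that one bridge to a model runtime can serve several tasks, with the model choice deciding which capabilities are available.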
Chat
Embed
WebIDL
Key Benefits
Use Cases
This API opens the door for all kinds of innovative web applications:
Technical Considerations
Other Considerations
There are similar proposals, such as the Prompt API proposal.
I see this as an alternative approach to implementing AI APIs, one that makes them more open and flexible. This proposal isn’t about focusing on specific tasks like prompt generation or summarization. Instead, it’s about creating a bridge to a model runtime that gives access to models suited to each specific use case. I think building separate APIs for each task—like prompt-specific or translation-specific APIs—is the wrong direction. It’s better to have an API that lets the developer or user choose the model with the right capabilities for their needs.
Feedback
Please provide all feedback below.