WICG / proposals

A home for well-formed proposed incubations for the web platform. All proposals welcome.
https://wicg.io/

Embeddings support in Chrome local #177

Open roschler opened 4 weeks ago

roschler commented 4 weeks ago

Introduction

The Chrome local API with Gemini Nano support opens the door to doing Retrieval-Augmented Generation (RAG) entirely client-side in the browser. To get all the way there, however, we also need client-side embeddings.

Use Cases (Recommended)

The use case is the one stated above: supporting Retrieval-Augmented Generation entirely in the browser.

Goals (Optional)

Bundle into the browser, or load at startup, a sentence-embedding model such as Google's Universal Sentence Encoder, and expose it via a Chrome local API.

Non-goals (Optional)

It would be great to also have client-side support for a Hierarchical Navigable Small World (HNSW) index, but my belief is that this can wait, because a plain brute-force search can compare thousands of vectors in a reasonable amount of time.
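To make the "this can wait" point concrete, here is a minimal sketch of an exhaustive search without any index: score every stored vector against the query with cosine similarity and keep the best k. The names (StoredVector, Match, cosineSimilarity, topK) are made up for illustration and are not part of any Chrome API.

```typescript
// Brute-force nearest-neighbour search over in-memory embeddings.
// All names here are illustrative; nothing below is a Chrome API.

interface StoredVector {
  id: string;
  vector: Float32Array;
}

interface Match {
  id: string;
  score: number;
}

// Cosine similarity of two equal-length vectors.
// Returns 0 if either vector has zero length (norm).
function cosineSimilarity(a: Float32Array, b: Float32Array): number {
  if (a.length !== b.length) {
    throw new Error("Vectors must have the same dimension");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / Math.sqrt(normA * normB);
}

// Score every stored vector against the query, sort by descending
// similarity, and keep the k best matches.
function topK(query: Float32Array, items: StoredVector[], k: number): Match[] {
  return items
    .map((item) => ({ id: item.id, score: cosineSimilarity(query, item.vector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}
```

The cost grows linearly with the number of stored vectors, which is the trade-off an HNSW index would remove for much larger collections.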

Proposed Solution

The fastest route would be to embed, or load at startup, Google's Universal Sentence Encoder and expose it via a Chrome local API.

Examples (Recommended)

Developers would use it to create vector embeddings for records that would be stored in IndexedDB, and again whenever a query needed to be encoded in preparation for a vector search.
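A rough sketch of that flow follows. Everything in it is an assumption made for illustration: the Embedder interface is a hypothetical shape for the proposed API (no such Chrome API exists today), and toyEmbed is a word-hashing stand-in so the sketch runs anywhere; it is not the Universal Sentence Encoder. The records are kept in an array here, but they have the shape one might store in IndexedDB.

```typescript
// Hypothetical shape of a built-in embeddings API (an assumption, not a real API).
interface Embedder {
  embed(text: string): Promise<Float32Array>;
}

// A record as it might be written to an IndexedDB object store,
// e.g. with IDBObjectStore.put(record) in a browser.
interface EmbeddedRecord {
  id: string;
  text: string;
  vector: Float32Array;
}

// Toy stand-in: hashes each word into one of `dims` buckets and
// normalises the result to unit length. NOT a real sentence encoder.
function toyEmbed(text: string, dims: number = 64): Float32Array {
  const vector = new Float32Array(dims);
  for (const word of text.toLowerCase().split(/\W+/).filter(Boolean)) {
    let hash = 0;
    for (let i = 0; i < word.length; i++) {
      hash = (hash * 31 + word.charCodeAt(i)) >>> 0;
    }
    vector[hash % dims] += 1;
  }
  let sumSquares = 0;
  for (let i = 0; i < dims; i++) sumSquares += vector[i] * vector[i];
  const norm = Math.sqrt(sumSquares);
  return norm === 0 ? vector : vector.map((v) => v / norm);
}

const toyEmbedder: Embedder = {
  embed: async (text: string) => toyEmbed(text),
};

// Step 1: create an embedding for each record before storing it.
async function indexRecords(
  embedder: Embedder,
  docs: { id: string; text: string }[],
): Promise<EmbeddedRecord[]> {
  const records: EmbeddedRecord[] = [];
  for (const doc of docs) {
    records.push({ id: doc.id, text: doc.text, vector: await embedder.embed(doc.text) });
  }
  return records;
}

// Dot product; equals cosine similarity when both vectors are unit length.
function dot(a: Float32Array, b: Float32Array): number {
  let sum = 0;
  for (let i = 0; i < a.length; i++) sum += a[i] * b[i];
  return sum;
}

// Step 2: encode the query with the same embedder, then rank stored records.
async function search(
  embedder: Embedder,
  records: EmbeddedRecord[],
  query: string,
  k: number,
): Promise<EmbeddedRecord[]> {
  const queryVector = await embedder.embed(query);
  return records
    .map((record) => ({ record, score: dot(queryVector, record.vector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map((entry) => entry.record);
}
```

The important constraint is that records and queries go through the same embedder, since vectors from different models are not comparable.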

Privacy & Security Considerations

This approach would greatly increase privacy and security for the user. As of now, if you want to implement a RAG/LLM pipeline in the browser, you still have to ship text out to an embedding service, even with the support of a local instance of Gemini Nano. You could encrypt the text and have a back-end server under your control make the embeddings call, but that is less safe and more work than a completely client-side approach.

backkem commented 4 weeks ago

Do you have a reference to "Chrome local API"?

tomayac commented 4 weeks ago

> Do you have a reference to "Chrome local API"?

This is referring to Chrome's built-in AI APIs.