Open · niutech opened this issue 4 months ago
I'm working on a polyfill extension for window.ai. That'd probably be better, since I could either download the model or just use a non-local one in its place.
Do you mean https://github.com/alexanderatallah/window.ai? The MediaPipe solution uses only WebGPU and Wasm, so it doesn't require running an LLM server on localhost, unlike the Window.ai extension.
No, that's a completely different API.
My extension works differently by design. Do you want me to explain more?
@Erisfiregamer1 Yes, please. Will it use the Gemini Nano model from HuggingFace or another local model such as Microsoft Phi-3 Mini?
I'll see about embedding Gemini Nano, but I will add options to upload your own MediaPipe-compatible model later, post-release. For the proof of concept, I've just gone with the Groq API to make sure everything works.
Update: CWS approved the extension and I'm now testing that it works. Currently it just uses the embedded Gemini Nano weights. Thanks, CWS reviewers!
https://chromewebstore.google.com/detail/windowai++/gbbjmkbjbhcaiklmooemafmeamlgoilf
@Erisfiregamer1 Great! Where can I find the source code of the web extension? I'd like to know what the content script, which has access to every website, does.
EDIT: From a quick inspection of the CRX, window.ai++ does not use the LoRA adaptation for Instruct (adaptation_weights.bin). You should include this file for better inference and use it as follows:
// init
let llmInference, loraAdaptation;
LlmInference.createFromOptions(genaiFileset, {
  baseOptions: {modelAssetPath: 'gemininano.bin'},
  loraRanks: [32]
}).then(llm => {
  llmInference = llm;
  return llm.loadLoraModel('adaptation_weights.bin');
}).then(lora => {
  loraAdaptation = lora;
});

// prompt
llmInference.generateResponse(prompt, loraAdaptation, callback);
Next, there is an XSS vulnerability in script.js, which imports any script from the unsanitized localModelTaskGenAIUrl URL param.
Also, it would be great if there were a context menu item which runs LLM inference on selected text (e.g. summarize). I could help you work on it when the source is open.
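Something along these lines could work (a rough sketch only; the menu id, title, and message shape are made up, and it assumes the content script that owns the LlmInference instance listens for the message):

// background.js (service worker); needs the "contextMenus" permission
chrome.runtime.onInstalled.addListener(() => {
  chrome.contextMenus.create({
    id: 'summarize-selection',
    title: 'Summarize with Gemini Nano',
    contexts: ['selection']
  });
});

chrome.contextMenus.onClicked.addListener((info, tab) => {
  if (info.menuItemId !== 'summarize-selection' || !info.selectionText) return;
  // ask the content script in that tab to run the prompt
  chrome.tabs.sendMessage(tab.id, {
    type: 'summarize',
    prompt: 'Summarize the following text:\n\n' + info.selectionText
  });
});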
Good thinking about the LoRA weights. Gemini Nano's weights are already huge as shit, though, so we'll see if I can add it.
As for the URL param, that's because I can't pass the URL it needs to load any other way; this is literally just me being limited by the tools I have. The script immediately deletes itself, and it loads the params from its own script URL, which I doubt anyone can access or change in the single millisecond the script exists. It'll be fine.
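For reference, the flow is roughly this (a simplified sketch, not the actual extension code; it assumes the value was read from chrome.storage beforehand, and script.js must be listed under web_accessible_resources):

// content script: bake the value into the injected script's URL
const s = document.createElement('script');
s.src = chrome.runtime.getURL('script.js') +
        '?localModelTaskGenAIUrl=' + encodeURIComponent(modelUrl);
document.documentElement.appendChild(s);

// script.js: read the param off its own URL, then remove itself
const params = new URL(document.currentScript.src).searchParams;
const modelTaskUrl = params.get('localModelTaskGenAIUrl');
document.currentScript.remove();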
I'll make a GitHub repo for the extension, weights NOT included, soon.
UPDATE: Okay, 119 MB isn't bad, I'll add it to v0.2 of the extension. For the moment, though, it's not being added to the GitHub repo, so Google can't smite me for publishing their weights. :(
Also, I won't be adding a summarize context menu. This was intended to be nothing but a polyfill extension for Chrome's API. Maybe I'll make a separate extension for that.
@Erisfiregamer1 Thanks for replying. As for GitHub, I'm looking forward to seeing it! You don't need to publish the weights, just the source code (put *.bin in .gitignore). As for the XSS, it's easy to exploit the URL param even if the <script> exists for a millisecond: just prepare a website with a script which overrides document.createElement() to inject a malicious localModelTaskGenAIUrl URL param. Why not just use chrome.storage in the content script to store the parameters?
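For example, something like this (a rough sketch of what I mean; the storage key is just the existing param name):

// content script: read the URL from extension storage instead of a query param
chrome.storage.local.get('localModelTaskGenAIUrl', ({ localModelTaskGenAIUrl }) => {
  const s = document.createElement('script');
  s.src = localModelTaskGenAIUrl;  // value comes from extension storage, never from the page
  document.documentElement.appendChild(s);
});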
Great to hear! Also, the document.createElement trick wouldn't work: I create the script inside the content script, which is configured to load before everything on the page.
Also, I do use chrome.storage! I just have to pass it into the injected script.
Gemini Nano weights from Google Chrome are on HuggingFace. You can run inference with this model using MediaPipe LLM Inference, based on WebGPU.
Please add it as a fallback for other browsers, so that they can chat offline with the on-device LLM. The model could be loaded from a local weights.bin file, like in MediaPipe Studio.
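The basic setup would look roughly like this (a sketch following the MediaPipe Tasks GenAI docs; the CDN path and the weights.bin file name are assumptions):

import { FilesetResolver, LlmInference } from '@mediapipe/tasks-genai';

// resolve the Wasm assets, then create the inference task from a local weights file
const genai = await FilesetResolver.forGenAiTasks(
  'https://cdn.jsdelivr.net/npm/@mediapipe/tasks-genai/wasm'
);
const llm = await LlmInference.createFromOptions(genai, {
  baseOptions: { modelAssetPath: 'weights.bin' }  // served next to the page
});
console.log(await llm.generateResponse('Hello'));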