mlc-ai / web-llm

High-performance In-browser LLM Inference Engine
https://webllm.mlc.ai
Apache License 2.0

The chrome extension example does not work #316

Closed vivekrao1985 closed 4 months ago

vivekrao1985 commented 7 months ago

I tried out the chrome-extension example as-is and got errors that the model does not exist. If you click on the URL, it indeed does not exist. Replacing /resolve with /blob gets me past that error, but now I see -

Unexpected token '<', "<!doctype "... is not valid JSON

Not sure how to proceed, any help would be greatly appreciated.

CharlieFRuan commented 7 months ago

The /resolve should be correct; the mismatched model id might have caused the issue. Let me know if this fixes it: https://github.com/mlc-ai/web-llm/pull/317 Thanks!
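For anyone who lands here later: the model id the extension asks for has to match one the installed web-llm package actually knows about. A minimal sketch for checking that, assuming a recent @mlc-ai/web-llm release (exported names and model ids have changed across versions, so treat this as illustrative rather than the exact fix in the PR):

```ts
// Illustrative only: API surface and model ids vary between web-llm versions.
import { prebuiltAppConfig, CreateMLCEngine } from "@mlc-ai/web-llm";

async function loadKnownModel() {
  // Print the model ids the library ships with, so the id configured in the
  // extension can be checked against entries that actually exist.
  console.log(prebuiltAppConfig.model_list.map((m) => m.model_id));

  // Loading an id that is not in this list (or whose artifacts were moved) is
  // what surfaces as the "model does not exist" / invalid-JSON errors above.
  return CreateMLCEngine("Llama-3.1-8B-Instruct-q4f32_1-MLC");
}
```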

vivekrao1985 commented 7 months ago

That did it! Though I get a new error now -

Uncaught Error: Cannot find adapter that matches the request

And a warning before it -

WebGPU on Linux requires GLES compat, or command-line flag --enable-features=Vulkan, or command-line flag --enable-features=SkiaGraphite (and skia_use_dawn = true GN arg)

Do I need Unsafe WebGPU support to be enabled? If so, is this project intended for experimental purposes only, and not meant to be used in a production scenario?

CharlieFRuan commented 7 months ago

Which browser are you using? We've mostly tested on Chrome -- e.g. on the latest version of Chrome, MacBooks do not need any flags to run things.

You could also go to webgpureport.org to see what it says there; normally, if the report looks fine, WebLLM should work fine.
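For reference, roughly the same check can be done from the browser console with the standard WebGPU API (nothing web-llm specific); a minimal sketch, assuming WebGPU type definitions such as @webgpu/types are available for TypeScript:

```ts
// Minimal WebGPU availability probe (standard browser API only).
async function checkWebGPU(): Promise<void> {
  if (!("gpu" in navigator)) {
    console.log("WebGPU is not exposed in this browser/profile.");
    return;
  }
  const adapter = await navigator.gpu.requestAdapter();
  if (!adapter) {
    // This is the situation behind "Cannot find adapter that matches the request".
    console.log("No WebGPU adapter found (check GPU drivers / Chrome flags on Linux).");
    return;
  }
  // Lists roughly the same feature set that webgpureport.org shows.
  console.log("Adapter features:", [...adapter.features]);
}
```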

vivekrao1985 commented 7 months ago

I'm running Ubuntu 22.04 on an EliteBook with a TigerLake GT2. I had to enable a few things -

google-chrome --enable-unsafe-webgpu --enable-dawn-features=allow_unsafe_apis --enable-features=Vulkan,UseSkiaRenderer

Now I see this -

Uncaught Error: This model requires WebGPU extension shader-f16

webgpureport.org does not list shader-f16 as a feature -

features:  
---------  
bgra8unorm-storage  
chromium-experimental-subgroup-uniform-control-flow  
chromium-experimental-subgroups  
chromium-experimental-timestamp-query-inside-passes  
depth-clip-control  
depth32float-stencil8  
float32-filterable  
indirect-first-instance  
rg11b10ufloat-renderable  
texture-compression-astc  
texture-compression-bc  
texture-compression-etc2  
timestamp-query

Chrome version is 122.0.6261.69

CharlieFRuan commented 7 months ago

I see; not sure if using Chrome Canary would make things easier (e.g. not needing the flags).

Regarding shader-f16, try using models that use f32 rather than f16. You can freely change the model in https://github.com/mlc-ai/web-llm/blob/main/examples/chrome-extension/src/popup.ts#L44-L53 to any of the models listed in https://github.com/mlc-ai/web-llm/blob/main/examples/simple-chat/src/gh-config.js
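As a hedged sketch of that workaround (the model ids below are placeholders; pick ids that actually appear in the prebuilt list for your web-llm version), the adapter can be queried for shader-f16 and an f32-quantized build chosen when that feature is missing. The q4f32 builds avoid the shader-f16 requirement at the cost of a larger memory footprint:

```ts
// Pick an f32-quantized model when the adapter lacks the shader-f16 extension.
// Model ids are illustrative; use ids from the prebuilt list of your web-llm version.
async function pickModelId(): Promise<string> {
  const adapter = await navigator.gpu.requestAdapter();
  const hasF16 = adapter?.features.has("shader-f16") ?? false;
  return hasF16
    ? "Llama-3.1-8B-Instruct-q4f16_1-MLC" // q4f16_1 builds require shader-f16
    : "Llama-3.1-8B-Instruct-q4f32_1-MLC"; // q4f32_1 builds run without it
}
```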

gsal commented 4 months ago

A colleague has a web-based app that stopped working recently; IT indicates that WebGPU is an unsafe extension. So I am googling to find out where I can post a question and where that concern comes from, and I ran into this thread and the --enable-unsafe-webgpu flag... I guess it's rather telling that the option itself includes the word unsafe. What is that about?

IT says that the use of WebGPU may be OK in a standalone/desktop situation; but we are provided access to Linux (Rocky 9, Chrome 124) via NICE DCV, and at any given time up to a dozen users could be logged into the same machine.

Thoughts?