ggaabe opened this issue 1 month ago
Thank you for reporting this issue. I will try to figure out how to fix this problem.
So it turns out that dynamic import (i.e. `import()`) and top-level `await` are not supported in the current service worker. I was not expecting that `import()` is banned in SW.
Currently, the WebAssembly factory (`wasm-factory.ts`) uses a dynamic `import()` with `await` to load the JS glue. This does not work in a service worker. A few potential workarounds are also unavailable:

- `importScripts`: won't work, because the JS glue is ESM
- `eval`: won't work; same reason as `importScripts`

I am now trying to make a JS bundle that does not use dynamic import, specifically for service worker usage. Still working on it.
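To illustrate the limitation (a minimal sketch; the file names here are made up, not the actual ORT artifacts):

```js
// background.js, registered as a module service worker via
// { "background": { "service_worker": "background.js", "type": "module" } }

import * as ort from "./ort.webgpu.min.mjs"; // OK: static ESM imports are allowed

self.addEventListener("message", async () => {
  // Throws in Chrome: import() is disallowed on ServiceWorkerGlobalScope
  // by the HTML specification.
  const glue = await import("./ort-wasm.mjs");
});
```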
Thanks, I appreciate your efforts around this. It does seem like some special-case bundle will need to be built after all; you might need `iife` or `umd` for the bundler output format.
I have considered this option. However, Emscripten does not offer an option to output both UMD (IIFE+CJS) and ESM for the JS glue (https://github.com/emscripten-core/emscripten/issues/21899); I have to choose one or the other. I chose the ES6 format output for the JS glue because of a couple of problems with importing UMD from ESM, and because `import()` is a standard way to import ESM from both ESM and UMD (until this issue showed me it does not work in a service worker).
I found a way to make ORT Web work. Yes, this needs the build script to do some special handling, and it will only work for ESM, because the JS glue is ESM and there seems to be no way to import ESM from UMD in a service worker.
@ggaabe Could you please try `import * as ort from "./ort.webgpu.bundle.min.js"` with version 1.19.0-dev.20240604-3dd6fcc089?
@fs-eire my project depends on transformers.js, which imports the onnxruntime-web WebGPU backend like this:
https://github.com/xenova/transformers.js/blob/v3/src/backends/onnx.js#L24
Is this the right usage? In my project I've added this to my package.json to resolve onnxruntime-web to this new version, though the issue is still occurring:
"overrides": {
"onnxruntime-web": "1.19.0-dev.20240604-3dd6fcc089"
}
Maybe also important: the same error is still occurring in the same spot in the inference session in the onnx package, not from transformers.js. Do I need to add a resolver for onnxruntime-common as well?
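(If a matching dev build of `onnxruntime-common` exists on npm, pinning both packages in the same `overrides` block would look like the sketch below; the `onnxruntime-common` entry and its version string are my assumption, not something confirmed in this thread:)

```json
{
  "overrides": {
    "onnxruntime-web": "1.19.0-dev.20240604-3dd6fcc089",
    "onnxruntime-common": "1.19.0-dev.20240604-3dd6fcc089"
  }
}
```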
Hi @fs-eire, is the newly-merged fix in a released build I can try?
Please try 1.19.0-dev.20240612-94aa21c3dd
@fs-eire EDIT: Never mind the comment I just deleted; that error was because I didn't set the webpack `target` to `webworker`.
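For reference, the relevant webpack setting is just the `target` field; a minimal sketch, with the entry/output paths assumed:

```js
// webpack.config.js (sketch)
module.exports = {
  entry: "./src/background.js", // assumed entry point
  target: "webworker", // emit code that runs in a worker/service-worker context
  output: { filename: "background.js" },
};
```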
However, I'm getting a new error now (progress!):
Error: no available backend found. ERR: [webgpu] RuntimeError: null function or function signature mismatch
Update: Found the error is happening in here: https://github.com/microsoft/onnxruntime/blob/fff68c3151b774d8a2e9290e96b9f707cd950216/js/common/lib/backend-impl.ts#L83-L86
For some reason the WebGPU `backend.init` promise is rejecting with the `null function or function signature mismatch` error. This is much further along than we were before, though.
Could you share the steps to reproduce?
@fs-eire You'll need to run the WebGPU setup in a Chrome extension. You can use the code I just published here: https://github.com/ggaabe/extension

1. Run `npm install`
2. Run `npm run build`
3. Open Chrome's extension manager and choose "Load unpacked"
4. Select the `build` folder from the repo
5. Open the "AI WebGPU Extension" extension
6. Type some text in the text input; it will load Phi-3 mini, and after it finishes loading, this error will occur

If you view the extension in the extension manager and select the "Inspect views: service worker" link before opening the extension, it will bring up an inspection window to view the errors as they occur. A little "Errors" bubble link also shows up there after they occur.

You will need to click the "Refresh" button on the extension in the extension manager to rerun the error, because it does not attempt to reload the model after the first attempt until another refresh.
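(For context, the ESM-only glue also means the extension's service worker has to be declared as a module in `manifest.json`; a minimal sketch, with the field values assumed rather than copied from the repo:)

```json
{
  "manifest_version": 3,
  "name": "AI WebGPU Extension",
  "version": "1.0",
  "background": {
    "service_worker": "background.js",
    "type": "module"
  }
}
```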
@ggaabe I did some debugging on my box and made a few fixes.

Changes to ONNX Runtime Web:

- Fix loading in a service worker when `env.wasm.wasmPaths` is not specified.

Changes to https://github.com/ggaabe/extension:

- The changes in https://github.com/ggaabe/extension/pull/1 need to be made to the extension example to make it load the model correctly. Please note there is still an error from `tokenizer.apply_chat_template()`; however, the WebAssembly is initialized and the model loads successfully.

Other issues:

- Transformers.js sets `env.wasm.wasmPaths` to a CDN URL internally. At least for this example, we don't want this behavior, so we need to reset it to `undefined` to keep the default behavior (see the sketch below).
- `Worker` is not accessible in a service worker. Issue tracking: https://github.com/whatwg/html/issues/8362
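A sketch of the reset described in the "Other issues" list above, assuming the transformers.js v3 package used by the example (the import specifier may differ in your setup):

```js
import { env } from "@xenova/transformers";

// Transformers.js sets ORT's wasm paths to a CDN URL internally; resetting
// the value makes onnxruntime-web fall back to resolving the .wasm/.mjs
// files relative to the bundled script, which is what we want here.
env.backends.onnx.wasm.wasmPaths = undefined;
```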
Awesome, thank you for your thoroughness in explaining this and tackling it head-on. Is there a dev channel version I can test out?
Not yet. Will update here once it is ready.
Sorry to bug you; is there a dev build number? I wasn't sure how often releases run.
Please try 1.19.0-dev.20240621-69d522f4e9
@fs-eire I'm getting one new error:
ort.webgpu.bundle.min.mjs:6 Uncaught (in promise) Error: The data is not on CPU. Use `getData()` to download GPU data to CPU, or use `texture` or `gpuBuffer` property to access the GPU data directly.
at get data (ort.webgpu.bundle.min.mjs:6:13062)
at get data (tensor.js:62:1)
I pushed the code changes to my repo and fixed the call to the tokenizer. To reproduce, just type one letter in the Chrome extension's text input and wait.
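(The error message itself describes the workaround: tensor contents that live on the GPU must be downloaded explicitly instead of being read through `.data`. A sketch, with a hypothetical output tensor name:)

```js
// outputTensor is an ort.Tensor produced by a WebGPU session.
// Reading outputTensor.data throws while the data is still on the GPU;
// getData() downloads it to the CPU first.
const cpuData = await outputTensor.getData();
console.log(cpuData);
```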
Hey, I also need this. I am struggling with importing this version. So far I have been importing ONNX using `import * as ort from "https://cdn.jsdelivr.net/npm/onnxruntime-web/dist/esm/ort.webgpu.min.js"`. However, when I change to `import * as ort from "https://cdn.jsdelivr.net/npm/onnxruntime-web@1.19.0-dev.20240621-69d522f4e9/dist/esm/ort.webgpu.min.js"`, there seems to be no `.../esm/` folder. Do you know why that is, and how to import it then?
Just replacing `.../esm/ort.webgpu.min.js` with `.../ort.webgpu.min.mjs` should work. If you are also using a service worker, use `ort.webgpu.bundle.min.mjs` instead of `ort.webgpu.min.mjs`.
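Putting the two together, the service-worker-safe CDN import should look like this (version string as given above):

```js
// In a module service worker:
import * as ort from "https://cdn.jsdelivr.net/npm/onnxruntime-web@1.19.0-dev.20240621-69d522f4e9/dist/ort.webgpu.bundle.min.mjs";
```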
This may be a problem in transformers.js. Could you check whether it also happens in a normal page? If so, please report the issue to transformers.js. If it only happens in a service worker, I can take a closer look.
Describe the issue
I'm running into issues trying to use the WebGPU or WASM backends inside a service worker (in a Chrome extension). More specifically, I'm attempting to use Phi-3 with transformers.js v3.
Every time I attempt this, I get the following error:
This is originating in the `InferenceSession` class in `js/common/lib/inference-session-impl.ts`. More specifically, it's happening in this method:

`const [backend, optionsWithValidatedEPs] = await resolveBackendAndExecutionProviders(options);`

where the implementation is in `js/common/lib/backend-impl.ts` and `tryResolveAndInitializeBackend` fails to initialize any of the execution providers.

WebGPU is now supported in service workers, though; it is a recent change and it should be feasible. Here were the Chrome release notes.
Additionally, here is an example browser extension from the mlc-ai/web-llm framework that implements WebGPU usage in service workers successfully: https://github.com/mlc-ai/web-llm/tree/main/examples/chrome-extension-webgpu-service-worker
Here is some further discussion on this new support from Google itself: https://groups.google.com/a/chromium.org/g/chromium-extensions/c/ZEcSLsjCw84/m/WkQa5LAHAQAJ
So technically I think it should be possible for this to be supported now? Unless I'm doing something else glaringly wrong. Is it possible to add support for this?
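(As a quick sanity check of that claim, this can be run in the extension's service worker console; a sketch:)

```js
// Confirms that WebGPU is exposed in the service worker context.
const adapter = await navigator.gpu?.requestAdapter();
console.log(adapter ? "WebGPU adapter available" : "WebGPU not available");
```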
To reproduce
Download and set up the transformers.js extension example and put this into the background.js file:
Urgency
This would help enable a new ecosystem to build up around locally intelligent browser extensions and tooling.
It's urgent for me because it would be fun to build, and I want to build it, and it would be more fun to be building it than not to be building it.
ONNX Runtime Installation
Built from Source
ONNX Runtime Version or Commit ID
1.19.0-dev.20240509-69cfcba38a
Execution Provider
'webgpu' (WebGPU)