Open bramus opened 15 hours ago
I tried some of the web-ai-demos on https://chrome.dev/, such as https://chrome.dev/web-ai-demos/perf-client-side-gemma-worker/.

Some demos say that the model will take about 30 seconds or 1 minute to load. For me it took far longer: it turned out the demo was downloading a model of more than 1 GB, which eventually took 15 minutes to complete.

From the looks of it, the model doesn't get cached on disk properly, so people end up downloading the model over and over again.

Please add a warning message, as per the guidelines at https://web.dev/articles/client-side-ai-performance#signal_large_downloads.
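Something along these lines would do it: a minimal sketch, assuming the demo fetches the model itself. `modelUrl`, `statusEl`, and the function name are placeholders, not names from the demo code:

```js
// Sketch: stream the model download and surface its size and progress,
// per the web.dev guidance on signaling large downloads.
async function fetchModelWithProgress(url, onProgress) {
  const response = await fetch(url);
  if (!response.ok) throw new Error(`Download failed: HTTP ${response.status}`);
  // Content-Length may be missing (e.g. chunked encoding); 0 means "unknown".
  const total = Number(response.headers.get('Content-Length') ?? 0);
  const reader = response.body.getReader();
  const chunks = [];
  let received = 0;
  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    chunks.push(value);
    received += value.byteLength;
    onProgress(received, total);
  }
  return new Blob(chunks);
}

// Warn up front that a large download is coming, then report progress.
const modelBlob = await fetchModelWithProgress(modelUrl, (received, total) => {
  statusEl.textContent = total
    ? `Downloading model (${(total / 1e9).toFixed(1)} GB)… ${Math.round(received / total * 100)}%`
    : `Downloading model… ${Math.round(received / 1e6)} MB so far`;
});
```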
Agree that a warning would make sense, too.

The proper way to fix this would be to add an option upstream to MediaPipe so it optionally caches the model. This would have to happen here:

A way to fix this just for this project is to fetch the model out-of-band, store it in the Cache API, and then pass a blob URL to LlmInference.createFromOptions(), as I do in this project:
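Not the linked project's actual code, but a minimal sketch of that approach. The cache name, `getCachedModelUrl`, and `modelUrl` are placeholders, and the WASM path is the usual jsDelivr one:

```js
import { FilesetResolver, LlmInference } from '@mediapipe/tasks-genai';

// Sketch: download the model once, keep the bytes in the Cache API, and
// hand MediaPipe a blob URL instead of the remote URL.
async function getCachedModelUrl(modelUrl) {
  const cache = await caches.open('llm-model-cache');
  let response = await cache.match(modelUrl);
  if (!response) {
    response = await fetch(modelUrl);
    if (!response.ok) throw new Error(`Download failed: HTTP ${response.status}`);
    // clone() because a Response body can only be read once.
    await cache.put(modelUrl, response.clone());
  }
  return URL.createObjectURL(await response.blob());
}

const genai = await FilesetResolver.forGenAiTasks(
  'https://cdn.jsdelivr.net/npm/@mediapipe/tasks-genai/wasm'
);
const llmInference = await LlmInference.createFromOptions(genai, {
  baseOptions: { modelAssetPath: await getCachedModelUrl(modelUrl) },
});
```

The Cache API entry persists across sessions (subject to the browser's storage eviction rules), so the 1 GB+ download only happens on the first visit; the blob URL is rebuilt from the cached bytes on later loads.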