mlc-ai / web-llm

High-performance In-browser LLM Inference Engine
https://webllm.mlc.ai
Apache License 2.0

Avoid reloading shards in different tabs #72

Closed: loretoparisi closed this issue 4 months ago

loretoparisi commented 1 year ago

When WebLLM Chat is loaded in two different tabs (same URL), the System Initialize step restarts in each tab and reloads the shards into memory:

[System Initalize] Fetching param cache[163/163]: 4020MB fetched. 100% completed, 10 secs elapsed. It can take a while when we first visit this page to populate the cache. Later refreshes will become faster.

Shards are loaded from the browser application cache, but they still need to be reloaded each time a tab is opened. Is there any way to prevent this double loading, given that each tab is on the same domain? I'm not sure whether using Web Workers for shard loading (so that TVMJS communicates with the chat module via postMessage and onmessage) could be an alternative option and a solution.
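To illustrate the idea of loading only once: if all tabs sent their requests to one shared place (for example a worker), that place could keep a "single-flight" cache so that concurrent requests for the same model share one load. Below is a minimal TypeScript sketch under that assumption. `SingleFlight`, `loadShards` and the model id are hypothetical names for illustration, not part of WebLLM:

```typescript
// Hypothetical sketch: concurrent requests for the same key share one load.
type Loader<T> = (key: string) => Promise<T>;

class SingleFlight<T> {
  private readonly entries = new Map<string, Promise<T>>();
  private readonly loader: Loader<T>;

  constructor(loader: Loader<T>) {
    this.loader = loader;
  }

  get(key: string): Promise<T> {
    const existing = this.entries.get(key);
    if (existing !== undefined) return existing; // reuse in-flight or finished load

    const started = this.loader(key);
    this.entries.set(key, started);
    // If the load fails, forget it so that a later request can retry.
    started.catch(() => {
      if (this.entries.get(key) === started) this.entries.delete(key);
    });
    return started;
  }
}

// Stand-in for the expensive shard load (an assumption, for illustration only).
let loadCount = 0;
const loadShards: Loader<string> = async (modelId) => {
  loadCount += 1;
  return `params for ${modelId}`;
};

const cache = new SingleFlight<string>(loadShards);
const fromTabA = cache.get("some-model"); // first request starts the load
const fromTabB = cache.get("some-model"); // second request reuses the same promise
```

Both calls return the same promise and `loadShards` runs once. This only helps if the cache itself lives somewhere that every tab can reach, which is the open question in this issue.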

tqchen commented 1 year ago

Web workers are indeed useful. cc @DustinBrett: I remember you also mentioned web workers. I would love to see if we can collectively build a solution here.

loretoparisi commented 1 year ago

> > Web workers are indeed useful. cc @DustinBrett: I remember you also mentioned web workers. I would love to see if we can collectively build a solution here.
>
> That would indeed be a cool idea. Currently I am not using cross-tab/worker communication via a SharedWorker; my current setup just creates a new worker each time it's used. I've made some modifications to use globalThis in parts of the code. Here is what I have now:

Thank you! In fact, according to the SharedWorker docs:

If SharedWorker can be accessed from several browsing contexts, all those browsing contexts must share the exact same origin (same protocol, host and port).

so it should be possible to enable communication between pages/tabs.
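To make the same-origin condition concrete, here is a small TypeScript sketch using the standard `URL` API. The helper name `sameOrigin` and the variations on the project URL are made up for illustration:

```typescript
// Two pages can reach the same SharedWorker only if protocol, host and port all match.
function sameOrigin(a: string, b: string): boolean {
  return new URL(a).origin === new URL(b).origin;
}

const tabA = "https://webllm.mlc.ai/";
const tabB = "https://webllm.mlc.ai/#chat"; // same origin, different fragment
const otherPort = "https://webllm.mlc.ai:8443/"; // different port
const otherScheme = "http://webllm.mlc.ai/"; // different protocol

const canShare = sameOrigin(tabA, tabB);
const portMismatch = sameOrigin(tabA, otherPort);
const schemeMismatch = sameOrigin(tabA, otherScheme);
```

Two tabs on the same URL, as in the original report, trivially satisfy this condition.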

An interesting point concerns the global scope of the worker instance:

The shared worker will be alive as long as its global scope's owner set (a set of Document and WorkerGlobalScope objects) is not empty (for example, if there is any live page holding a reference to it, maybe through new SharedWorker()).

and the worker's lifetime here.
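As a way to reason about that lifetime rule, here is a toy TypeScript model of an owner set. The browser tracks this itself for a real SharedWorker; the class `OwnerSet`, the tab ids and the `onEmpty` callback are hypothetical and only model the rule quoted above:

```typescript
// Toy model of the "owner set" rule: the worker lives while the set is non-empty.
class OwnerSet {
  private readonly owners = new Set<string>();
  private readonly onEmpty: () => void;

  constructor(onEmpty: () => void) {
    this.onEmpty = onEmpty;
  }

  add(ownerId: string): void {
    this.owners.add(ownerId);
  }

  // Removes an owner and returns true while the worker would still be alive.
  remove(ownerId: string): boolean {
    const removed = this.owners.delete(ownerId);
    if (removed && this.owners.size === 0) this.onEmpty();
    return this.owners.size > 0;
  }
}

const state = { released: false };
// The callback stands in for "the worker goes away and loaded shards are freed".
const owners = new OwnerSet(() => {
  state.released = true;
});
owners.add("tab-1");
owners.add("tab-2");
const aliveAfterFirstClose = owners.remove("tab-1"); // one tab still holds the worker
const aliveAfterSecondClose = owners.remove("tab-2"); // last owner gone
```

Under this rule the loaded shards would survive closing one tab, but not closing the last one.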

A starting point could be this simple example.

DustinBrett commented 1 year ago

> Web workers are indeed useful. cc @DustinBrett: I remember you also mentioned web workers. I would love to see if we can collectively build a solution here.

That would indeed be a cool idea. Currently I am not using cross-tab/worker communication via a SharedWorker; my current setup just creates a new worker each time it's used. I've made some modifications to use globalThis in parts of the code. Here is what I have now:

NOTE: Re-adding comment as I commented with my wrong account by mistake.

tqchen commented 1 year ago

Web worker support has been added in main. I assume that we can adapt this to SharedWorker.
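A rough sketch of what such an adaptation could involve, in TypeScript: in a SharedWorker each tab arrives through a `connect` event carrying its own port, so a request handler has to be attached per port. The `Handler` type, the structural `PortLike`/`SharedScopeLike` interfaces and the echo handler are assumptions for illustration; this is not the WebLLM worker code:

```typescript
// Hypothetical sketch: attach one request handler to every tab that connects.
type Reply = (message: unknown) => void;
type Handler = (request: unknown, reply: Reply) => void;

// Minimal structural types so the sketch does not depend on DOM typings.
interface PortLike {
  postMessage(message: unknown): void;
  onmessage: ((event: { data: unknown }) => void) | null;
}
interface SharedScopeLike {
  onconnect: ((event: { ports: ReadonlyArray<PortLike> }) => void) | null;
}

function attachToSharedWorker(scope: SharedScopeLike, handler: Handler): void {
  scope.onconnect = (event) => {
    const port = event.ports[0];
    if (port === undefined) return;
    // Replies go back only to the tab that sent the request.
    port.onmessage = (e) => handler(e.data, (m) => port.postMessage(m));
  };
}

// Fake scope and ports standing in for the browser objects (illustration only).
function fakePort(outbox: unknown[]): PortLike {
  return {
    onmessage: null,
    postMessage: (m) => {
      outbox.push(m);
    },
  };
}

const scope: SharedScopeLike = { onconnect: null };
const echoHandler: Handler = (request, reply) => reply({ echo: request });
attachToSharedWorker(scope, echoHandler);

const outboxA: unknown[] = [];
const outboxB: unknown[] = [];
const portA = fakePort(outboxA);
const portB = fakePort(outboxB);
scope.onconnect?.({ ports: [portA] }); // tab A connects
scope.onconnect?.({ ports: [portB] }); // tab B connects
portA.onmessage?.({ data: "hello from tab A" }); // only tab A gets a reply
```

Inside a real SharedWorker script the scope would be the worker's global object rather than a fake, and the handler would wrap whatever state (such as loaded parameters) is meant to be shared between tabs.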

tqchen commented 4 months ago

ServiceWorker is now supported