Loading LLM inside Electron window is very slow at the Compiling GPU Shader on Windows

Hey loving the project, really cool stuff.

Ran into an issue while trying to wrap Electron around web-llm. After the model params are loaded, it gets stuck for several minutes (6-10) at:

Loading GPU shader modules[73/74]: 98% completed, 3 secs elapsed.

I'm not seeing any error messages, and the LLM does eventually load, but it's stuck there for a while, even with a smaller model (Llama 3.2 1B). We're seeing this both with our own locally served application (not using a service worker) and when just pointing Electron at chat.webllm.ai (which, judging from the console logs, does use a service worker).
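In case it helps anyone reproduce the stall, here is a minimal sketch of how the wall-clock gap between progress reports could be measured, since the "secs elapsed" value in the progress text does not reflect the minutes we actually wait. The helper name `makeTimedProgressLogger` and its parameters are hypothetical; the only web-llm detail assumed is that progress reports carry a `text` field, as passed to `initProgressCallback`.

```javascript
// Hypothetical helper: wraps a progress callback and records how much
// wall-clock time passed between consecutive progress reports.
// `now` and `log` are injectable so the helper can be exercised without a browser.
function makeTimedProgressLogger(now = () => Date.now(), log = console.log) {
  const gaps = [];
  let last = null;

  const callback = (report) => {
    const t = now();
    // The first report has no predecessor, so its gap is recorded as 0.
    const gapMs = last === null ? 0 : t - last;
    last = t;
    gaps.push({ text: report.text, gapMs });
    log(`[+${gapMs} ms] ${report.text}`);
  };

  return { callback, gaps };
}
```

With web-llm this would be passed along the lines of `CreateMLCEngine(modelId, { initProgressCallback: logger.callback })`. A long stall shows up as a large gap on the report that arrives right after it, which should make it easy to compare Electron against Chrome on the same machine.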

We've verified that the high-performance GPU (RTX 5000) is being used by Electron, both by checking navigator.gpu.requestAdapter and in Task Manager. Both our own locally served application and chat.webllm.ai also work perfectly in standard browsers: Edge, Chrome, Brave, and Chromium 132 all load extremely fast. So we don't think this is an OS/driver/hardware issue; more likely it's something in Electron. I'm asking here in the hope that someone can point me in the right direction for digging deeper into why this is happening.
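For reference, the adapter check was roughly along these lines. This is a sketch rather than our exact code: the function name `describeAdapter` and the shape of the returned object are made up for illustration, and it assumes `adapter.info` is available (it is in recent Chromium builds; older ones may not expose it, hence the fallbacks).

```javascript
// Hypothetical helper: summarises the WebGPU adapter the page actually gets.
// `gpu` is expected to be `navigator.gpu`; it is a parameter so the function
// can also be called with a stub outside a browser.
async function describeAdapter(gpu, options = { powerPreference: "high-performance" }) {
  if (!gpu) return { available: false };

  const adapter = await gpu.requestAdapter(options);
  if (!adapter) return { available: false };

  // `adapter.info` may be missing on older builds, so fall back to empty values.
  const info = adapter.info ?? {};

  return {
    available: true,
    vendor: info.vendor ?? "",
    architecture: info.architecture ?? "",
    description: info.description ?? "",
    // Relevant to the shader-f16 note further down in this issue.
    hasShaderF16: adapter.features.has("shader-f16"),
    features: [...adapter.features].sort(),
  };
}
```

Running `describeAdapter(navigator.gpu).then(console.log)` in the DevTools console of both the Electron window and Chrome gives two objects that can be compared side by side.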

This only happens on Windows; macOS works completely fine.

System information

Windows 11
RTX 5000
Drivers: NVIDIA 550 and 566, both showing same issue

Electron dependency versions:
chrome-version: 130.0.6723.59
node-version: 20.18.0
electron-version: 33.0.2

Repro gist -

https://gist.github.com/StevenHanbyWilliams/b8bd2f41fcaef13b9f61db5be3a9e65d

WebGPUReport.org output

Note: Weirdly, Electron's WebGPU doesn't support shader-f16 on Windows, as detailed in https://github.com/electron/electron/issues/43567, but it is supported in Chromium/Edge/Brave/Chrome.

Screenshot 2024-10-30 175739

mlc-ai / web-llm

Loading LLM inside Electron window is very slow at the Compiling GPU Shader on Windows #621