Closed: fdwr closed this issue 6 months ago
This would be amazing! 🥳 Something desperately needed (and requested) for Transformers.js.
Feel free to test with any of the models I've already converted and put on the hub: https://huggingface.co/models?library=transformers.js&sort=trending. I have quite a few models with the external format; here are some popular ones which could help with development:
And of course the Llama-2-Onnx repo which I'm sure you're already aware of 😉
The kinks are being ironed out for WASM Memory64 in Emscripten and Chromium to support 4GB+ memory space
Hehe, you're welcome 😎 Chrome canary 118.0.5951.0 already ships with my fixes for 64bit memory and wasm threads
I've made a PR to load weights, but the overall implementation needs more thought, and it needs another fix in Emscripten: WASM MEMFS doesn't support files larger than 2 GB because it stores them in an ArrayBuffer (which has a 2 GB limit). My hack of substituting the internal file contents with a WebAssembly.Memory instance also doesn't work in release builds, because those fields and methods are obfuscated: https://github.com/microsoft/onnxruntime/pull/17155
However, it's of limited use until 64-bit build support is merged, since wasm will crash with out-of-memory errors on big models. I'll iron that out in the next few weeks.
fixed since ort-1.17.
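For anyone landing here later, a minimal sketch of what loading a model with a separate weights file might look like. It assumes the `externalData` session option that, as far as I understand, onnxruntime-web exposes from 1.17 onward; check the current API docs before relying on it. The local interfaces, the `buildSessionOptions` helper, the file names and the example.com URLs are all hypothetical placeholders, and the real `ort` call is left in a comment so the snippet runs without the package installed.

```typescript
// Hypothetical local types mirroring the option shape assumed for onnxruntime-web >= 1.17.
interface ExternalDataFile {
  path: string; // name the .onnx file uses to refer to its weights file
  data: string; // URL the weight bytes are fetched from
}

interface SessionOptionsSketch {
  executionProviders: string[];
  externalData: ExternalDataFile[];
}

// Hypothetical helper: pairs the weights file name with the place to fetch it from.
function buildSessionOptions(weightsName: string, weightsUrl: string): SessionOptionsSketch {
  return {
    executionProviders: ["webgpu"],
    externalData: [{ path: weightsName, data: weightsUrl }],
  };
}

// Placeholder file name and URL, not a real model.
const options = buildSessionOptions(
  "weights.onnxdata",
  "https://example.com/weights.onnxdata",
);

// With the real package installed, the options would be passed along these lines:
//   import * as ort from "onnxruntime-web";
//   const session = await ort.InferenceSession.create("https://example.com/model.onnx", options);
console.log(JSON.stringify(options));
```

The `path` entry has to match the location recorded inside the .onnx file, otherwise the runtime cannot pair the tensors with their data.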
Describe the issue
ONNX supports storing weights in external data files so that models can exceed the 2GB ProtoBuf limit (e.g. model.onnx + weights.onnxdata), but ONNX Runtime Web (ORTW) doesn't support this, which limits it to smaller models (e.g. no SDXL or Dolly support). The kinks are being ironed out for WASM Memory64 in Emscripten and Chromium to support 4GB+ memory space, but without external data support, ORT won't be able to load these models. Note the .ort format has the same 2GB issue due to the FlatBuffers size limit.
To reproduce
Try to load any 2GB+ model, such as SDXL or even Stable Diffusion 1.5 in float32 with embedded weights (SD in float16 only just fits at 1.7GB), using ONNX Runtime for the Web.
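A rough back-of-the-envelope check of the sizes above, as a sketch only. It assumes the file size is dominated by weights, so going from float16 to float32 roughly doubles it, and it takes the ProtoBuf limit as 2 GiB. The function and constant names are made up for illustration.

```typescript
// Assumed ProtoBuf single-message limit: 2 GiB.
const PROTOBUF_LIMIT_BYTES = 2 * 1024 * 1024 * 1024;

// Approximation: weights dominate, and float32 uses twice the bytes of float16.
function float32SizeFromFloat16(float16Bytes: number): number {
  return float16Bytes * 2;
}

function exceedsProtobufLimit(sizeBytes: number): boolean {
  return sizeBytes >= PROTOBUF_LIMIT_BYTES;
}

// 1.7GB is the float16 Stable Diffusion 1.5 size quoted above.
const sd15Float16Bytes = 1.7e9;
const sd15Float32Bytes = float32SizeFromFloat16(sd15Float16Bytes);

console.log(exceedsProtobufLimit(sd15Float16Bytes)); // false: fits in a single .onnx file
console.log(exceedsProtobufLimit(sd15Float32Bytes)); // true: needs external data
```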
Urgency
Sometime after November.
ONNX Runtime Installation
Released Package
ONNX Runtime Version or Commit ID
1.15
Execution Provider
'webgpu' (WebGPU) + webnn