Open flatsiedatsie opened 2 months ago
- A question: the demo code has a lot of low-level stuff in it. To what degree is there an actual pipeline for this? Is there a simpler implementation than the Intel one?
It was discussed here, and it was eventually decided that diffusers should be its own repo: https://github.com/xenova/transformers.js/pull/121
Basically @dakenf invested heaps of time to fix all kinds of issues in Chrome and ONNX Runtime itself etc. and finally got it to work. His project is here: https://github.com/dakenf/diffusers.js/
I was and still am super fascinated by @dakenf's project, but I had TypeScript build issues and just converted it all to JavaScript for quicker development, without being strained by TS can't-build-this-self-pity-errors: https://github.com/kungfooman/StableDiffusion.js/
So you don't need any build step, just install the NPM deps and run example/react/app.html just as it is (powered by ESM and import maps).
However, it still doesn't "just work" - my biggest problem currently is Linux + NVIDIA, since NVIDIA just can't get their **** together to have non-crashing FP16 WebGPU support. OTOH it works nicely on e.g. Linux + Surface Pro 9 or on Windows. So your hardware/driver combination may or may not work.
Wow, you did a lot of work to make this possible. Kudos!
So are you saying the Intel code is based on your project? (they refer to a repo by Guschmue). Or is your code newer/better?
> Linux + NVIDIA
Oh dear god.
Is falling back to FP32 on Linux a viable workaround?
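To make the FP32 fallback idea concrete, here is a minimal sketch of how a page could decide which precision to load. It assumes the WebGPU adapter's feature set is the signal to check (`"shader-f16"` is the WebGPU feature name for FP16 shader support). `choosePrecision`, `FeatureSet` and the `forceFp32` override are hypothetical names for illustration and not part of Transformers.js or diffusers.js. The manual override is there because, as described above, a driver can advertise FP16 and still crash.

```typescript
// Minimal structural type so the helper accepts GPUAdapter.features in a
// browser (a set-like object) or a plain Set<string> elsewhere.
interface FeatureSet {
  has(name: string): boolean;
}

type Precision = "fp16" | "fp32";

// Hypothetical helper: pick the weight precision for a WebGPU session.
// `forceFp32` is an assumed manual override for setups where FP16 is
// advertised by the driver but crashes in practice.
function choosePrecision(features: FeatureSet, forceFp32: boolean = false): Precision {
  if (forceFp32) return "fp32";
  // "shader-f16" is the WebGPU feature that enables 16-bit floats in shaders.
  return features.has("shader-f16") ? "fp16" : "fp32";
}
```

In a browser, the feature set would come from `(await navigator.gpu.requestAdapter())?.features`. Keep in mind that FP32 weights take roughly twice the download size and memory of FP16 weights, so whether the fallback is viable depends on the model and the device.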
Out of curiosity: do you think it might be possible to run Stable Diffusion V3 in the browser?
As said, @dakenf did all the hard work, I just ported a bit :sweat_smile:
Newer/better is always relative. If you want, you could run performance benchmarks, for example (I never tried any other SD web repo besides forks of diffusers.js).
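A benchmark along those lines could be as small as the sketch below. `generate` is a hypothetical stand-in for whichever implementation is being measured, and the helper names are made up for illustration. It only measures wall-clock time per image, so results are only comparable on the same machine, browser, model, step count and resolution.

```typescript
// Times `runs` sequential calls of an async image generator and returns the
// mean wall-clock seconds per image. `generate` is a hypothetical stand-in
// for the Stable Diffusion implementation under test.
async function meanSecondsPerImage(
  generate: () => Promise<unknown>,
  runs: number,
  warmup: number = 1,
): Promise<number> {
  if (runs <= 0) throw new RangeError("runs must be positive");
  // Warm-up runs absorb one-time costs such as model loading and shader compilation.
  for (let i = 0; i < warmup; i++) await generate();
  const start = Date.now();
  for (let i = 0; i < runs; i++) await generate();
  return (Date.now() - start) / 1000 / runs;
}

// Ratio of two per-image timings; values above 1 mean the candidate is faster.
function speedup(baselineSeconds: number, candidateSeconds: number): number {
  return baselineSeconds / candidateSeconds;
}
```

Running `meanSecondsPerImage` once per implementation and passing both results to `speedup` gives a single number to compare.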
I didn't see an SD3 repo yet, but SDXL seems to work here: https://github.com/gfodor/diffusers.js
I've just finished integrating this into my project. It made me once again appreciate how easy and consistent Transformers.js is to work with.
I'd still vote for creating a diffusion pipeline in Transformers.js too. Image generation is such a popular use case, and being able to also add that to a project with the same framework would - in my opinion - be valuable.
Compared to the WebLLM Stable Diffusion project that I had been using, the speed difference is night and day: from 20 seconds per image to 4 seconds per image!
@kungfooman I looked into that repo and tried to run it. However, it seems to require running the browser with special experimental flags so that 64-bit Wasm can be used. While cool, that makes it less of a viable option.
Diffusion pipelines are of course very interesting, but also something you have to keep up with constantly... SDXL is old, now you want SD3. Tomorrow everyone will obviously be asking for Flux, and then inpainting, img2img and still images won't be cool enough anymore, so we'll need Flux-based video generation :sweat_smile:
I wish we had all the cool stuff in the browser, but currently there aren't enough people with enough time and actual porting skills it seems.
The point of this issue is that the code already exists. I have it running right here. And there was a PR for adding this to Transformers.js already, except that it wasn't committed.
That spin-off/fork doesn't seem to be a useful pathway (yet), since it requires a specially configured web browser. So I'm suggesting that the non-special implementation could still be integrated regardless. It seems like low-hanging fruit (but what do I know).
> SDXL is old
But it would still be useful. It would be the first model of its (popular) type that Transformers.js supports.
Model description
Funnily enough this isn't really a request for a new model.
Prerequisites
Additional information
I discovered that Transformers.js can already run Stable Diffusion. It's in Intel's ONNX showcase.
I placed a quick demo based on their sample code here: https://flatsiedatsie.github.io/stable_diffusion_in_the_browser/
So this issue is two things:
If there isn't a pipeline, maybe it would be nice to implement one?
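For illustration, a pipeline of the kind suggested here would mostly hide the low-level loop behind one call. The sketch below is not the Transformers.js API and not the Intel demo code: `DiffusionParts`, `textToImage` and every method name are hypothetical. It assumes the usual Stable Diffusion structure (text encoder, UNet noise predictor, scheduler, VAE decoder) and leaves out classifier-free guidance, tokenization and image post-processing.

```typescript
// All names below are hypothetical; this is a sketch of the control flow only.
type Tensor = Float32Array;

interface DiffusionParts {
  // Text encoder: prompt -> embedding.
  encodePrompt(prompt: string): Promise<Tensor>;
  // UNet: predicts the noise present in the latents at a given timestep.
  predictNoise(latents: Tensor, timestep: number, promptEmbedding: Tensor): Promise<Tensor>;
  // Scheduler: uses the predicted noise to produce the next, less noisy latents.
  schedulerStep(latents: Tensor, noise: Tensor, timestep: number): Tensor;
  // VAE decoder: latents -> image pixels.
  decode(latents: Tensor): Promise<Tensor>;
}

async function textToImage(
  parts: DiffusionParts,
  prompt: string,
  initialLatents: Tensor, // random noise supplied by the caller
  timesteps: number[],    // descending schedule, e.g. produced by the scheduler
): Promise<Tensor> {
  const embedding = await parts.encodePrompt(prompt);
  let latents = initialLatents;
  // Denoising loop: one UNet call and one scheduler update per timestep.
  for (const t of timesteps) {
    const noise = await parts.predictNoise(latents, t, embedding);
    latents = parts.schedulerStep(latents, noise, t);
  }
  return parts.decode(latents);
}
```

With something like this in place, application code would only need to provide the model parts and a prompt, instead of wiring up each ONNX session by hand.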
Your contribution
I hacked the sample code to run on GitHub Pages, and I will be implementing it in my project.