blepping opened this issue 3 days ago
Hi @blepping
I already installed it like you said and got the node working, but so far the speed is the same. I tested both SDXL and Flux at 1024x1024, and I'm not sure whether the node is working or not since the speed is unchanged. I get this notice when the image finishes generating:
Can you give me some example JSON workflows to check whether this node is working?
Thank you
@wardensc2 thanks for giving it a try. i don't think there's really a way to do it wrong in the workflow.
attention improvements seem to make the most difference on large images. i didn't test with Flux (not sure if it uses the same kind of attention or has compatible sizes). for my SDXL tests at 4096x4096 on a 4060Ti, i got 8.94s/it with PyTorch attention vs 6.71s/it with SageAttention (about a 25% speed increase). the difference might not be noticeable at small resolutions like 1024x1024. (i think i was testing with smooth_k disabled - it didn't seem necessary with SDXL and should be a bit faster.)
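for reference, the ~25% figure works out like this (treating the reduction in s/it as the speed increase; the equivalent throughput gain is ~33%):

```python
# sanity check on the timings above (4096x4096 SDXL, 4060Ti)
pytorch_s_per_it = 8.94  # s/it with PyTorch attention
sage_s_per_it = 6.71     # s/it with SageAttention

# fraction of per-iteration time saved:
time_reduction = (pytorch_s_per_it - sage_s_per_it) / pytorch_s_per_it
# increase in iterations per second:
throughput_gain = pytorch_s_per_it / sage_s_per_it - 1

print(f"time reduction: {time_reduction:.1%}")    # ~24.9%
print(f"throughput gain: {throughput_gain:.1%}")  # ~33.2%
```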
i made a very simple ComfyUI node to replace the attention implementation with SageAttention: https://gist.github.com/blepping/fbb92a23bc9697976cc0555a0af3d9af
seems like a decent performance improvement on SDXL. SageAttention seems to fail when k/v aren't the same shape as q (on attn2, which i believe is cross-attention).
for SD15, none of the head sizes are currently supported, so it doesn't do anything there. not sure if you're interested in supporting SD15 (or SDXL cross-attention). if any more information would be helpful, please let me know.
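the two failure modes above can be handled with a guard that falls back to the default attention. this is just a sketch - the function name, the set of supported head dims, and the exact shape check are my assumptions, not the node's actual code:

```python
# hypothetical guard: use SageAttention only when it should be safe,
# otherwise fall through to the normal attention implementation.
SAGE_HEAD_DIMS = {64, 96, 128}  # assumed supported head dims - check SageAttention's docs

def can_use_sage(q_shape, k_shape, v_shape, head_dim):
    """q/k/v shapes are (batch, heads, seq_len, head_dim) tuples."""
    if head_dim not in SAGE_HEAD_DIMS:
        # e.g. SD15 head sizes: unsupported, so do nothing
        return False
    if q_shape != k_shape or k_shape != v_shape:
        # cross-attention (attn2): k/v seq_len differs from q, which fails
        return False
    return True

# self-attention with a supported head dim -> SageAttention is usable:
print(can_use_sage((1, 8, 4096, 64), (1, 8, 4096, 64), (1, 8, 4096, 64), 64))  # True
# cross-attention: k/v come from the 77-token text conditioning -> fall back:
print(can_use_sage((1, 8, 4096, 64), (1, 8, 77, 64), (1, 8, 77, 64), 64))      # False
```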
you can close this issue, just thought i would post this in case anyone wanted to try it with ComfyUI.
note: it's not a normal model patch, so to enable or disable it, make sure the node actually runs. simply bypassing or removing it won't work correctly.
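a toy illustration of why (not ComfyUI's real internals - the names here are made up): the node swaps a module-level attention function when it executes, so whether the patch is active depends on the last run, not on whether the node is still wired into the workflow:

```python
import types

# stand-in for the module whose attention function gets replaced
fake_comfy_attention = types.SimpleNamespace(
    optimized_attention=lambda q, k, v: "pytorch",
)

def sage_attention(q, k, v):
    return "sage"

def node_execute(enabled):
    # this only happens when the node itself runs
    fake_comfy_attention.optimized_attention = (
        sage_attention if enabled else (lambda q, k, v: "pytorch")
    )

node_execute(enabled=True)
print(fake_comfy_attention.optimized_attention(None, None, None))  # sage
# bypassing/removing the node just means node_execute never runs again,
# so the global stays patched:
print(fake_comfy_attention.optimized_attention(None, None, None))  # still sage
node_execute(enabled=False)  # the node must run while disabled to restore
print(fake_comfy_attention.optimized_attention(None, None, None))  # pytorch
```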