Closed Naomiusearch closed 1 month ago
Fixed naive attention and made exl2 work on amd :3
Thanks for the PR. There have been some quantization refactors following #454, but this PR shouldn't need much work. I'll resolve the conflicts myself.