-
First, thank you for creating and releasing this invaluable resource.
# What I am trying to do
I would like to combine `kfac-jax` with [fused attention from `pallas`](https://github.com/google/…
-
Hi,
Thank you for your great work! It's really helpful in my research.
I'm interested in using NATTEN with linear attention, which can be simplified as re-associating the matmuls: `(q @ k.T) @ v -> q @ (k.T @ v)`. This approach …
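For concreteness, a minimal sketch of that re-association (shapes are illustrative, and real linear attention would also apply a feature map and a normalizer; this only demonstrates the associativity argument):
```python
import torch

# Toy shapes: sequence length n, head dimension d.
n, d = 1024, 64
q, k, v = (torch.randn(n, d) for _ in range(3))

# Quadratic form: materializes an (n, n) score matrix -> O(n^2 d) time, O(n^2) memory.
out_quadratic = (q @ k.T) @ v

# Linear form: only a (d, d) state matrix -> O(n d^2) time, O(d^2) memory.
out_linear = q @ (k.T @ v)

# Algebraically identical, up to floating-point error.
torch.testing.assert_close(out_quadratic, out_linear, rtol=1e-4, atol=1e-4)
```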
-
Hi,
I hit this issue on macOS:
```
/custom_nodes/comfyui-oms-diffusion/oms_diffusion_nodes.py", line 149, in get_area_and_mult
    conditioning["c_attn_stored_area"] = AttnStoredExtra(torch.te…
```
-
https://research.colfax-intl.com/flashattention-3-fast-and-accurate-attention-with-asynchrony-and-low-precision/
cc @yzh119
-
### ⚠️ Please check that this feature request hasn't been suggested before.
- [X] I searched previous [Ideas in Discussions](https://github.com/axolotl-ai-cloud/axolotl/discussions/categories/ideas) …
-
Repro:
```python
import flash_attn
import torch
from einops import rearrange

def snr(a: torch.Tensor, b: torch.Tensor):
    if torch.equal(a, b):
        return float("inf")
    if a.dtype == t…
```
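The helper is cut off above; for reference, a self-contained SNR metric along the same lines could look like this (a hypothetical completion, not the original repro's code; the decibel scaling and float32 upcast are my assumptions):
```python
import torch

def snr_db(reference: torch.Tensor, test: torch.Tensor) -> float:
    # Identical tensors have infinite SNR.
    if torch.equal(reference, test):
        return float("inf")
    # Upcast so the error power is not dominated by low-precision rounding.
    ref = reference.float()
    noise = ref - test.float()
    # Signal-to-noise ratio in decibels: signal power over error power.
    return (10.0 * torch.log10(ref.pow(2).mean() / noise.pow(2).mean())).item()
```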
-
When we use the fp8 data type, we found that the FFN GEMMs and the attention projections support real fp8 compute (this is supported on H20 and L20), but `Q @ transpose(K)` and `softmax @ V` inside attention don't support fp8 compute, …
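To make the gap concrete: PyTorch can store fp8 tensors, but plain `@` has no fp8 matmul kernel for these operands, so the score and value matmuls need dedicated fp8 kernel support. A minimal sketch of the situation (the dequantize-to-bf16 fallback is only illustrative, and per-tensor scaling is omitted):
```python
import torch

# fp8 storage works (PyTorch >= 2.1), but matmul rejects fp8 operands.
q = torch.randn(8, 128, 64)
k = torch.randn(8, 128, 64)
q_fp8 = q.to(torch.float8_e4m3fn)
k_fp8 = k.to(torch.float8_e4m3fn)

# q_fp8 @ k_fp8.transpose(-1, -2)  # raises: matmul is not implemented for fp8

# Without kernel support, the fallback is to dequantize to a wider dtype first,
# which gives up the fp8 compute throughput the question is asking about.
scores = q_fp8.to(torch.bfloat16) @ k_fp8.to(torch.bfloat16).transpose(-1, -2)
```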
-
Hello, I recently implemented a cross-attention application with multi-modal fusion, but because the image resolution is very large, a CUDA OOM occurs when computing the `q @ k.T` scores, so I found your paper…
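For context on the OOM itself: materializing the full `(Nq, Nk)` score matrix is what exhausts memory, and chunking the queries bounds the peak. A minimal sketch of that workaround (the function name and chunk size are illustrative assumptions, not the paper's method):
```python
import torch

def chunked_cross_attention(q, k, v, chunk_size=1024):
    # Process queries in blocks so only a (chunk_size, Nk) score matrix
    # is live at any time, instead of the full (Nq, Nk) matrix.
    scale = q.shape[-1] ** -0.5
    outs = []
    for i in range(0, q.shape[-2], chunk_size):
        scores = (q[..., i:i + chunk_size, :] @ k.transpose(-1, -2)) * scale
        outs.append(torch.softmax(scores, dim=-1) @ v)
    return torch.cat(outs, dim=-2)
```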
-
If a document is open in a plank and a stack, attention jumps from the stack back to the standalone plank.
https://github.com/user-attachments/assets/5f81904d-c0ea-46a3-89db-dd715143fd53