-
**Describe the bug**
When attempting to shard a `gemma_2b_en` model across two (consumer-grade) GPUs, I get:
```
ValueError: One of device_put args was given the sharding of NamedSharding(mesh=…
```
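For context, here is a minimal sketch of how a two-GPU `NamedSharding` is set up in JAX; the mesh axis name `model` and the array shape are illustrative assumptions, not taken from the report above:

```python
import jax
import jax.numpy as jnp
from jax.sharding import Mesh, NamedSharding, PartitionSpec as P

devices = jax.devices()[:2]            # assumes at least two visible GPUs
mesh = Mesh(devices, axis_names=("model",))

# Shard dimension 0 across the "model" axis. The ValueError above is
# typically raised when a dimension's global size is not divisible by
# the number of devices on the mesh axis it is sharded over.
sharding = NamedSharding(mesh, P("model", None))

x = jnp.ones((4096, 2048))             # dim 0 divisible by 2, so this is fine
x_sharded = jax.device_put(x, sharding)
print(x_sharded.sharding)
```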
-
Hi! Thank you for your great work. Looking at the code, I see that deformable attention is used only in the decoder's cross-attention module.
Why is deformable attention not used anywhere e…
-
When I'm trying to use Videocrafter 2, I get this error:
```
F:\Pinokio\api\videocrafter2.git\app\env\lib\site-packages\torch\nn\functional.py:5560: UserWarning: 1Torch was not compiled with flash att…
```
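That warning is emitted by `scaled_dot_product_attention` when the flash-attention kernel is unavailable in the installed build; the call silently falls back to another backend. A minimal sketch, assuming a recent PyTorch with a CUDA device, of how to inspect and constrain the SDPA backend:

```python
import torch
import torch.nn.functional as F
from torch.nn.attention import sdpa_kernel, SDPBackend

q = k = v = torch.randn(1, 8, 128, 64, device="cuda", dtype=torch.float16)

# Is the flash kernel even allowed in this build?
print(torch.backends.cuda.flash_sdp_enabled())

# Restrict SDPA to a backend that the build does support;
# EFFICIENT_ATTENTION is a common fallback when flash is missing.
with sdpa_kernel([SDPBackend.EFFICIENT_ATTENTION]):
    out = F.scaled_dot_product_attention(q, k, v)
```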
-
Hello, I saw a paragraph in the paper that simply states that the attention operation is omitted in the encoder module, so the encoder consists only of FFN layers.
Here, we omit the attention mech…
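A minimal sketch of what such an attention-free, FFN-only encoder layer could look like; the dimensions, activation, and layer count here are assumptions, not the paper's configuration:

```python
import torch
import torch.nn as nn

class FFNEncoderLayer(nn.Module):
    def __init__(self, dim: int = 256, hidden: int = 1024):
        super().__init__()
        self.norm = nn.LayerNorm(dim)
        self.ffn = nn.Sequential(
            nn.Linear(dim, hidden),
            nn.ReLU(inplace=True),
            nn.Linear(hidden, dim),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Residual connection around the feed-forward block only;
        # there is no self-attention sub-layer.
        return x + self.ffn(self.norm(x))

encoder = nn.Sequential(*[FFNEncoderLayer() for _ in range(6)])
tokens = torch.randn(2, 100, 256)      # (batch, tokens, dim)
print(encoder(tokens).shape)
```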
-
I was doing some editing of a jbeam, then started a new project with the f4 menu, and this remained. It's not a big issue; doing basically anything seems to clear it, but I thought I'd bring it to yo…
-
Hello, when I was building attention heatmaps, I found that the attention scores across different patches did not vary much. Have you encountered this problem before?
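One way to quantify that observation is per-query attention entropy: near-uniform scores push it toward the maximum of log(N). A hedged diagnostic sketch, with assumed tensor shapes:

```python
import torch

# (heads, queries, patches); random weights stand in for a real model's.
attn = torch.softmax(torch.randn(8, 197, 197), dim=-1)

entropy = -(attn * attn.clamp_min(1e-12).log()).sum(-1)   # per head, per query
max_entropy = torch.log(torch.tensor(attn.shape[-1], dtype=torch.float))
print(entropy.mean() / max_entropy)   # ratio near 1.0 => near-uniform attention
```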
-
### 🐛 Describe the bug
According to https://github.com/pytorch/pytorch/actions/workflows/slow.yml?query=is%3Asuccess
the last successful run of the workflow on the main branch was on Aug 20th for https://gi…
-
Hi, thank you for your wonderful work! I noticed that during training, the loss function is composed of the attention loss and the cam_up_similarity term. Since the cam_up_similarity was not discuss…
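As a hypothetical illustration only (the paper does not specify how the two terms are combined), a weighted sum of an attention loss and a cosine-similarity term might look like this; the function name, weights, and loss forms are all assumptions:

```python
import torch
import torch.nn.functional as F

def total_loss(attn_pred, attn_target, cam_up, target_feat,
               w_attn=1.0, w_sim=0.1):
    # Assumed attention loss: mean squared error between attention maps.
    attn_loss = F.mse_loss(attn_pred, attn_target)
    # Assumed similarity term: higher cosine similarity should lower
    # the loss, hence the negation.
    sim = F.cosine_similarity(cam_up.flatten(1), target_feat.flatten(1)).mean()
    return w_attn * attn_loss - w_sim * sim
```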
-
When scanning a QR code to get a badge (while signed into FAS), I got this success notification that the badge was awarded:
![1000002207](https://github.com/user-attachments/assets/f1d09e2b-4f33-4bba…
-
### 🚀 The feature, motivation and pitch
FlexAttention was proposed as a performant attention implementation leveraging `torch.compile` with easy APIs for adding support for complex attention varian…
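A minimal sketch of that API, assuming PyTorch 2.5+: a `score_mod` callback expresses attention variants in a few lines, and `torch.compile` can fuse them into a single kernel. The causal mask below is the canonical example; shapes are illustrative.

```python
import torch
from torch.nn.attention.flex_attention import flex_attention

def causal(score, b, h, q_idx, kv_idx):
    # Send scores for future positions to -inf before the softmax.
    return torch.where(q_idx >= kv_idx, score, -float("inf"))

q = k = v = torch.randn(1, 4, 256, 64)  # (batch, heads, seq, head_dim)
out = flex_attention(q, k, v, score_mod=causal)
```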