-
Hi, I would like to ask about the Deformable Attention mechanism in the paper.
I went through the paper "Deformable DETR: Deformable Transformers for End-to-End Object Detection", and the Deformable Atten…
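For anyone else puzzling over the mechanism, here is a minimal single-head, single-scale sketch of the deformable-attention sampling step: each query attends to a small set of points at learned offsets around a reference point, with bilinearly interpolated values and a softmax over the sampling points. The function names and shapes are illustrative assumptions, not the paper's reference implementation.

```python
import numpy as np

def bilinear_sample(value, y, x):
    """Bilinearly sample value[H, W, C] at a fractional location (y, x)."""
    H, W, _ = value.shape
    y0, x0 = int(np.floor(y)), int(np.floor(x))
    wy1, wx1 = y - y0, x - x0
    out = 0.0
    for yy, wy in ((y0, 1.0 - wy1), (y0 + 1, wy1)):
        for xx, wx in ((x0, 1.0 - wx1), (x0 + 1, wx1)):
            if 0 <= yy < H and 0 <= xx < W:  # out-of-bounds corners contribute zero
                out = out + wy * wx * value[yy, xx]
    return out

def deformable_attn(value, ref_point, offsets, attn_logits):
    """value: [H, W, C]; ref_point: (y, x); offsets: [K, 2]; attn_logits: [K].

    Returns a [C] vector: sum_k A_k * value(p_ref + delta_p_k),
    where A_k is a softmax over the K sampling points."""
    a = np.exp(attn_logits - attn_logits.max())
    a = a / a.sum()                       # softmax over sampling points
    samples = np.stack([
        bilinear_sample(value, ref_point[0] + dy, ref_point[1] + dx)
        for dy, dx in offsets
    ])                                    # [K, C]
    return (a[:, None] * samples).sum(axis=0)
```

In the full model, the offsets and attention logits are both predicted from the query by linear layers, and the sum runs over multiple heads and feature-map scales.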
-
I was able to build flash-attention for ROCm for both my MI100 and MI50 cards, but I only got flash attention working on the MI100 (very impressive performance, I might add).
Trying to run flash attention …
-
I would like to fine-tune CodeLlama-13b in a memory-efficient way.
I was able to do it with CodeLlama-7b, but I am failing with 13b.
I can't load the model `unsloth/codellama-13b-bnb-4bit`:
```pyth…
-
Please note! [MiUnlocks](https://github.com/MoUnlocks) is a big liar! Do not do any paid trading with them! They will often change their name to cheat again, using names like MiUnlocks/MMUnlock/MoUnlock…
-
See https://leanprover.zulipchat.com/#narrow/stream/458659-Equational/topic/Equation.205105.20-.3E.20Equation.202 for discussion. By default one should restrict attention to the "Core" graph of the i…
-
### Describe the bug
I accidentally introduced a bug in this [PR](https://github.com/huggingface/diffusers/pull/5181) by making a condition on [this line](https://github.com/huggingface/diffusers/blo…
-
Hello! @JunzheJosephZhu This is excellent work on multimodal robot learning.
I'm confused about how the attention scores are normalized across all modalities. I would appreciate it if you could prov…
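While waiting for the authors, here is a sketch of one common normalization scheme: concatenate the key/value tokens of every modality and run a single softmax over the combined axis, so scores compete across modalities rather than within each one. This is an assumption for illustration, not necessarily this paper's exact method.

```python
import numpy as np

def cross_modal_attention(query, modality_keys, modality_values):
    """query: [D]; modality_keys / modality_values: lists of [N_i, D] arrays,
    one pair per modality. Returns (output [D], weights [sum N_i])."""
    keys = np.concatenate(modality_keys, axis=0)      # [sum N_i, D]
    values = np.concatenate(modality_values, axis=0)  # [sum N_i, D]
    scores = keys @ query / np.sqrt(query.shape[0])   # scaled dot-product scores
    scores = scores - scores.max()                    # numerical stability
    weights = np.exp(scores) / np.exp(scores).sum()   # ONE softmax across all modalities
    return weights @ values, weights
```

The alternative would be a per-modality softmax followed by a learned fusion; the single-softmax version above is what "normalizing across all modalities" usually means.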
-
Hi. Thanks for the great work.
Why is the pos_enc in cross-attention applied only to the keys and not to the queries? (see the config files)
```
pos_enc_at_cross_attn_keys: true
pos_enc_at_c…
-
Hi,
I wonder if we can manually verify attention mask patterns during testing. While I can visualize masks by printing them as strings, I'm looking to add proper test assertions.
- How to assert…
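One way to go from "visualize by printing" to proper test assertions is to assert structural properties of the mask directly, or compare it to an explicit expected pattern. The sketch below uses plain numpy boolean masks (`True` = attend) and a hypothetical `causal_mask` builder; substitute whatever masks your code actually produces.

```python
import numpy as np

def causal_mask(n):
    """Hypothetical example mask: lower-triangular causal pattern."""
    return np.tril(np.ones((n, n), dtype=bool))

mask = causal_mask(4)

# Assert structural properties instead of comparing printed strings:
assert mask.diagonal().all()                     # every token attends to itself
assert not mask[np.triu_indices(4, k=1)].any()   # no attention to future tokens

# Or compare against an explicit expected pattern:
expected = np.array([
    [1, 0, 0, 0],
    [1, 1, 0, 0],
    [1, 1, 1, 0],
    [1, 1, 1, 1],
], dtype=bool)
assert np.array_equal(mask, expected)
```

Property-style assertions (diagonal, no look-ahead, per-row counts) tend to survive refactors better than hard-coded expected arrays, which are clearer for small fixed cases.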
-
When I'm trying to use VideoCrafter 2, I get this error:
F:\Pinokio\api\videocrafter2.git\app\env\lib\site-packages\torch\nn\functional.py:5560: UserWarning: 1Torch was not compiled with flash att…