-
Hello, I recently implemented a cross-attention application with multi-modal fusion, but because the image resolution is too large, a CUDA OOM occurs when computing q and k, so I found your paper…
-
I am wondering what's the best way to use efficient implementations of attention. PyTorch provides the experimental [`torch.nn.functional.scaled_dot_product_attention`](https://pytorch.org/docs/stable…
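For example, a minimal call looks like the sketch below (shapes and dtype are illustrative assumptions; the function dispatches to a flash or memory-efficient kernel when one is available for the inputs):
```python
# Sketch: basic use of torch.nn.functional.scaled_dot_product_attention.
# Inputs are (batch, heads, seq_len, head_dim); the backend is chosen automatically.
import torch
import torch.nn.functional as F

q = torch.randn(2, 8, 1024, 64, device="cuda", dtype=torch.float16)
k = torch.randn_like(q)
v = torch.randn_like(q)

out = F.scaled_dot_product_attention(q, k, v)  # same shape as q
```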
-
### Describe the bug
> RuntimeError: The size of tensor a (154) must match the size of tensor b (2304) at non-singleton dimension 1
### Reproduction
```python
# StableDiffusion3Pipeline
pipe.enab…
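# --- Hypothetical completion sketch, not the original repro: the truncated call
# --- above is assumed to be pipe.enable_xformers_memory_efficient_attention(),
# --- a public diffusers pipeline method; model id and prompt are placeholders.
import torch
from diffusers import StableDiffusion3Pipeline

pipe = StableDiffusion3Pipeline.from_pretrained(
    "stabilityai/stable-diffusion-3-medium-diffusers", torch_dtype=torch.float16
).to("cuda")
pipe.enable_xformers_memory_efficient_attention()
image = pipe("a photo of a cat", num_inference_steps=28).images[0]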
-
-
# 🐛 Bug
XFormers cannot perform memory_efficient_attention.
## Command
## To Reproduce
The code is from attention_processors line 266 (the exact line may vary slightly) of the diffusers libra…
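For reference, the xformers path in diffusers boils down to a call like the sketch below (public `xformers.ops` API; the tensor shapes are illustrative assumptions, not the actual values at that line):
```python
# Sketch: q/k/v flattened to (batch * heads, seq_len, head_dim) before the
# memory-efficient attention call, as diffusers' xformers processor does.
import torch
import xformers.ops

batch, heads, seq_len, head_dim = 2, 8, 4096, 64
query = torch.randn(batch * heads, seq_len, head_dim, device="cuda", dtype=torch.float16)
key = torch.randn_like(query)
value = torch.randn_like(query)

hidden_states = xformers.ops.memory_efficient_attention(query, key, value, attn_bias=None)
```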
-
I'm currently working on an API using FastAPI to serve DINOv2 models from the official DINOv2 repository. The API works well locally, but when I run it in a Docker container, I encounter an error rela…
-
### Suggestion Description
Started using torchlearn to train models in PyTorch with my gfx1100 card, but I get a warning that torch was not compiled with memory-efficient flash attention.
I see ther…
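As a side note, one quick way to see which scaled-dot-product-attention backends a given PyTorch build enables is the `torch.backends.cuda` query functions (a sketch using the public API; on ROCm the flash backend may simply report as disabled):
```python
# Sketch: report which SDPA backends this PyTorch build has enabled.
import torch

print("flash SDP enabled:        ", torch.backends.cuda.flash_sdp_enabled())
print("mem-efficient SDP enabled:", torch.backends.cuda.mem_efficient_sdp_enabled())
print("math SDP enabled:         ", torch.backends.cuda.math_sdp_enabled())
```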
-
I am installing xformers with the following steps:
```bash
git clone https://github.com/ROCm/xformers
cd xformers
git checkout develop
git submodule update --init --recursive
python setup.py ins…
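# Sketch of a post-install sanity check (assuming the build above finished);
# both commands use the public xformers package interface.
python -c "import xformers; print(xformers.__version__)"
python -m xformers.info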
-
Namespace(confidence_threshold=0.2, config_file='configs/LVISCOCOCOCOSTUFF_O365_OID_VGR_SA1B_REFCOCO_GQA_PhraseCut_Flickr30k/ape_deta/ape_deta_vitl_eva02_clip_vlf_lsj1024_cp_16x4_1080k.py', input=['…
-
# ❓ Questions and Help
memory_efficient_attention forward produces inconsistent results.
Not sure what is going on. An incorrect build? Some specific version combination?
For some combinations:
xfo…
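One way to narrow this down is to compare the forward output against a plain softmax(QKᵀ/√d)V reference, as in the sketch below (public `xformers.ops` API; shapes, dtype, and layout are assumptions):
```python
# Sketch: check memory_efficient_attention forward against a naive reference.
import torch
import xformers.ops

q = torch.randn(1, 128, 8, 64, device="cuda", dtype=torch.float16)  # (B, M, H, K)
k = torch.randn_like(q)
v = torch.randn_like(q)

out_xf = xformers.ops.memory_efficient_attention(q, k, v)

# Naive reference in float32 for numerical stability.
qf, kf, vf = (t.float().permute(0, 2, 1, 3) for t in (q, k, v))  # (B, H, M, K)
attn = torch.softmax(qf @ kf.transpose(-1, -2) / qf.shape[-1] ** 0.5, dim=-1)
out_ref = (attn @ vf).permute(0, 2, 1, 3).half()

print("max abs diff:", (out_xf - out_ref).abs().max().item())
```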