-
### 🚀 The feature, motivation and pitch
FlexAttention was proposed as a performant attention implementation leveraging `torch.compile` with easy APIs for adding support for complex attention varian…
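For context, a minimal sketch of how an attention variant is expressed through FlexAttention's `score_mod` callback and compiled with `torch.compile` (the shapes and the causal example below are illustrative, not taken from this request):

```python
import torch
from torch.nn.attention.flex_attention import flex_attention

# Illustrative shapes only: batch, heads, sequence length, head dim.
B, H, S, D = 2, 8, 1024, 64
q = torch.randn(B, H, S, D, device="cuda", dtype=torch.float16)
k = torch.randn(B, H, S, D, device="cuda", dtype=torch.float16)
v = torch.randn(B, H, S, D, device="cuda", dtype=torch.float16)

# A score_mod callback expresses the attention variant; here, causal masking.
def causal(score, b, h, q_idx, kv_idx):
    return torch.where(q_idx >= kv_idx, score, -float("inf"))

# Compiling flex_attention is what makes it competitive with fused kernels.
compiled_flex = torch.compile(flex_attention)
out = compiled_flex(q, k, v, score_mod=causal)
```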
-
Hi! Thank you for your great work. I was looking at the code and I see that deformable attention is only used in the cross-attention Decoder module.
Why is deformable attention not used anywhere e…
-
With 128k-long sequences, activation memory grows significantly.
CP8 + TP8 seems necessary (both reduce activation memory almost linearly), but there is still as much as …
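For a rough sense of scale, here is a back-of-envelope sketch of why 128k activations are so heavy and why CP and TP help almost linearly. The model shape and the Megatron-style ~34·s·b·h bytes-per-layer estimate (16-bit precision, attention matrix never materialized) are assumptions for illustration, not numbers from this report:

```python
# Hypothetical 70B-class shape: sequence length, micro-batch, hidden size, layers.
s, b, h, L = 128 * 1024, 1, 8192, 80

# Megatron-style estimate: ~34*s*b*h bytes of activations per layer in 16-bit
# precision when the attention matrix is not materialized (flash attention).
per_layer_gib = 34 * s * b * h / 2**30
total_gib = per_layer_gib * L
print(f"per layer: {per_layer_gib:.1f} GiB, all layers: {total_gib:.0f} GiB")

# CP and TP both shard activations roughly linearly, so CP8 + TP8 cuts the
# per-GPU share by ~64x (communication buffers and KV gathers not counted).
print(f"per GPU with CP8 + TP8: {total_gib / 64:.0f} GiB")
```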
-
Hi,
Today, while running LoRA training for the `Flux.1` model (sd-scripts, SD3 branch), the "`train_blocks must be single for split mode`" error suddenly occurred. This error had not appea…
-
### 🚀 The feature, motivation and pitch
Enable support for the Flash Attention and Memory Efficient SDPA kernels on AMD GPUs.
At present, using these produces the warning below with the latest nightlies (torch=…
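As a possible reproduction sketch, backend availability can be probed with the `sdpa_kernel` context manager; when a requested backend is not compiled in (as the warning on ROCm suggests), SDPA errors out instead of silently falling back. The shapes below are illustrative:

```python
import torch
import torch.nn.functional as F
from torch.nn.attention import SDPBackend, sdpa_kernel

q = torch.randn(2, 8, 1024, 64, device="cuda", dtype=torch.float16)
k, v = torch.randn_like(q), torch.randn_like(q)

# Restrict SDPA to the flash / memory-efficient backends; if neither is
# available in this build, the call errors out rather than using the math path.
with sdpa_kernel([SDPBackend.FLASH_ATTENTION, SDPBackend.EFFICIENT_ATTENTION]):
    out = F.scaled_dot_product_attention(q, k, v)
```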
-
Thank you very much for your excellent work. I am encountering a problem while training my model in a virtual environment: when I execute the command, an error occurs. Can anyone solve it…
-
~/# accelerate launch train_stage_2.py --config configs/train/stage2.yaml
The following values were not passed to `accelerate launch` and had defaults used instead:
`--num_processes` was set…
-
Hi,
I saw that the demo project has a search filter in the OPAC. I thought it was really cool, but I would like to know whether this is new and exclusive to this version, or whether I can do it with previous ver…
-
D:\DiffSynth-Studio-main\diffsynth\models\attention.py:54: UserWarning: 1Torch was not compiled with flash attention. (Triggered internally at ..\aten\src\ATen\native\transformers\cuda\sdp_utils.cpp:4…
-
### Description
This package looks like a great step up from `piexifjs` in many ways: a better API and support for more image formats (awesome!). But unlike `piexifjs`, it does not support writing the Exif …