-
This is from the example given in the repo:
```python
import torch
import flashinfer
device_id = 1
kv_len = 2048
num_kv_heads = 32
head_dim = 128
k = torch.randn(kv_len, num_kv_heads, …
```
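For reference, a minimal runnable version of this snippet, assuming it continues along the lines of the FlashInfer README's single-request decode example (the `.half().to(device_id)` placement and the `num_qo_heads` value are assumptions, not taken from the truncated text):

```python
import torch
import flashinfer

device_id = 1
kv_len = 2048
num_kv_heads = 32
head_dim = 128

# KV cache for one request, fp16 on the chosen GPU
k = torch.randn(kv_len, num_kv_heads, head_dim).half().to(device_id)
v = torch.randn(kv_len, num_kv_heads, head_dim).half().to(device_id)

# One decode-step query: a single token across all query heads
num_qo_heads = 32
q = torch.randn(num_qo_heads, head_dim).half().to(device_id)

# Decode attention against the whole KV cache
o = flashinfer.single_decode_with_kv_cache(q, k, v)  # [num_qo_heads, head_dim]
```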
-
Hi,
Thank you for providing this collection! I'm trying to get local window attention to run. I managed to get a simple example running locally, as shown in #15, but I am facing problems now when …
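For reference, a plain-PyTorch sketch of what a local (sliding-window) attention mask computes; this only illustrates the masking pattern, not the FlashInfer API, and the names (`local_window_mask`, `window`) are made up for the example:

```python
import torch

def local_window_mask(seq_len: int, window: int) -> torch.Tensor:
    # True where attention is allowed: each query position i may attend
    # to keys j with |i - j| <= window (a banded, bidirectional window).
    idx = torch.arange(seq_len)
    return (idx[None, :] - idx[:, None]).abs() <= window

# Apply the mask to raw attention scores before the softmax
seq_len, window = 8, 2
scores = torch.randn(seq_len, seq_len)
scores = scores.masked_fill(~local_window_mask(seq_len, window), float("-inf"))
probs = scores.softmax(dim=-1)
```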
-
```
Traceback (most recent call last):
  File "/AI/videoDetection/algorithm/PTSEFormer-master/tools/train.py", line 261, in <module>
    main()
  File "/AI/videoDetection/algorithm/PTSEFormer-master/tools/trai…
```
-
Hi,
I encounter the following error message when trying to enable flash attention with the command below. Can I know if flash attention is supported?
`command: ./main -m $model -n 128 --prompt …`
-
Greetings from the Certbot Team,
With Python 3.8 reaching its end-of-life (see https://devguide.python.org/versions/#supported-versions), we have to update our Certbot snaps to a newer Python versi…
-
Hi @fradif96
We don't have new modules for cross/self-attention. They are the same attention layers; we just reshape the latent features from `((b t) l d)` to `(b (t l) d)` [here](https://gith…
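For concreteness, a minimal sketch of that reshape in einops notation, where `b` is batch, `t` frames, `l` tokens per frame, and `d` channels (the dimension sizes below are made up for the example):

```python
import torch
from einops import rearrange

b, t, l, d = 2, 4, 16, 64
x = torch.randn(b * t, l, d)  # per-frame tokens, laid out as ((b t) l d)

# Fold the frame axis into the token axis so the same attention layer
# now attends across all frames at once: ((b t) l d) -> (b (t l) d)
x = rearrange(x, "(b t) l d -> b (t l) d", t=t)
assert x.shape == (b, t * l, d)
```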
-
Hello,
Please let me know how to run Moondream2 using Flash Attention 1, since I am trying to run it on Kaggle or Colab with T4 GPUs, so Flash Attention 2 won't work.
You have just mentioned to use f…
-
**Is your feature request related to a problem? Please describe.**
When I pass the `out_dim` argument to `__init__` in the [Attention block](https://github.com/huggingface/diffusers/blob/b69fd990ad8026f…
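For context, a toy stand-in showing the kind of behavior an `out_dim` argument typically configures, i.e. the width of the attention block's output projection; this is an assumption for illustration, not the actual diffusers implementation:

```python
import torch
import torch.nn as nn

class TinyAttention(nn.Module):
    # Toy stand-in, not the diffusers Attention block: `out_dim` picks the
    # width of the final projection, defaulting to `query_dim` when omitted.
    def __init__(self, query_dim: int, out_dim: int | None = None):
        super().__init__()
        self.to_qkv = nn.Linear(query_dim, 3 * query_dim)
        self.to_out = nn.Linear(query_dim, out_dim or query_dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        q, k, v = self.to_qkv(x).chunk(3, dim=-1)
        attn = torch.softmax(q @ k.transpose(-2, -1) / q.shape[-1] ** 0.5, dim=-1)
        return self.to_out(attn @ v)

x = torch.randn(1, 16, 64)
print(TinyAttention(64, out_dim=32)(x).shape)  # torch.Size([1, 16, 32])
```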
-
aria-keyshortcuts property
https://www.w3.org/TR/wai-aria-1.3/#aria-keyshortcuts
The internationalization (I18N) working group reviewed 1.3 as part of horizontal review. In the course of doing thi…
-
Currently the autocomplete shows all config keywords all the time. Limit the available autocomplete keywords to those valid in the current "mode" (as described in the Cisco documentation); a sketch of the idea follows below.
Pay att…
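For illustration, a minimal sketch of mode-scoped completion; the mode names and keyword sets below are hypothetical, not taken from the actual config grammar:

```python
# Hypothetical mode -> keyword map; real keyword sets would come from the
# config grammar, not from this illustration.
MODE_KEYWORDS = {
    "global": {"hostname", "interface", "router", "line"},
    "interface": {"ip", "shutdown", "description", "exit"},
    "router": {"network", "neighbor", "redistribute", "exit"},
}

def complete(mode: str, prefix: str) -> list[str]:
    """Offer only the keywords valid in the current mode."""
    return sorted(k for k in MODE_KEYWORDS.get(mode, ()) if k.startswith(prefix))

print(complete("interface", "s"))  # ['shutdown']
```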