-
```
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
…
```
-
## Description:
Hello! I’ve been following the development of this repository and appreciate the efforts to benchmark various efficient Transformer variants. I’d like to propose the implementation of…
-
I downloaded `nvidia/Llama3-ChatQA-1.5-8B` manually from HF into a local directory and ran `scripts/convert_hf_checkpoint.py`. Then I wanted to run `generate.py` using the local checkpoint dir:
` raise RuntimeE…
-
I am writing to ask for help with an issue I've encountered while running "04-VelocityBasics" on my local machine. Upon executing the associated diagram, I noticed that the scatter plot…
-
I tried to imitate your educational coding style hehe
Here's a pure PyTorch implementation of Flash Attention, hope you like it @karpathy
```
def flash_attention(Q, K, V, is_causal=True, BLOCK_S…
```
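The tiled, online-softmax algorithm the snippet above refers to can be sketched in pure PyTorch as follows. This is a minimal illustrative version, not the poster's code: `block_size` and all variable names are my assumptions, and it only avoids materializing the full score matrix along the key axis (the memory savings of real fused kernels come from on-chip tiling, which plain PyTorch cannot express):

```python
import math
import torch

def flash_attention(Q, K, V, is_causal=True, block_size=64):
    """Block-wise attention over key/value tiles with online-softmax rescaling.
    Shapes: (batch, heads, seq, dim). Numerically matches standard attention."""
    B, H, S, D = Q.shape
    scale = 1.0 / math.sqrt(D)
    O = torch.zeros_like(Q)
    # Running row-wise max (m) and softmax normalizer (l), updated per tile.
    m = torch.full((B, H, S, 1), float("-inf"), device=Q.device, dtype=Q.dtype)
    l = torch.zeros(B, H, S, 1, device=Q.device, dtype=Q.dtype)
    for j in range(0, S, block_size):
        Kj = K[:, :, j:j + block_size]
        Vj = V[:, :, j:j + block_size]
        Sij = Q @ Kj.transpose(-1, -2) * scale            # (B, H, S, blk)
        if is_causal:
            q_idx = torch.arange(S, device=Q.device).unsqueeze(1)
            k_idx = torch.arange(j, j + Kj.shape[2], device=Q.device).unsqueeze(0)
            Sij = Sij.masked_fill(k_idx > q_idx, float("-inf"))
        m_new = torch.maximum(m, Sij.amax(dim=-1, keepdim=True))
        alpha = torch.exp(m - m_new)          # rescale stats from earlier tiles
        P = torch.exp(Sij - m_new)
        l = alpha * l + P.sum(dim=-1, keepdim=True)
        O = alpha * O + P @ Vj
        m = m_new
    return O / l
```

For small shapes this agrees with `torch.nn.functional.scaled_dot_product_attention` up to float32 round-off, which is a convenient correctness check.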
-
Hello!
The `main` (`a441a3f`) branch of the AQLM repository does not support `flash attention 2`. The error occurs because QuantizedWeight does not have a weight attribute ([closed issue #31](https…
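The failure mode can be reproduced in miniature: Flash Attention 2 code paths inspect `module.weight` (e.g. for its dtype), while AQLM-style quantized layers store packed codes instead of a dense weight tensor. The class below is a hypothetical stand-in for illustration, not the real `QuantizedWeight`:

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for an AQLM-style quantized layer: it holds packed
# integer codes rather than an nn.Parameter named `weight`.
class QuantizedLinearStandIn(nn.Module):
    def __init__(self, out_features, in_features):
        super().__init__()
        self.codes = torch.zeros(out_features, in_features, dtype=torch.int8)

layer = QuantizedLinearStandIn(8, 8)

# Any code path that assumes `layer.weight` exists fails here:
# nn.Module has no such parameter registered, so the attribute lookup raises.
has_weight = hasattr(layer, "weight")  # False
```

Until this is supported upstream, a workaround is to load the model with `attn_implementation="eager"`, a standard `from_pretrained` argument in recent `transformers` releases, so the Flash Attention 2 path is never entered.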
-
```
!pip install -U airllm
!pip install -U bitsandbytes
!pip install git+https://github.com/huggingface/transformers.git
!pip install git+https://github.com/huggingface/ac…
```
-
Hi,
Thank you for providing this collection! I'm trying to get local window attention to run. I managed to get a simple example running locally, as shown in #15, but I am now running into problems when …
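As a reference point for anyone debugging this, a local (sliding-window) attention mask can be built directly in PyTorch. The function below is an illustrative sketch, not the repository's implementation; in the causal case `window` counts the current position plus `window - 1` predecessors:

```python
import torch

def local_window_mask(seq_len, window, causal=True):
    """Boolean mask where True means "may attend".
    Causal: position i attends to j with i - window < j <= i.
    Bidirectional: position i attends to j with |i - j| < window."""
    i = torch.arange(seq_len).unsqueeze(1)  # query positions, column vector
    j = torch.arange(seq_len).unsqueeze(0)  # key positions, row vector
    if causal:
        return (j <= i) & (j > i - window)
    return (i - j).abs() < window
```

The resulting `(seq_len, seq_len)` mask can be passed (after converting `False` to `-inf` scores, or directly as `attn_mask`) to `torch.nn.functional.scaled_dot_product_attention`.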
-
Hi, thank you for your awesome work. However, when I tried to run the M3DClip model using the code on Hugging Face, I got some errors related to the einops lib. I noticed you use the monai ViT layers…
-
### System Info
```
pip install git+https://github.com/huggingface/transformers.git
pip install tokenizers==0.20.0
pip install accelerate==0.34.2
pip install git+https://github.com/huggingface/tr…
```