-
Please pose thoughtful questions for our speaker by Wednesday midnight, and upvote five of them by Thursday @ 10am, an hour before our session together. The associated papers are:
-
An `attn_mask` dtype error occurred, as shown below.
```
$ python3 scripts/txt2img.py --prompt "a professional photograph of an astronaut riding a horse" --ckpt "./768-v-ema.ckpt" --config configs/stable-dif…
```
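The traceback is cut off above. For context, this class of error usually originates in `torch.nn.functional.scaled_dot_product_attention`, which requires `attn_mask` to be boolean or to match the query dtype; whether that is the actual code path inside txt2img.py here is an assumption. A minimal standalone sketch of the mismatch and the cast that avoids it:

```python
import torch
import torch.nn.functional as F

# Half-precision attention inputs, shape (batch, heads, seq, head_dim); needs CUDA.
q = k = v = torch.randn(1, 8, 16, 64, dtype=torch.half, device="cuda")

# SDPA wants attn_mask to be bool or to match q's dtype; a plain float32
# mask next to fp16 queries raises a dtype error like the one reported.
mask = torch.zeros(16, 16, device="cuda")  # float32: mismatched with q
out = F.scaled_dot_product_attention(q, k, v, attn_mask=mask.to(q.dtype))
```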
-
I'm getting a `RuntimeError: CUDA error: an illegal memory access was encountered`
using FlashAttention with a GPT-NeoX-esque model. I can reproduce it with:
```
from transformers import AutoConfig
import torch
from…
```
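The repro script is cut off above. As a hedged aside: flash-attn's fused kernels are strict about input layout, and violating those constraints is a common source of illegal-memory-access errors. A minimal sketch of a well-formed call, assuming the flash-attn 2.x `flash_attn_func` interface:

```python
import torch
from flash_attn import flash_attn_func

# flash-attn expects (batch, seqlen, nheads, headdim) tensors on a CUDA
# device in fp16/bf16; wrongly shaped, typed, or non-contiguous inputs
# can surface as illegal memory accesses inside the kernel.
q = torch.randn(2, 128, 12, 64, dtype=torch.float16, device="cuda")
k = torch.randn(2, 128, 12, 64, dtype=torch.float16, device="cuda")
v = torch.randn(2, 128, 12, 64, dtype=torch.float16, device="cuda")

out = flash_attn_func(q, k, v, causal=True)  # -> (2, 128, 12, 64)
```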
-
When I try to fine-tune wizard-2 7b, I get the error: `TypeError: MistralForCausalLM.forward() got an unexpected keyword argument 'causal_mask'`.
Full stack trace follows:
```
model loading
==((====))=…
```
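The trace is cut off here. For what it's worth, a `TypeError` like this usually means the training wrapper and the installed `transformers` disagree on the `forward()` signature, and the real fix is aligning the two versions. A hedged workaround sketch (a hypothetical helper, not from either library) that drops kwargs the target `forward()` does not declare:

```python
import inspect

def forward_compat(model, **kwargs):
    # Hypothetical shim: keep only the kwargs that model.forward() declares,
    # silently dropping stragglers like 'causal_mask' from an older caller.
    accepted = inspect.signature(model.forward).parameters
    return model(**{k: v for k, v in kwargs.items() if k in accepted})
```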
-
## Slides
Current sizes of the slide sets (cap at 25, except Week 1). Revise the readings, practice sessions, and exercises, and include screenshots of videos where relevant.
- [x] 1. 37 -- OK, cap at ~ 40…
-
### What is the issue?
After upgrading Ollama from 0.20 to 0.27, it runs Gemma 2 9B very slowly. I don't think the machine is out of VRAM, since Gemma 2 only takes 6.8 GB of VRAM (q_4_0) while my lapto…
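The report is cut off here. As a hedged diagnostic (Ollama CLI commands, assuming a recent version), the CPU/GPU split and the actual token rate can be checked with:

```
$ ollama ps                       # PROCESSOR column: "100% GPU" vs a CPU/GPU split
$ ollama run gemma2:9b --verbose  # prints timing stats, including eval rate (tokens/s)
```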
-
### System Info
```Shell
- `Accelerate` version: 0.33.0
- Platform: Linux-5.15.133+-x86_64-with-glibc2.35
- `accelerate` bash location: /opt/conda/bin/accelerate
- Python version: 3.10.14
- Numpy…
```
-
**Describe the bug**
FlashAttention in both implementations, the original [one](https://github.com/HazyResearch/flash-attention/blob/main/flash_attn/flash_attention.py#L11-L71) and the torch.nn.f…
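The report is truncated at this point. For context, discrepancies in fused attention kernels are typically demonstrated with a parity check against a plain fp32 reference; a sketch of that pattern (not the reporter's actual repro), using `torch.nn.functional.scaled_dot_product_attention` as the fused path:

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
q = k = v = torch.randn(2, 8, 128, 64, device="cuda", dtype=torch.half)

# Fused kernel (dispatches to a flash backend when eligible)
out_fused = F.scaled_dot_product_attention(q, k, v, is_causal=True)

# Plain fp32 reference attention for comparison
scores = (q.float() @ k.float().transpose(-2, -1)) / 64 ** 0.5
causal = torch.triu(torch.ones(128, 128, device="cuda", dtype=torch.bool), 1)
scores = scores.masked_fill(causal, float("-inf"))
out_ref = (scores.softmax(-1) @ v.float()).half()

print((out_fused - out_ref).abs().max())  # should sit within fp16 tolerance
```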
-
### System info
- `transformers` version: 4.43.0.dev0
- Platform: Linux-5.10.0-30-cloud-amd64-x86_64-with-glibc2.29
- Python version: 3.8.10
- Huggingface_hub version: 0.23.4
- Safetensors vers…
-
![class](https://cloud.githubusercontent.com/assets/1461453/21132332/ae2df6a2-c113-11e6-9cac-4a7a20151df9.png)
This structure will help us keep all the different distributions separate and th…
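To make the diagram concrete, here is a minimal sketch of the kind of hierarchy it suggests: an abstract base class carrying the shared interface, with each distribution kept separate in its own subclass (illustrative names, not the final design):

```python
import math
import random
from abc import ABC, abstractmethod

class Distribution(ABC):
    """Shared interface; each concrete distribution stays self-contained."""

    @abstractmethod
    def sample(self, n: int):
        """Draw n samples from the distribution."""

    @abstractmethod
    def log_prob(self, value: float) -> float:
        """Log-density (or log-mass) at `value`."""

class Normal(Distribution):
    def __init__(self, loc: float, scale: float):
        self.loc, self.scale = loc, scale

    def sample(self, n: int):
        return [random.gauss(self.loc, self.scale) for _ in range(n)]

    def log_prob(self, value: float) -> float:
        var = self.scale ** 2
        return -0.5 * ((value - self.loc) ** 2 / var + math.log(2 * math.pi * var))
```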