-
### Describe the bug
I tried to train the flux-dev model with LoRA on an A100 40GB, but it raises a CUDA out-of-memory exception.
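A common mitigation for this class of OOM (my suggestion, not part of the original report) is trading compute for activation memory with gradient checkpointing. A minimal sketch in plain PyTorch, with a toy block standing in for the (much larger) flux-dev transformer blocks:

```python
import torch
import torch.nn as nn
from torch.utils.checkpoint import checkpoint

# Toy stand-in for a transformer block; the real flux-dev blocks are far larger.
block = nn.Sequential(nn.Linear(64, 64), nn.ReLU(), nn.Linear(64, 64))

x = torch.randn(8, 64, requires_grad=True)
# Activations inside `block` are discarded during forward and recomputed
# during backward, cutting peak memory at the cost of extra compute.
y = checkpoint(block, x, use_reentrant=False)
y.sum().backward()
print(x.grad.shape)  # torch.Size([8, 64])
```

In the diffusers LoRA training scripts the equivalent switch is the `--gradient_checkpointing` flag, usually combined with options such as `--use_8bit_adam` and mixed precision to fit within 40GB.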
### Reproduction
```
# Accelerate command
export MODEL_NAME="bl…
-
### Your current environment
```text
Kaggle, 2× T4 GPUs
```
### 🐛 Describe the bug
`WARNING: Casting torch.bfloat16 to torch.float16.
WARNING: Gemma 2 uses sliding window attention for every odd l…
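For context on the cast warning (my addition, not from the report): T4 GPUs (compute capability 7.5) have no native bfloat16 support, so the runtime falls back to float16, which carries more mantissa bits but a far smaller representable range:

```python
import torch

# bfloat16 shares float32's 8-bit exponent; float16 has only 5 exponent
# bits, so casting bf16 -> fp16 risks overflow for large activations.
print(torch.finfo(torch.bfloat16).max)  # ~3.39e38
print(torch.finfo(torch.float16).max)   # 65504.0
```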
-
Good evening,
I have tried looking for a solution in previous discussions, issues, and threads, with no luck. It could also be that I'm ignorant and could not recognize the issue and/or the solution i…
-
On an M2 Mac, I am getting the error shown below. I do have the fallback variable set properly:
```
% echo $PYTORCH_ENABLE_MPS_FALLBACK
1
```
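One thing worth double-checking (a guess on my part, not from the thread): `PYTORCH_ENABLE_MPS_FALLBACK` is only honored if it is set before `torch` is imported, so an in-process sanity check can rule that out:

```python
import os

# The fallback flag must be in the environment before torch is imported.
print(os.environ.get("PYTORCH_ENABLE_MPS_FALLBACK"))

import torch

# On an M2 Mac with a recent PyTorch build this should report True.
print(torch.backends.mps.is_available())
```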
I am using the following version:
```
Casano…
-
### 🐛 Describe the bug
The case comes from xpu [triton-benchmark](https://github.com/intel/intel-xpu-backend-for-triton/blob/llvm-target/benchmarks/triton_kernels_benchmark/flash_attention_fwd_benc…
-
### 🐛 Describe the bug
When running RoBERTa question answering (and also other Hugging Face models) in CPU inference mode, I get an extra output returned by dynamo that did not happen in a previous…
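A minimal shape of the check that surfaces this kind of mismatch (toy module and names are my own, not from the report; `backend="eager"` avoids needing a compiler toolchain):

```python
import torch

class Toy(torch.nn.Module):
    def forward(self, x):
        # Returns a 2-tuple; the reported bug is an extra element appearing
        # in the compiled output that eager mode does not produce.
        return (x + 1, x * 2)

m = Toy()
x = torch.randn(3)
eager_out = m(x)
compiled_out = torch.compile(m, backend="eager")(x)
print(len(eager_out), len(compiled_out))  # should match
```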
-
### Your current environment
```text
PyTorch version: 2.4.0+cu121
Is debug build: False
CUDA used to build PyTorch: 12.1
ROCM used to build PyTorch: N/A
OS: Amazon Linux 2023.5.20240819 (x86_64)…
-
### Problem description
The following `pixi.toml` cannot be successfully resolved with `pixi=0.24.2`:
```
[project]
name = "pixi-faiss"
version = "0.1.0"
description = "Add a short desc…
-
## Description
The runtime won't load a converted bge-m3 model.
### Expected Behavior
No errors.
### Error Message
Exception in thread "main" java.lang.RuntimeException: data did not match any…
-
The `sdpa_ex` implementation of `torch.nn.functional.scaled_dot_product_attention` marks all output tensor proxies in the trace as being on `cuda`, but at runtime some of the outputs are on `cpu`.
Repro:
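For context, a standalone SDPA call (a sketch of mine, unrelated to the truncated repro below) shows the expected behavior: the output device follows the input device, which is what the trace should reflect:

```python
import torch
import torch.nn.functional as F

# (batch, heads, seq, head_dim) inputs on CPU; the output stays on the
# same device as the inputs, which is what sdpa_ex's trace gets wrong.
q = torch.randn(1, 2, 4, 8)
k = torch.randn(1, 2, 4, 8)
v = torch.randn(1, 2, 4, 8)
out = F.scaled_dot_product_attention(q, k, v)
print(out.shape, out.device)
```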
```python
i…