-
By running this simple test in Python:
```
import pytest
import torch
import ttnn
@pytest.mark.parametrize("dims", [(32, 32), (64, 64)])
def test_add_with_block_sharding(device, dims):
    torch.manu…
```
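For readers trying to reproduce this, a complete version of such a test would usually convert both inputs to tiled, block-sharded ttnn tensors, run ttnn.add, and compare against torch. The sketch below is a hedged reconstruction, not the original snippet: the `device` fixture is assumed to be ttnn's standard pytest fixture, and the `ttnn.create_sharded_memory_config` arguments may differ across tt-metal versions.
```
import pytest
import torch
import ttnn


@pytest.mark.parametrize("dims", [(32, 32), (64, 64)])
def test_add_with_block_sharding_sketch(device, dims):
    torch.manual_seed(0)
    a = torch.randn(dims, dtype=torch.bfloat16)
    b = torch.randn(dims, dtype=torch.bfloat16)

    # Assumed block-sharding setup; check your tt-metal version for the exact API.
    sharded_config = ttnn.create_sharded_memory_config(
        dims,
        core_grid=ttnn.CoreGrid(y=1, x=1),
        strategy=ttnn.ShardStrategy.BLOCK,
    )

    tt_a = ttnn.from_torch(a, layout=ttnn.TILE_LAYOUT, device=device,
                           memory_config=sharded_config)
    tt_b = ttnn.from_torch(b, layout=ttnn.TILE_LAYOUT, device=device,
                           memory_config=sharded_config)

    tt_out = ttnn.add(tt_a, tt_b)
    out = ttnn.to_torch(tt_out)

    # Loose tolerance because of bfloat16 rounding.
    assert torch.allclose(out, a + b, atol=1e-2, rtol=1e-2)
```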
-
Hi, I am fine-tuning the Llama 3.1 8B model using the dolomite engine. I am getting the following error:
```
[rank0]: Traceback (most recent call last):
[rank0]: File "", line 198, in _run_module_…
```
-
Hi, I'm testing the local install & interface Dr. Furkan Gözükara made for SUPIR, and it's working really well on a 4090, but I get the following error when I try to use it on an RTX 8000.
RuntimeE…
-
Hi Triton team,
I'm working on getting bfloat16 atomic add support in Inductor to [support the backward of indirect loads](https://github.com/pytorch/pytorch/issues/137425). I'm looking at generati…
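For context, when native bfloat16 atomics are unavailable, a common fallback is to accumulate into a float32 buffer with `tl.atomic_add` and cast back afterwards. The scatter-add kernel below is a minimal sketch of that workaround (names, shapes, and the wrapper are illustrative), not the native bf16 atomic support the linked issue asks for.
```
import torch
import triton
import triton.language as tl


@triton.jit
def scatter_add_fp32_kernel(out_ptr, idx_ptr, val_ptr, n, BLOCK: tl.constexpr):
    pid = tl.program_id(0)
    offs = pid * BLOCK + tl.arange(0, BLOCK)
    mask = offs < n
    idx = tl.load(idx_ptr + offs, mask=mask, other=0)
    # Upcast bf16 values to fp32 so tl.atomic_add targets a widely supported dtype.
    val = tl.load(val_ptr + offs, mask=mask, other=0.0).to(tl.float32)
    tl.atomic_add(out_ptr + idx, val, mask=mask)


def scatter_add_bf16(out_bf16, idx, val_bf16, BLOCK=1024):
    # Accumulate into an fp32 copy, then cast back to bf16 once at the end.
    acc = out_bf16.to(torch.float32)
    n = idx.numel()
    grid = (triton.cdiv(n, BLOCK),)
    scatter_add_fp32_kernel[grid](acc, idx, val_bf16, n, BLOCK=BLOCK)
    return acc.to(torch.bfloat16)
```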
-
By default, torchtitan uses FSDP2 mixed precision (param_dtype=bfloat16, reduce_dtype=float32).
For low-precision dtypes (float8 and int8), it's natural to compare the loss curve with bfloat16 and see how…
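For reference, that default corresponds roughly to the FSDP2 setup sketched below (assuming torch.distributed is already initialized; the import path of `fully_shard`/`MixedPrecisionPolicy` has moved between torch releases, and the sharding loop is a placeholder rather than torchtitan's actual parallelization code):
```
import torch
from torch.distributed.fsdp import fully_shard, MixedPrecisionPolicy

# bf16 for parameter compute/communication, fp32 for gradient reduction,
# matching torchtitan's default mixed-precision configuration.
mp_policy = MixedPrecisionPolicy(
    param_dtype=torch.bfloat16,
    reduce_dtype=torch.float32,
)


def shard_model(model: torch.nn.Module) -> torch.nn.Module:
    # Shard each top-level block first, then the root module.
    for block in model.children():
        fully_shard(block, mp_policy=mp_policy)
    fully_shard(model, mp_policy=mp_policy)
    return model
```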
-
I'm currently working with the code and I'm having some trouble overriding forward where the self.module.image_newline attribute is set or initialized in the model.
I've traced the model through the …
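As a generic starting point, it can help to walk the module tree and print every submodule that registers a parameter or buffer named image_newline before deciding which forward to override. The helper below is only an illustrative sketch; `model` stands in for whatever module the post is tracing.
```
import torch


def find_attribute(model: torch.nn.Module, name: str = "image_newline"):
    # Print every submodule that registers a parameter or buffer with this name.
    for module_name, module in model.named_modules():
        tensors = list(module.named_parameters(recurse=False))
        tensors += list(module.named_buffers(recurse=False))
        for attr_name, value in tensors:
            if name in attr_name:
                print(f"{module_name or '<root>'}.{attr_name}: "
                      f"shape={tuple(value.shape)}, dtype={value.dtype}")
```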
-
## 🐛 Bug
## To Reproduce
Here are two scripts for the experiment:
test1.py
```
import torch
import torch_xla.core.xla_model as xm
import math
random_k = torch.randn((100, 100), dtype=…
```
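For readers unfamiliar with the setup, both scripts presumably build on the usual torch_xla pattern of creating tensors, moving them to the XLA device, and forcing execution with xm.mark_step(). The sketch below only shows that boilerplate with placeholder shapes and ops; it is not the actual repro.
```
import torch
import torch_xla.core.xla_model as xm

device = xm.xla_device()

# Placeholder computation on the XLA device; the real scripts differ.
random_k = torch.randn((100, 100), dtype=torch.float32).to(device)
result = random_k @ random_k.t()

# Materialize the lazily recorded graph on the device.
xm.mark_step()
print(result.cpu().sum())
```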
-
I am running torchao 0.5 and torch '2.5.0a0+b465a5843b.nv24.09' on an NVIDIA A6000 Ada card (sm89), which supports FP8.
I ran the generate.py code from the benchmark:
python generate.py --c…
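For context, a minimal capability check plus the torchao quantization entry point looks roughly like the sketch below. The model and the use of `float8_weight_only` are assumptions for illustration; the exact config name may differ between torchao releases.
```
import torch
from torchao.quantization import quantize_, float8_weight_only

# sm89 (Ada) and sm90 (Hopper) expose hardware FP8; older cards do not.
major, minor = torch.cuda.get_device_capability()
assert (major, minor) >= (8, 9), "FP8 kernels require sm89 or newer"

# Placeholder model; the benchmark uses a full LLM instead.
model = torch.nn.Sequential(torch.nn.Linear(4096, 4096)).cuda().bfloat16()

# In-place swap of eligible linear layers to FP8 weight-only quantization.
quantize_(model, float8_weight_only())
```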
-
```
2023-10-23 10:49:30,409 WARNING: logs/HiFiSVC doesn't exist yet!
Global seed set to 594461
Using bfloat16 Automatic Mixed Precision (AMP)
GPU available: True (cuda), used: True
TPU available:…
```
-
## Description
### Regression Test for Loss, Memory, Throughput
Comparisons of loss, memory, and throughput for Full-FT and PEFT:
- QLoRA: status quo on the switch of `torch_dtype=float16` (Referenc…
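As a reference point for the `torch_dtype=float16` switch mentioned above, QLoRA model loading with transformers/bitsandbytes typically looks like the sketch below; the model id and quantization settings are placeholders, not the repository's actual configuration.
```
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# 4-bit NF4 quantization with fp16 compute, matching torch_dtype=float16.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.1-8B",       # placeholder model id
    quantization_config=bnb_config,
    torch_dtype=torch.float16,       # the dtype switch under test
)
```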