-
I'm encountering an issue with my Mali GPU. When I try to run inference, I get the following error:
```
{'319cbd94-148d-4767-80af-950aa5c20-11'}}). Next partition:
Partition(node_id='319cbd94-148d-47…
```
-
```python
import numpy as np
import jax.numpy as jnp
a = jnp.array([jnp.nan], dtype=jnp.float32)
np.testing.assert_array_equal(a, a) # No error
a = jnp.array([jnp.nan], dtype=jnp.bfloat16)
…
```
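For context, `np.testing.assert_array_equal` treats NaNs at matching positions as equal, which is why the float32 case passes; a minimal NumPy-only sketch of that behavior (independent of JAX and bfloat16):

```python
import numpy as np

# assert_array_equal considers NaNs equal when they appear at the
# same positions in both arrays, so self-comparison never fails.
a = np.array([np.nan, 1.0], dtype=np.float32)
np.testing.assert_array_equal(a, a)  # passes

# Plain elementwise comparison follows IEEE 754: NaN != NaN.
assert not (a == a).all()
```

The bfloat16 case above presumably fails because the NaN-aware path does not recognize the custom dtype.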
-
## Description
### Regression Test for Loss, Memory, Throughput
Comparisons of loss, memory, and throughput for Full-FT and PEFT:
- QLoRA: status quo on the switch of `torch_dtype=float16` (Referenc…
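For reference, a QLoRA setup along these lines typically pairs 4-bit quantization with a float16 compute dtype; a hypothetical sketch using the `transformers` `BitsAndBytesConfig` API (the model id and flags are placeholders, not taken from the report):

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Illustrative only: model id and flags are placeholders, not the
# configuration used in the benchmark above.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,  # the dtype switch under test
)
model = AutoModelForCausalLM.from_pretrained(
    "facebook/opt-125m",
    quantization_config=bnb_config,
    torch_dtype=torch.float16,
)
```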
-
{'verbose': True, 'with_cuda': True, 'extra_ldflags': ['-L/home/junlong/anaconda3/envs/xlstm/lib', '-lcublas'], 'extra_cflags': ['-DSLSTM_HIDDEN_SIZE=128', '-DSLSTM_BATCH_SIZE=8', '-DSLSTM_NUM_HEADS=4…
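These look like keyword arguments for `torch.utils.cpp_extension.load`; a hedged sketch of how such a dict is typically passed (the source files and library path below are placeholders, not the actual xlstm build inputs):

```python
from torch.utils.cpp_extension import load

# Placeholders: sources and library paths are illustrative only.
build_kwargs = {
    "verbose": True,
    "with_cuda": True,
    "extra_ldflags": ["-L/path/to/conda/env/lib", "-lcublas"],
    "extra_cflags": [
        "-DSLSTM_HIDDEN_SIZE=128",
        "-DSLSTM_BATCH_SIZE=8",
        "-DSLSTM_NUM_HEADS=4",
    ],
}
slstm = load(name="slstm", sources=["slstm.cc", "slstm_kernel.cu"], **build_kwargs)
```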
-
```
ComfyUI_windows\python_embeded\Lib\site-packages\torch\nn\functional.py", line 2573, in layer_norm
    return torch.layer_norm(input, normalized_shape, weight, bias, eps, torch.backends.cudnn.enabled…
```
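For reference, `torch.layer_norm` normalizes over the trailing `normalized_shape` dimensions and then applies an elementwise affine transform; a NumPy sketch of the same computation, useful for sanity-checking shapes when this call fails:

```python
import numpy as np

def layer_norm(x, weight, bias, eps=1e-5):
    """NumPy equivalent of layer norm over the last axis:
    normalize to zero mean / unit variance, then scale and shift."""
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)  # population variance, as in PyTorch
    return (x - mean) / np.sqrt(var + eps) * weight + bias

x = np.random.randn(2, 4, 8).astype(np.float32)
w = np.ones(8, dtype=np.float32)   # weight must match the normalized shape
b = np.zeros(8, dtype=np.float32)
y = layer_norm(x, w, b)
assert y.shape == x.shape
```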
-
**Describe the bug**
I can't use `ttnn.divide` the same way as `ttnn.multiply`: multiply works as expected, while divide crashes.
**To Reproduce**
```
import ttnn
import torch
import numpy as np
wit…
```
-
### 🚀 The feature, motivation and pitch
Consider implementing BFloat16 addition/subtraction operations with stochastic rounding, as it is critical for training large models with the BFloat16 optimi…
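For concreteness, stochastic rounding adds uniform noise to the discarded low mantissa bits before truncating, so a value rounds up with probability proportional to its residue and the rounding is unbiased in expectation; a NumPy sketch of float32 → bfloat16 stochastic rounding via bit manipulation (NaN/inf handling omitted for brevity):

```python
import numpy as np

def bf16_stochastic_round(x: np.ndarray, rng: np.random.Generator) -> np.ndarray:
    """Round float32 to bfloat16 precision with stochastic rounding.

    bfloat16 keeps the top 16 bits of a float32, so the low 16 bits are
    the rounding residue. Adding uniform noise in [0, 2**16) before
    truncating rounds up with probability residue / 2**16, which makes
    the rounding unbiased in expectation. NaN/inf handling is omitted.
    """
    bits = np.ascontiguousarray(x, dtype=np.float32).view(np.uint32)
    noise = rng.integers(0, 1 << 16, size=bits.shape, dtype=np.uint32)
    rounded = (bits + noise) & np.uint32(0xFFFF0000)
    return rounded.view(np.float32)

rng = np.random.default_rng(0)
x = np.full(10_000, 0.1, dtype=np.float32)
y = bf16_stochastic_round(x, rng)
assert (y.view(np.uint32) & 0xFFFF == 0).all()  # results are representable in bfloat16
assert abs(float(y.mean()) - 0.1) < 1e-3        # unbiased on average
```

This is what makes low-precision optimizer states viable: nearest-even rounding would lose every update smaller than half a bfloat16 ulp, while stochastic rounding preserves them on average.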
-
### Describe the issue:
We noticed that array + scalar promotion differs between NumPy 1.x and 2.x in an unexpected way. For example:
```python
>>> import numpy as np
>>> import ml_dtypes
>>> n…
```
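Although the truncated example involves `ml_dtypes`, the underlying change is easier to see with plain NumPy: under NEP 50 (NumPy 2.x) a Python scalar no longer upcasts an array based on its value, whereas NumPy 1.x used value-based promotion. A hedged illustration:

```python
import numpy as np

a = np.array([1.0], dtype=np.float32)
with np.errstate(over="ignore"):
    b = a + 1e300  # 1e300 does not fit in float32

# NumPy 1.x: value-based promotion -> float64, b[0] == 1e300
# NumPy 2.x (NEP 50): the Python float is "weak" -> float32, b[0] == inf
print(b.dtype, b[0])
```

Custom dtypes registered through `ml_dtypes` follow the same promotion machinery, so their array + scalar results shift between the two versions as well.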
-
### Is there an existing issue for this?
- [X] I have searched the existing issues
### Describe the bug
`TypeError("data type 'bfloat16' not understood")`
![Screenshot from 2024…
-
In short, we observed that `mixed_bfloat16` on TPU is slower than `float32` in our model benchmarks. Please refer to this [sheet](https://docs.google.com/spreadsheets/d/1TPwbe8p6eD61arkoIXQnPHf3rgFIDFUZCot…
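For readers reproducing the comparison: in Keras, bfloat16 mixed precision is a one-line global policy switch. A hedged sketch (the actual model code is in the linked sheet, not shown here):

```python
import tensorflow as tf

# Compute in bfloat16, keep variables in float32 (Keras mixed precision).
tf.keras.mixed_precision.set_global_policy("mixed_bfloat16")

# To compare against the float32 baseline, reset the policy:
# tf.keras.mixed_precision.set_global_policy("float32")
```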