-
```
// Instantiate the function template for different HEADDIMS.
// For now, only half_t is supported. TF32 is WIP.
if (kHeadSize == 64) {…
```
-
### Describe the bug
xFormers fails when the attention mask's last dimension (i.e. the key's sequence length) is not a multiple of 8 under bfloat16. This seems to be because xformer ne…
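A common workaround is to pad the mask's last dimension up to the next multiple of 8 before passing it in. A minimal sketch, assuming PyTorch; the helper name and the `-inf` fill value are my own illustration, not part of the xFormers API:

```python
import torch
import torch.nn.functional as F

def pad_mask_to_multiple_of_8(mask: torch.Tensor) -> torch.Tensor:
    # Pad the last dimension (key sequence length) up to a multiple of 8,
    # filling with -inf so padded key positions get zero attention weight.
    k_len = mask.shape[-1]
    pad = (-k_len) % 8
    if pad == 0:
        return mask
    return F.pad(mask, (0, pad), value=float("-inf"))

# Key length 13 is not a multiple of 8, so the mask is padded to 16.
m = torch.zeros(1, 1, 16, 13, dtype=torch.bfloat16)
padded = pad_mask_to_multiple_of_8(m)
print(padded.shape)  # torch.Size([1, 1, 16, 16])
```

The query dimension is left untouched; only the key dimension is padded, and the padded columns are masked out.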
-
Regarding line 210 of STLlama.py:
region_select_out = STE_out[:, :, region_start[0]:region_end[0], :].to(torch.bfloat16)
Why index with region_start[0] and region_end[0]? Is the selected region the same for every sample in the batch?
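To make the question concrete, here is a small sketch of the difference between the quoted line (which implicitly assumes every sample shares the first sample's region) and per-sample slicing. All shapes and tensor names below are illustrative, not taken from STLlama.py:

```python
import torch

B, H, S, D = 2, 4, 10, 8
STE_out = torch.randn(B, H, S, D)
region_start = torch.tensor([2, 3])  # per-sample region bounds
region_end = torch.tensor([5, 7])

# The quoted style: only the first sample's bounds are used, so this is
# correct only if all samples in the batch share the same region.
shared = STE_out[:, :, region_start[0]:region_end[0], :]

# If regions differ per sample, each sample must be sliced individually
# (the results can have different lengths and cannot be stacked directly).
per_sample = [STE_out[b, :, region_start[b]:region_end[b], :] for b in range(B)]

print(shared.shape)                      # torch.Size([2, 4, 3, 8])
print([tuple(t.shape) for t in per_sample])  # [(4, 3, 8), (4, 4, 8)]
```

If the dataset guarantees identical regions across a batch, indexing with `[0]` is fine; otherwise the per-sample loop (or padding to a common length) is needed.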
-
### 🐛 Describe the bug
Hi! I am encountering the following error when using `torch.distributed.all_reduce` on bfloat16 tensors of a certain size using NCCL: `RuntimeError: CUDA error: misaligned ad…
-
Error indexing files: MPS BFloat16 is only supported on MacOS 14 or newer
I am using a MacBook Pro M1.
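One way to avoid this crash is to pick the dtype based on the macOS version before allocating MPS tensors. A hedged sketch, assuming PyTorch; the helper is my own illustration, not an API of torch or the indexing tool:

```python
import platform
import torch

def pick_mps_dtype() -> torch.dtype:
    # MPS bfloat16 requires macOS 14 or newer; on older releases fall
    # back to float16, which MPS has supported for much longer.
    if platform.system() == "Darwin":
        major_str = platform.mac_ver()[0].split(".")[0]
        major = int(major_str) if major_str else 0
        return torch.bfloat16 if major >= 14 else torch.float16
    # Non-macOS backends (CUDA, CPU) generally accept bfloat16.
    return torch.bfloat16

dtype = pick_mps_dtype()
print(dtype)
```

On an M1 machine running macOS 13 or earlier this returns `torch.float16`, sidestepping the "MPS BFloat16 is only supported on MacOS 14 or newer" error at the cost of a narrower exponent range.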
-
Float16 or bfloat16 support for loads and stores is missing. I do not expect SIMD extensions to actually do operations on them in native format in hardware (but it would be nice to expose that and emu…
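Emulating the bfloat16 case on top of float32 is straightforward, because bfloat16 is exactly the top 16 bits of an IEEE-754 float32: a "store" truncates (ideally with rounding) and a "load" widens by appending zero bits. A NumPy sketch of that emulation, purely illustrative of the bit layout rather than of any SIMD intrinsic:

```python
import numpy as np

def f32_to_bf16_bits(x: np.ndarray) -> np.ndarray:
    # "Store": keep the top 16 bits of the float32 encoding, with
    # round-to-nearest-even applied to the 16 bits being dropped.
    bits = x.astype(np.float32).view(np.uint32)
    rounding = ((bits >> 16) & 1) + 0x7FFF
    return ((bits + rounding) >> 16).astype(np.uint16)

def bf16_bits_to_f32(b: np.ndarray) -> np.ndarray:
    # "Load": widen back to float32 by appending 16 zero bits.
    return (b.astype(np.uint32) << 16).view(np.float32)

x = np.array([1.0, 3.14159], dtype=np.float32)
rt = bf16_bits_to_f32(f32_to_bf16_bits(x))
print(rt)  # [1.0, 3.140625] — 1.0 is exact, pi loses low mantissa bits
```

Float16 is messier to emulate (different exponent width, subnormals), which is why hardware load/store conversion support matters more there.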
-
{'verbose': True, 'with_cuda': True, 'extra_ldflags': ['-L/home/junlong/anaconda3/envs/xlstm/lib', '-lcublas'], 'extra_cflags': ['-DSLSTM_HIDDEN_SIZE=128', '-DSLSTM_BATCH_SIZE=8', '-DSLSTM_NUM_HEADS=4…
-
```
python3 build.py --model_version v2_7b \
--model_dir ./model_files/Baichuan2-7B-Chat \
--dtype float16 \
--use_gemm_plugin float16 \
--use_gpt_attention_plugin float16 \
…
```
-
**Describe the bug**
I can't use `ttnn.divide` the same way as `ttnn.multiply`. Multiply works as expected; divide crashes.
**To Reproduce**
```
import ttnn
import torch
import numpy as np
wit…
```
-
I have a Mistral 7B model with fine-tuned LoRA weights in bfloat16.
I ran into issues when attempting to use my adapters, which were compiled for bfloat16.
Running the following command …
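A frequent source of such issues is a dtype mismatch between the base weights and the LoRA delta when the adapter is merged or applied. A minimal sketch of the cast that reconciles them; the tensor names and shapes are illustrative, not the actual Mistral/LoRA loading code:

```python
import torch

# Base model weights served in float16, adapter trained in bfloat16.
base_w = torch.randn(8, 8, dtype=torch.float16)
lora_a = torch.randn(8, 2, dtype=torch.bfloat16)
lora_b = torch.randn(2, 8, dtype=torch.bfloat16)

# Compute the low-rank delta in the adapter's dtype, then cast it to the
# base dtype before merging; mixing bf16 and fp16 directly would error.
delta = (lora_a @ lora_b).to(base_w.dtype)
merged = base_w + delta
print(merged.dtype)  # torch.float16
```

Equivalently, the base model can be loaded in bfloat16 to match the adapter; the essential point is that both sides agree on one dtype before the weights are combined.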