-
```
// Instantiate the function template for different HEADDIMS.
// For now, only half_t is supported. TF32 is WIP.
if (kHeadSize == 64) {…
```
-
### Describe the bug
xFormers fails when the attention mask's last dimension (i.e. the key's sequence length) is not a multiple of 8 under bfloat16. This seems to be because xformer ne…
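A common workaround is to pad the mask's last dimension up to the next multiple of 8 before passing it in. A minimal sketch, assuming PyTorch; the helper name and the `-inf` fill value are my own illustration, not part of the xFormers API:

```python
import torch
import torch.nn.functional as F

def pad_mask_to_multiple_of_8(mask: torch.Tensor) -> torch.Tensor:
    # Pad the last dimension (key sequence length) up to a multiple of 8,
    # filling with -inf so padded key positions get zero attention weight.
    k_len = mask.shape[-1]
    pad = (-k_len) % 8
    if pad == 0:
        return mask
    return F.pad(mask, (0, pad), value=float("-inf"))

# Key length 13 is not a multiple of 8, so the mask is padded to 16.
m = torch.zeros(1, 1, 16, 13, dtype=torch.bfloat16)
padded = pad_mask_to_multiple_of_8(m)
print(padded.shape)  # torch.Size([1, 1, 16, 16])
```

The query dimension is left untouched; only the key dimension is padded, and the padded columns are masked out.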
-
Regarding line 210 of STLlama.py:
region_select_out = STE_out[:, :, region_start[0]:region_end[0], :].to(torch.bfloat16)
Why index with region_start[0] and region_end[0]? Is the selected region the same for every sample in the batch?
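To make the question concrete, here is a small sketch of the difference between the quoted line (which implicitly assumes every sample shares the first sample's region) and per-sample slicing. All shapes and tensor names below are illustrative, not taken from STLlama.py:

```python
import torch

B, H, S, D = 2, 4, 10, 8
STE_out = torch.randn(B, H, S, D)
region_start = torch.tensor([2, 3])  # per-sample region bounds
region_end = torch.tensor([5, 7])

# The quoted style: only the first sample's bounds are used, so this is
# correct only if all samples in the batch share the same region.
shared = STE_out[:, :, region_start[0]:region_end[0], :]

# If regions differ per sample, each sample must be sliced individually
# (the results can have different lengths and cannot be stacked directly).
per_sample = [STE_out[b, :, region_start[b]:region_end[b], :] for b in range(B)]

print(shared.shape)                      # torch.Size([2, 4, 3, 8])
print([tuple(t.shape) for t in per_sample])  # [(4, 3, 8), (4, 4, 8)]
```

If the dataset guarantees identical regions across a batch, indexing with `[0]` is fine; otherwise the per-sample loop (or padding to a common length) is needed.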
-
### 🐛 Describe the bug
Hi! I am encountering the following error when using `torch.distributed.all_reduce` on bfloat16 tensors of a certain size using NCCL: `RuntimeError: CUDA error: misaligned ad…
-
Error indexing files: MPS BFloat16 is only supported on MacOS 14 or newer
I am using a MacBook Pro M1.
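One way to avoid this crash is to pick the dtype based on the macOS version before allocating MPS tensors. A hedged sketch, assuming PyTorch; the helper is my own illustration, not an API of torch or the indexing tool:

```python
import platform
import torch

def pick_mps_dtype() -> torch.dtype:
    # MPS bfloat16 requires macOS 14 or newer; on older releases fall
    # back to float16, which MPS has supported for much longer.
    if platform.system() == "Darwin":
        major_str = platform.mac_ver()[0].split(".")[0]
        major = int(major_str) if major_str else 0
        return torch.bfloat16 if major >= 14 else torch.float16
    # Non-macOS backends (CUDA, CPU) generally accept bfloat16.
    return torch.bfloat16

dtype = pick_mps_dtype()
print(dtype)
```

On an M1 machine running macOS 13 or earlier this returns `torch.float16`, sidestepping the "MPS BFloat16 is only supported on MacOS 14 or newer" error at the cost of a narrower exponent range.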
-
Float16 or bfloat16 support for loads and stores is missing. I do not expect SIMD extensions to actually do operations on them in native format in hardware (but it would be nice to expose that and emu…
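Emulating the bfloat16 case on top of float32 is straightforward, because bfloat16 is exactly the top 16 bits of an IEEE-754 float32: a "store" truncates (ideally with rounding) and a "load" widens by appending zero bits. A NumPy sketch of that emulation, purely illustrative of the bit layout rather than of any SIMD intrinsic:

```python
import numpy as np

def f32_to_bf16_bits(x: np.ndarray) -> np.ndarray:
    # "Store": keep the top 16 bits of the float32 encoding, with
    # round-to-nearest-even applied to the 16 bits being dropped.
    bits = x.astype(np.float32).view(np.uint32)
    rounding = ((bits >> 16) & 1) + 0x7FFF
    return ((bits + rounding) >> 16).astype(np.uint16)

def bf16_bits_to_f32(b: np.ndarray) -> np.ndarray:
    # "Load": widen back to float32 by appending 16 zero bits.
    return (b.astype(np.uint32) << 16).view(np.float32)

x = np.array([1.0, 3.14159], dtype=np.float32)
rt = bf16_bits_to_f32(f32_to_bf16_bits(x))
print(rt)  # [1.0, 3.140625] — 1.0 is exact, pi loses low mantissa bits
```

Float16 is messier to emulate (different exponent width, subnormals), which is why hardware load/store conversion support matters more there.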
-
{'verbose': True, 'with_cuda': True, 'extra_ldflags': ['-L/home/junlong/anaconda3/envs/xlstm/lib', '-lcublas'], 'extra_cflags': ['-DSLSTM_HIDDEN_SIZE=128', '-DSLSTM_BATCH_SIZE=8', '-DSLSTM_NUM_HEADS=4…
-
```
python3 build.py --model_version v2_7b \
--model_dir ./model_files/Baichuan2-7B-Chat \
--dtype float16 \
--use_gemm_plugin float16 \
--use_gpt_attention_plugin float16 \
…
```
-
**Describe the bug**
I can't use `ttnn.divide` the same way as `ttnn.multiply`. Multiply works as expected; divide crashes.
**To Reproduce**
```
import ttnn
import torch
import numpy as np
wit…
```
-
I have a Mistral 7B model with fine-tuned LoRA weights in bfloat16.
I ran into issues when attempting to use my adapters, which were compiled for bfloat16.
Running the following command …
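A frequent source of such issues is a dtype mismatch between the base weights and the LoRA delta when the adapter is merged or applied. A minimal sketch of the cast that reconciles them; the tensor names and shapes are illustrative, not the actual Mistral/LoRA loading code:

```python
import torch

# Base model weights served in float16, adapter trained in bfloat16.
base_w = torch.randn(8, 8, dtype=torch.float16)
lora_a = torch.randn(8, 2, dtype=torch.bfloat16)
lora_b = torch.randn(2, 8, dtype=torch.bfloat16)

# Compute the low-rank delta in the adapter's dtype, then cast it to the
# base dtype before merging; mixing bf16 and fp16 directly would error.
delta = (lora_a @ lora_b).to(base_w.dtype)
merged = base_w + delta
print(merged.dtype)  # torch.float16
```

Equivalently, the base model can be loaded in bfloat16 to match the adapter; the essential point is that both sides agree on one dtype before the weights are combined.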