bfloat16 Search Results

1000+ results for bfloat16

microsoft/onnxruntime #22837

[Build] v1.20.0 min GCC version can't build on armv8

### Describe the issue The GCC version is checked to be at least version 9 in the `CMakeLists.txt`. https://github.com/microsoft/onnxruntime/blob/c4fb724e810bb496165b9015c77f402727392933/cmake/CMakeL…

AxelZi updated 5 days ago
1
pytorch/pytorch #139964

Report issue for torch.nn.Linear when forwarding a 3-dim ten…

### 🐛 Describe the bug Dear all, We seemly found a bug in nn.linear forwarding, here is a minimal example: ```python # import import torch import time # Set input size, output size, an…

shockline updated 4 days ago
5
huggingface/transformers #34702

FSDP with SFTrainer: expected dtype float for `end` but got …

### System Info pytorch 2.2 and 2.4 are tested. transformers 4.46.2 4 * A6000 ada ### Who can help? @muellerzr ### Information - [X] The official example scripts - [ ] My own modified script…

asc-raynor updated 6 hours ago
5
tenstorrent/tt-metal #14570

[Bug Report] Tilize/untilize for FLOAT32 and INT32

**Describe the bug** Can't create a FLOAT32 (as well as INT32) tensor with a shape that require tiling. ``` libc++abi: terminating due to uncaught exception of type std::runtime_error: TT_FATAL @ /ho…

rfurko-tt updated 1 week ago
5
pytorch/ao #1264

ZeroPointDomain as an arguments

## Context Current ZeroPointDomain is bound to the layout https://github.com/pytorch/ao/blob/2ba1a61fe1244560325b5051b5d3c10044553be0/torchao/quantization/quant_api.py#L607-L615 Ideally, we shoul…

airMeng updated 4 days ago
7
triton-lang/triton #4469

bfloat16 of fused attention seems have bug

both [fused-attention](https://triton-lang.org/main/getting-started/tutorials/06-fused-attention.html#sphx-glr-getting-started-tutorials-06-fused-attention-py) and [flash-attn-og](https://github.com/D…

NonvolatileMemory updated 3 weeks ago
4
pytorch-labs/attention-gym #83

Weird warning on compile: `SingleProcess AUTOTUNE benchmarki…

Here is the warning: ``` AUTOTUNE flex_decoding(1x128x1x32x128, 1x128x8x128, 1x128x8x128, 1x1x128x32, 1x1x128x32, 1x1x1, 1x1x1x1, 1x1x1, 1x1x1x1) triton_flex_decoding_1 0.0172 ms 100.0% BLOCKS_AR…

ViktorooReps updated 3 hours ago
2
NVIDIA/TensorRT-LLM #1957

Model Performance Degraded when using BFLOAT16 LoRa Adapters

### System Info 2X L4 GPUs Docker Image: nvcr.io/nvidia/tritonserver:24.06-trtllm-python-py3 ### Who can help? @juney-nvidia @kaiyux ### Information - [ ] The official example sc…

TheCodeWrangler updated 1 day ago
9
milvus-io/milvus #37602

[Bug]: [new_indexes] Create new HNSW_PRQ index time out for …

### Is there an existing issue for this? - [X] I have searched the existing issues ### Environment ```markdown - Milvus version: master-20241112-b5b00355-amd64 - Deployment mode(standalone …

binbinlv updated 5 days ago
6
tenstorrent/tt-metal #15156

Error in untilize_with_unpadding in Debug mode.

**Describe the bug** The code passes and gives correct output on Release mode but fails for Debug mode with the following error. Error here. https://github.com/tenstorrent/tt-metal/blob/8901511f737c…

shwetankTT updated 3 hours ago
3

