bmm Search Results - Githubissues

1000+ results for bmm

numpy/numpy #22604

how is numpy's `einsum` so much slower than other libraries?

### Describe the issue: `np.einsum` is ~20x slower than other libraries. ### Reproduce the code example: ```python import numpy as np x = np.random.uniform(size=(1000, 1, 500)) y = np.random.uni…

mattbarrett98 updated 1 year ago
18
atilaneves/dpp #349

dpp generated file cause `gdc` link error: multiple definiti…

dpp generated `liblfdsd.d` file cause `gdc` link error: multiple definition of `_D8liblfdsd14threadConsumerFOCQBc__T9queue_ummTiZQnZv'; /tmp/ccmsmGbO.o:liblfdsd.d:(.text+0x8a00) Both LDC and DMD wo…

mw66 updated 3 months ago
1
pytorch/pytorch #81307

[Prims+NvFuser] Non-fusible ops Tracker

Following is the complete list of ops that appear in the torchbench + huggingface + TIMM models, but are not included in the nvFuser fusion group. They are not included due to one (or more) of the …

SherlockNoMad updated 2 years ago
12
tenstorrent/tt-metal #12887

didt FF1 without GELU hang pcie alive

Based on FW80.10.4 bundle baseline didt testing [results](https://docs.google.com/spreadsheets/d/10uWtBEkLLEM-h5TuuGQ6HW8AjhXwSjFV-cK3iVFYYIU/edit?gid=1118664107#gid=1118664107), this issue will be us…

skrsmanovicTT updated 3 days ago
3
pytorch/pytorch #106991

Add cutlass as an alternative backend of PT2 Inductor

### 🚀 The feature, motivation and pitch ### Motivation [Cutlass](https://github.com/NVIDIA/cutlass) is an efficient template library for compute-heavy GPU operations like Gemm, Conv and others. It…

ipiszy updated 6 months ago
8
pytorch/pytorch #78109

Doesn't work when register hook to torch.nn.MultiheadAttenti…

### 🐛 Describe the bug I am not sure whether this issue should be a bug or a new feature request. The problem is: when you register a hook to out_proj under MultiheadAttention, it will never be call…

B06901052 updated 5 months ago
10
pytorch/pytorch #106951

stride of gradient is not same as the corresponding tensor

### 🐛 Describe the bug When I tried to use torch.optim.Adam with fused=True, I got the following error: ```text File "/home/weixu/venvs/working/lib/python3.10/site-packages/torch/optim/adam…

emailweixu updated 1 year ago
2
facebookresearch/fairseq #3892

Simultaneous (MMA/waitk) inference bug in p_choose

## 🐛 Bug In the function [p_choose](https://github.com/pytorch/fairseq/blob/f6abcc2a67328bee8b15c596bb626ce2d720aae6/examples/simultaneous_translation/modules/monotonic_multihead_attention.py#L152)…

George0828Zhang updated 2 years ago
9
microsoft/KEAR #1

Performance on other PLM

Hello, Amazing work! Did you ever try other PLM(Bert,Roberta...) as your backbone model? Or did they perform not well in your preliminary experiments? Thanks so much

Hannibal046 updated 2 years ago
5
nod-ai/SHARK-ModelDev #214

how to convert any LLama model to MLIR format?

I have successfully executed the shark project using the llama large language model, and it works well. The model was sourced from shark_tank in MLIR format. I would like to run another large language…

louwangzhiyuY updated 10 months ago
6
