-
### 🐛 Describe the bug
Running the following test:
```
python -m unittest test_jit_cuda_fuser.TestCudaFuser.test_native_layer_norm_bfloat -v
```
Results in the error:
```
==============…
-
## 🐛 Bug
Conversation from Horace:
```
hmmm... I think it has something to do with the previous sort error I mentioned.
Horace He 3 days ago
I'm writing a test minimizer, and that error st…
-
## 🐛 Bug
A JITed model runs ~14x slower for its first ~20 evaluations on pytorch nightly, relative to stable torch 1.7.1
## To Reproduce
Run the following code in pytorch:
```
import…
-
### 🐛 Expand indexing error causes incorrect results
It seems we are indexing into the expanded tensor incorrectly. The repro works correctly when the expand is dropped and we treat the size-1 axis…
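To make the failure mode concrete, here is a hedged, stdlib-only sketch (not the actual repro, which is truncated above): `expand` on a size-1 axis produces a view whose stride on that axis is 0, so every index along it must alias the same storage; indexing math that instead treats the axis as materialized computes out-of-range offsets. The names `linear_offset`, `storage`, and the shapes are illustrative assumptions, not from the issue.

```python
# Illustrative sketch: why a size-1 expanded axis needs stride 0.

def linear_offset(indices, strides):
    """Map a multi-dimensional index to a flat storage offset."""
    return sum(i * s for i, s in zip(indices, strides))

# Storage for a (1, 3) tensor, row-major strides (3, 1).
storage = [10.0, 20.0, 30.0]
base_strides = (3, 1)

# Expanding to (4, 3) keeps the same storage; the size-1 axis gets
# stride 0, so all 4 "rows" alias row 0.
expanded_strides = (0, 1)

# Correct indexing: every expanded row reads the same three values.
rows = [[storage[linear_offset((r, c), expanded_strides)] for c in range(3)]
        for r in range(4)]

# Incorrect indexing (keeping the pre-expand stride of 3 on the
# broadcast axis) computes an offset past the end of storage.
bad_offset = linear_offset((1, 0), base_strides)  # 3, but storage has len 3
```

Dropping the expand and keeping the axis at size 1, as the issue notes, sidesteps exactly this: no index along the broadcast axis ever exceeds 0.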
-
### 🚀 The feature, motivation and pitch
Not easily knowing when something is fused has bitten the e3nn team quite a bit I think. Turns out control flow prevented all their [ridiculously fuseable co…
-
### 🐛 Describe the bug
PR #81785 seems to have significantly increased start-up overhead for timm models and has timed out our internal CI.
Overall throughput is about the same, but end-2-end test…
-
If compute-at maps don't capture the full thread-binding relationships, indexing is incorrect, because that information isn't otherwise available in the indexing math. For example, if we have `blockIdx…
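A hedged, stdlib-only sketch of the failure mode (not NVFuser code; `GRID`, `BLOCK`, and the index helpers are illustrative assumptions): if the producer's index is derived without knowing that the consumer's loop is bound to `blockIdx.x`, the `blockIdx` term never enters the index, and every block reads block 0's data.

```python
# Illustrative sketch: losing a blockIdx binding in the indexing math.

GRID = 4   # number of "blocks" (blockIdx.x values)
BLOCK = 8  # "threads" per block (threadIdx.x values)

def consumer_index(block_idx, thread_idx):
    # Consumer loop split and bound: i = blockIdx.x * BLOCK + threadIdx.x
    return block_idx * BLOCK + thread_idx

def producer_index_with_binding(block_idx, thread_idx):
    # Correct: the indexing math carries the blockIdx relationship.
    return block_idx * BLOCK + thread_idx

def producer_index_missing_binding(block_idx, thread_idx):
    # Broken: the blockIdx binding was lost from the map, so the
    # blockIdx term is absent and all blocks alias block 0.
    return thread_idx

ok = all(consumer_index(b, t) == producer_index_with_binding(b, t)
         for b in range(GRID) for t in range(BLOCK))
mismatches = [(b, t) for b in range(GRID) for t in range(BLOCK)
              if consumer_index(b, t) != producer_index_missing_binding(b, t)]
```

Only block 0 agrees in the broken case; every other block's indices are off by `blockIdx.x * BLOCK`.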
-
Vectorized loads from SMEM (shared memory) seem to have a problem.
```
TEST_F(NVFuserTest, FusionSmemVectorize_CUDA) {
Fusion fusion;
FusionGuard fg(&fusion);
auto tv0 = makeContigTensor(1);
fusion.addI…
-
## 🚀 Feature
Forwarding feature request from Horace@functorch
```
def f(x):
    x = x * torch.tensor(1.0)
    x = x * torch.tensor(1.0)
    return x
```
Runs fine in eager/TS, but NVFuser fail…
-
This seems to be wrong:
https://github.com/csarofeen/pytorch/blob/devel/torch/csrc/jit/codegen/cuda/runtime/warp.cu#L47
```
if (read_write_pred && is_warp_head) {
shared_mem[smem_offset…