-
```
In [2]: import torch
In [3]: def fn(x):
   ...:     x = x.amax(dim=0) * .1
   ...:     return x
   ...:
   ...: a = torch.tensor([-float('inf'), -float('inf'), -float('inf'), -float('inf')], de…
```
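For reference, the eager-mode semantics exercised here, a max reduction over all-`-inf` inputs followed by a scale, can be mimicked in plain Python without torch; this sketch is illustrative only, not the fuser's code path:

```python
import math

def fn_reference(xs):
    # amax over the only dimension, then scale by .1
    return max(xs) * 0.1

a = [-math.inf] * 4
print(fn_reference(a))  # the max of all -inf inputs is -inf, and -inf * 0.1 stays -inf
```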
-
### 🐛 Describe the bug
The bias+gelu backward kernel, which is really gelu_backward plus an outer reduction for the bias, is failing with the FP16 data type. This fusion is found in the backward pass of …
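As a point of reference, the mathematical shape of this fusion, a pointwise gelu_backward followed by a sum over the batch (outer) dimension for the bias gradient, can be sketched in plain Python. The function names below are illustrative, not the kernel's actual API:

```python
import math

def gelu_grad(x):
    # derivative of the exact (erf-based) GELU:
    # d/dx [0.5*x*(1+erf(x/sqrt(2)))] = 0.5*(1+erf(x/sqrt(2))) + x*phi(x)
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0))) + \
        x * math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def bias_gelu_backward(grad_out, inp):
    # grad_out, inp: [rows][cols]; pointwise gelu_backward first
    grad_in = [[g * gelu_grad(x) for g, x in zip(g_row, x_row)]
               for g_row, x_row in zip(grad_out, inp)]
    # the "outer reduction": sum over the batch dimension gives grad_bias
    grad_bias = [sum(col) for col in zip(*grad_in)]
    return grad_in, grad_bias
```

The precision issue in the report is specific to running this pattern in FP16; the sketch above only shows the operation structure.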
-
We implement support for T5 models.
In this first step, only kernels are replaced; no other optimizations are applied.
- Support T5 masks #66
- Support T5 kernels patterns #63
- Support T5 missing act…
-
Some of the shift/gather tests fail non-deterministically due to missing `syncthreads`.
For example, in `FusionMaxPoolingStrided`, there must be a `syncthreads` after loading to `T3`, but that's m…
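The failure mode is a read-after-write race on shared memory: one thread reads a slot another thread has not yet written. Its effect can be sketched in plain Python with `threading.Barrier` standing in for `__syncthreads()`; the buffer size and values below are made up for illustration:

```python
import threading

N = 4
shared = [None] * N                # stands in for a shared-memory buffer like T3
barrier = threading.Barrier(N)     # stands in for __syncthreads()
out = [None] * N

def worker(tid):
    shared[tid] = tid * 10         # each "thread" loads its own element into shared memory
    barrier.wait()                 # without this barrier, the read below may observe None
    out[tid] = shared[(tid + 1) % N]  # read a value written by a *different* thread

threads = [threading.Thread(target=worker, args=(i,)) for i in range(N)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(out)  # [10, 20, 30, 0]
```

Removing the `barrier.wait()` line makes the reads racy, which is exactly why the missing `syncthreads` shows up only non-deterministically.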
-
(and possibly other operations)
```
import torch

def fn(x):
    x = x.clamp(min=1.) * .1
    return x

a = torch.tensor([1., float('inf'), 2., float('inf')], device="cuda")
scripted = torch.jit.script(fn)
fn(a)
…
```
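For comparison, the eager-mode semantics being tested here, clamp to a minimum of 1 and scale by 0.1, can be reproduced in plain Python with `math.inf`; this sketch does not use the fuser at all and only documents the expected values:

```python
import math

def fn_reference(xs):
    # clamp(min=1.) then scale by .1, elementwise
    return [max(x, 1.0) * 0.1 for x in xs]

a = [1.0, math.inf, 2.0, math.inf]
print(fn_reference(a))  # infinities must survive the clamp: [0.1, inf, 0.2, inf]
```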
-
This is related to #1510, which is fixed by #1516 for gather. The same error can happen with shift, e.g.,
```
TEST_F(NVFuserTest, FusionValidateParallelizeShift_CUDA) {
  Fusion fusion;
  Fusion…
```
-
```
import torch

def fn(x):
    x = x.clamp(min=1.) * .1
    return x

x = torch.randn(4, device="cuda", dtype=torch.bfloat16)
scripted = torch.jit.script(fn)
fn(x)
with torch.jit.fuser("fuser2"):
    for _ in …
```
-
When I try to follow `examples/pytorch`, the Triton server crashes, i.e. exits with status code `-6`.
This is the log from the container:
```
I1004 17:32:10.693691 41 grpc_server.cc:4375] St…
```
-
This logic seems wrong (blaming myself):
https://github.com/csarofeen/pytorch/blob/devel/torch/csrc/jit/codegen/cuda/lower_predicate.cpp#L343-L348
```
filters.emplace_back([this](Expr* expr) {
…
```
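The general shape of this logic, a chain of predicate filters collected with `emplace_back` and applied to each expression, can be sketched in plain Python; the filter conditions and dictionary keys below are illustrative stand-ins, not the actual nvfuser API:

```python
# Hypothetical sketch of a filter chain like the one in lower_predicate.cpp.
# `filters` collects predicates; an expression needs a predicate only if
# every filter accepts it. All names here are made up for illustration.
filters = []

# e.g. only consider expressions that write to a tensor
filters.append(lambda expr: expr.get("writes_tensor", False))

# e.g. skip expressions that are already guarded elsewhere
filters.append(lambda expr: not expr.get("already_predicated", False))

def needs_predicate(expr):
    return all(f(expr) for f in filters)

print(needs_predicate({"writes_tensor": True}))                              # True
print(needs_predicate({"writes_tensor": True, "already_predicated": True}))  # False
```

A bug of the kind the report suggests would be a filter whose condition is inverted or too broad, so an expression that needs a predicate silently falls through the chain.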