-
## 🐛 Bug
This is a demo of my model.
When I run inference with the model built by the code below, an error is reported once the loop has run three times. What is the reason?
**code:**
``…
-
We should have a validation pass to ensure that the input program makes sense. For example, the following program:
```python
from nvfuser import FusionDefinition, DataType
def nvfuser_fusion_id…
-
Group norm is calculated as:
```
x0 = [N, C, H, W]
x1 = x0.cast(fp32).reshape(N, C, H, W) --> (N, G, C/G, H, W)
x2 = x1 / x1.sum(C/G, H, W)
x3 = x2.reshape(N, G, C/G, H, W) --> (N, C, H, W)
x4…
-
FWIW, this test will be less fragile if you have a way to convert nvfuser warnings into true errors. Otherwise, if someone adds an unrelated warning that this test happens to exercise, this te…
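
One way to sketch the "warnings become errors" idea, assuming the warnings in question reach Python's standard `warnings` module (not confirmed here for nvfuser, whose warnings may be emitted on the C++ side). The helper name `run_with_warnings_as_errors` and the `noisy` function are hypothetical and exist only for illustration:

```python
import warnings


def run_with_warnings_as_errors(fn, *args, **kwargs):
    """Call fn and raise any warning it emits as an exception.

    Hypothetical helper: it only affects warnings routed through
    Python's `warnings` module, which is an assumption for nvfuser.
    """
    with warnings.catch_warnings():
        # Inside this block every warning is raised instead of printed.
        warnings.simplefilter("error")
        return fn(*args, **kwargs)


def noisy():
    # Stand-in for code under test that emits an unrelated warning.
    warnings.warn("unrelated warning", UserWarning)
    return 42
```

With a wrapper like this, a test would fail on any warning rather than depending on which warnings happen to be printed; the filter is restored when the `with` block exits, so other tests are unaffected.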
-
### 🐛 Describe the bug
It takes about a minute to run this function for the first time. It takes only a second if it's running on a version of PyTorch built from source.
To reproduce run the cod…
-
### 🐛 Describe the bug
Within the following package:
https://download.pytorch.org/libtorch/cu118/libtorch-cxx11-abi-shared-with-deps-2.0.0%2Bcu118.zip
`libnvrtc-builtins.so.11.8` is not packaged …
-
`(tortoise) C:\Users\ash\Downloads\tortoise-tts>python tortoise/do_tts.py --text "I'm going to speak this" --voice random --preset fast
Traceback (most recent call last):
File "C:\Users\ash\Downlo…
-
It runs OK in GitHub CI, which runs with V100x4 and A100x4, but fails consistently on H100.
@csarofeen and I managed to reproduce this on `viking-prod-231` in partition `viking-prod-pjnl`.
```
$ g…
-
## 🐛 Bug
Gemma-7b with FSDP zero3, trained on 2 nodes with 8 H100s each, gives an OOM error at BS = 2 for both `thunder_cudnn` and `thunder_inductor_cat_cudnn`. The same configuration works for `inducto…
-
## Description
Running the code specified below, I get a number of warnings at the beginning, apparently harmless.
> Training: 0% |= | Accuracy: _, Softmax…