This test uses pseudo-random numbers to generate index tensors, which can result in a launch that requires grid dimensions that are too large. For instance, this error was reported today:
C++ exception with description " INTERNAL ASSERT FAILED at "/opt/pytorch/nvfuser/csrc/runtime/executor_params.cpp":41, please report a bug with repro script to NVFuser at https://github.com/NVIDIA/Fuser/issues. Invalid number of blocks in y direction: 69923
Exception raised from assertValid at /opt/pytorch/nvfuser/csrc/runtime/executor_params.cpp:41 (most recent call first):
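For context, CUDA limits the y and z grid dimensions to 65535 blocks (and x to 2^31 - 1) on compute capability 3.0 and later, so 69923 blocks in y cannot be launched. Below is a minimal sketch of that kind of bounds check. The helper name `gridFits` and the constants are hypothetical, written only for illustration; this is not the actual code in `executor_params.cpp`.

```cpp
#include <cstdint>

// CUDA grid dimension limits for compute capability 3.0 and later.
// The constant names are illustrative, not nvFuser identifiers.
constexpr int64_t kMaxGridDimX = 2147483647; // 2^31 - 1
constexpr int64_t kMaxGridDimY = 65535;
constexpr int64_t kMaxGridDimZ = 65535;

// Hypothetical helper: returns true if the requested grid can be launched.
constexpr bool gridFits(int64_t gdimx, int64_t gdimy, int64_t gdimz) {
  return gdimx >= 1 && gdimx <= kMaxGridDimX &&
         gdimy >= 1 && gdimy <= kMaxGridDimY &&
         gdimz >= 1 && gdimz <= kMaxGridDimZ;
}

// The value from the reported error exceeds the y limit.
static_assert(!gridFits(1, 69923, 1), "69923 blocks in y is over the limit");
static_assert(gridFits(1, 65535, 1), "65535 blocks in y is the maximum");
```

Since the index tensors are pseudo-random, whether a given run stays under the limit depends on the generated values, which would explain why the failure is intermittent.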
The true fix would be to make sure the scheduler uses a proper launch configuration, but since these index operations are only there as experimental ops, I think this fix should be good enough for now.