Open naoyam opened 14 hours ago
Following @rdspring1's suggestion, I tried inserting `return test_fn` at this line: https://github.com/NVIDIA/Fuser/blob/a18dbd292251bf04ef45b05fa39a945841ef9cd3/tests/python/utils.py#L346.
This seems to avoid the issue, indicating that this is a serde issue or maybe an issue just with this decorator. I did not need to use `DEBUG_SERDE=true` or delete `/tmp/nvfuser_kernel_db` for this.
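For readers unfamiliar with the trick, here is a minimal sketch of what an early `return test_fn` does to a decorator. The names `make_check_decorator`, `bypass`, and `sample_test` are hypothetical and only stand in for the real decorator in `tests/python/utils.py`, whose body is not reproduced here.

```python
import functools
from typing import Callable


def make_check_decorator(bypass: bool) -> Callable:
    """Build a hypothetical test decorator; `bypass` mimics the inserted line."""

    def decorator(test_fn: Callable) -> Callable:
        if bypass:
            # Same effect as inserting `return test_fn` at the top of the
            # decorator: the test runs undecorated and the wrapper is skipped.
            return test_fn

        @functools.wraps(test_fn)
        def wrapper(*args, **kwargs):
            # Placeholder for the extra work the real decorator would do.
            wrapper.extra_checks += 1
            return test_fn(*args, **kwargs)

        wrapper.extra_checks = 0
        return wrapper

    return decorator


def sample_test(x: int) -> int:
    return x + 1


bypassed = make_check_decorator(bypass=True)(sample_test)
wrapped = make_check_decorator(bypass=False)(sample_test)
```

With the bypass in place the decorated name is the original function object, so anything the wrapper would normally do around the test cannot influence the generated kernels.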
Tried the same test 100 times. No difference detected!
The diff test has started failing consistently on some of the Python tests. For example:
https://dl.gitlab-master-pages.nvidia.com/pytorch/fuser-gh-mirror//nvfuser_github_ci/codegen_diff_p19742638_j118568737_1729887841468788908_codediff_896a28ad_b9203e1c_custom_command_20241025_131551.html
Ran `NVFUSER_DUMP=cuda_to_file pytest -v -k 'test_prim_layer_norm_fwd'` 10 times, and here is the number of kernels generated in each run:

CC: @jacobhinkle, @xwang233
Related: #3260 #3280 #3256
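The repeated-run experiment described above could be scripted roughly as follows. This is only a sketch under assumptions: it assumes the dumped kernels land as files in the current working directory and that a glob such as `*.cu` matches them, neither of which is confirmed in this thread. `count_outputs_per_run` and `EXAMPLE_CMD` are hypothetical names, and the test path would need to be appended to the command for a real run.

```python
import glob
import os
import subprocess
import sys
import tempfile

# Hypothetical command mirroring the one quoted above; append the path to
# the Python tests before using it from a scratch directory.
EXAMPLE_CMD = [sys.executable, "-m", "pytest", "-v", "-k", "test_prim_layer_norm_fwd"]


def count_outputs_per_run(cmd, runs, pattern, extra_env=None):
    """Run `cmd` `runs` times, each in a fresh directory, and count the
    files matching `pattern` that each run leaves behind."""
    counts = []
    for _ in range(runs):
        with tempfile.TemporaryDirectory() as workdir:
            env = dict(os.environ, **(extra_env or {}))
            # check=False: a failing test run should still be counted.
            subprocess.run(cmd, cwd=workdir, env=env, check=False)
            counts.append(len(glob.glob(os.path.join(workdir, pattern))))
    return counts


# Example call (assumption: dumps are written to the working directory):
# counts = count_outputs_per_run(
#     EXAMPLE_CMD + ["/path/to/tests/python"],  # hypothetical path
#     runs=10,
#     pattern="*.cu",
#     extra_env={"NVFUSER_DUMP": "cuda_to_file"},
# )
```

If the counts differ between runs of an unchanged test, that points at nondeterminism in what gets compiled rather than at the code changes being diffed.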