NVIDIA / Fuser

A Fusion Code Generator for NVIDIA GPUs (commonly known as "nvFuser")

Add NVFUSER_DUMP=python_definition_segments #3368

Closed jacobhinkle closed 2 weeks ago

jacobhinkle commented 2 weeks ago

This adds a new debug dump option that prints every segment of a segmented fusion as a Python FusionDefinition at compile time. This is useful for debugging errors unrelated to segmentation whose repros contain many segments. When you hit a compile error, the printout tells you which segmented group it was found in. Run your code again with NVFUSER_DUMP=python_definition_segments, which prints a definition function for each segmented group using the recently added C++ to Python translation. You can then create a more targeted repro by copying that smaller definition into the repro printed in the error message and updating the inputs so that it executes.
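As a rough illustration of the "run your code again" step, here is a minimal sketch that re-runs a failing script with the dump option enabled and collects its output. The helper names `dump_env` and `rerun_with_segment_dump` and the script path are hypothetical; only the `NVFUSER_DUMP=python_definition_segments` setting comes from this PR. Both output streams are captured because this sketch does not assume which one the dump is written to.

```python
import os
import subprocess
import sys


def dump_env(base_env=None):
    """Return a copy of the environment with the segment dump enabled.

    Note: this overwrites any NVFUSER_DUMP value already present.
    """
    env = dict(os.environ if base_env is None else base_env)
    # Option name taken from this PR's title.
    env["NVFUSER_DUMP"] = "python_definition_segments"
    return env


def rerun_with_segment_dump(script_path):
    """Re-run a script (hypothetical path) and return its combined output."""
    result = subprocess.run(
        [sys.executable, script_path],
        env=dump_env(),
        capture_output=True,
        text=True,
    )
    # Return stdout and stderr together so the per-segment definitions
    # can be searched for regardless of the stream they were printed on.
    return result.stdout + result.stderr
```

Calling `rerun_with_segment_dump("my_failing_test.py")` (file name assumed) would return text in which you can look for the definition of the segmented group named in the compile error.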

jacobhinkle commented 2 weeks ago

!build

jacobhinkle commented 2 weeks ago

I used this this morning with a little manual cleanup to narrow down the repro here: https://github.com/NVIDIA/Fuser/issues/871#issuecomment-2462425423.

jacobhinkle commented 2 weeks ago

> Can we add a short section in https://github.com/NVIDIA/Fuser/wiki/Developer-guide#debug-nvfuser to describe how to use this? 🙇

I will do that right now

EDIT: @jjsjann123 https://github.com/NVIDIA/Fuser/wiki/Developer-guide#debug-workflow-for-nvfuser