NVIDIA / Fuser

A Fusion Code Generator for NVIDIA GPUs (commonly known as "nvFuser")

Add dtype parameter to Tensor state in python frontend. #3269

Open rdspring1 opened 4 weeks ago

rdspring1 commented 4 weeks ago

Currently, the Tensor struct tracks only the number of dimensions. It would be useful to track the tensor's dtype as well.

Reference: https://github.com/NVIDIA/Fuser/blob/main/csrc/python_frontend/fusion_definition.h#L31-L75
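To make the request concrete, here is a minimal Python sketch of a tensor record that carries a dtype next to the dimension count. It is not the real struct from fusion_definition.h: the names TensorState, DataType, and index are hypothetical and only illustrate the shape of the proposal.

```python
from dataclasses import dataclass
from enum import Enum


class DataType(Enum):
    """Hypothetical stand-in for a dtype enum; not nvFuser's actual type."""
    FLOAT = "float"
    HALF = "half"
    INT = "int"


@dataclass(frozen=True)
class TensorState:
    """Hypothetical frontend-side tensor record, for illustration only."""
    index: int       # assumed: position of the tensor in the fusion state
    dims: int        # number of dimensions (what the issue says is tracked today)
    dtype: DataType  # the proposed addition


t = TensorState(index=0, dims=2, dtype=DataType.HALF)
```

With a record like this, frontend-side analysis could read t.dtype directly instead of consulting the Fusion IR.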

kevinstephano commented 2 weeks ago

The Tensor struct in the python frontend is not the same as a TensorView in nvFuser's IR, which carries more information. So what is the motivating use case for adding dtype to the struct?

rdspring1 commented 2 weeks ago

You can perform heuristic and segmentation analysis without building the cpp Fusion IR.

You wanted to do segmentation in python. How would you do better than the CPP algorithm without additional information?

You already have the fusion DAG, but you need the tensor sizes and dtype information to score the segments. Most device information is already available through pytorch.
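As a rough illustration of why sizes and dtype matter for scoring, the sketch below estimates the bytes a segment reads and writes. The scoring rule (total I/O bytes), the ITEMSIZE table, and the function names are all assumptions made for this example; nvFuser's own heuristics are not shown here.

```python
from math import prod

# Assumed element sizes in bytes, keyed by illustrative dtype names.
ITEMSIZE = {"float32": 4, "float16": 2, "bfloat16": 2, "int64": 8}


def tensor_bytes(shape, dtype):
    """Bytes occupied by a dense tensor with the given shape and dtype."""
    return prod(shape) * ITEMSIZE[dtype]


def segment_io_bytes(inputs, outputs):
    """Hypothetical score: total bytes read from inputs and written to outputs.

    Each entry is a (shape, dtype) pair.
    """
    return sum(tensor_bytes(shape, dtype) for shape, dtype in inputs + outputs)


# A segment that reads one half-precision matrix and writes one float matrix.
score = segment_io_bytes(
    inputs=[((1024, 1024), "float16")],
    outputs=[((1024, 1024), "float32")],
)
```

Without the dtype, the same shapes could correspond to anywhere from 2 to 8 bytes per element, so a byte-based score could not be computed from the dimension count alone.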

rdspring1 commented 2 weeks ago

Segmentation decomposes a fusion into a directed acyclic graph (DAG) of sub-fusions. You can map a fusion directly to its component sub-fusions without building the CPP Fusion IR.
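The mapping from a fusion to a DAG of sub-fusions can be sketched in plain Python, assuming the operations have already been assigned to segments. The op names, the segment_of assignment, and the segment_dag helper are hypothetical; how to choose the assignment is the actual segmentation problem and is not addressed here.

```python
from graphlib import TopologicalSorter


def segment_dag(op_edges, segment_of):
    """Lift producer -> consumer op edges to dependencies between segments.

    Returns a dict mapping each segment id to the set of segments it depends on.
    """
    deps = {seg: set() for seg in set(segment_of.values())}
    for producer, consumer in op_edges:
        src, dst = segment_of[producer], segment_of[consumer]
        if src != dst:  # edges inside one segment do not create a dependency
            deps[dst].add(src)
    return deps


# Hypothetical fusion: add -> mul -> sum -> div, split into three segments.
op_edges = [("add", "mul"), ("mul", "sum"), ("sum", "div")]
segment_of = {"add": 0, "mul": 0, "sum": 1, "div": 2}

deps = segment_dag(op_edges, segment_of)
# Execution order of the sub-fusions that respects the dependencies.
order = list(TopologicalSorter(deps).static_order())
```

TopologicalSorter raises CycleError if the chosen assignment produces a cyclic segment graph, which is a cheap way to reject an invalid segmentation.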

Aditya-PS-05 commented 1 week ago

@rdspring1, I would like to work on this. Could you assign it to me, or should I just open a PR?