NVIDIA / Fuser

A Fusion Code Generator for NVIDIA GPUs (commonly known as "nvFuser")

Add `enable_options` and `disable_options` to `fd.execute` #3270

Closed Priya2698 closed 1 week ago

Priya2698 commented 4 weeks ago

This PR adds enable_options and disable_options to fd.execute to allow setting them through the python frontend in lieu of the environment variables. This work will be used to allow enabling nvfuser matmul codegen from within Thunder.

Inspired by @jacobhinkle's PR #1905!

Tracking Issue: #3022

Priya2698 commented 4 weeks ago

!build

jacobhinkle commented 4 weeks ago

Thanks for taking the initiative with this! We will also need to update FusionExecutorCache to be option-aware so that we don't re-use kernels compiled with different options. That was part of the idea with #2077: separating options (features) into ones that affect each level of caching or not.

Priya2698 commented 3 weeks ago

> Thanks for taking the initiative with this! We will also need to update FusionExecutorCache to be option-aware so that we don't re-use kernels compiled with different options. That was part of the idea with #2077: separating options (features) into ones that affect each level of caching or not.

Got it -- can we add that support separately? What do you think? That would unblock development on https://github.com/NVIDIA/Fuser/issues/3022. IIUC, the environment-variable-based approach is currently missing this as well, so we may already be reusing kernels that were compiled under different env variables?

jacobhinkle commented 3 weeks ago

> IIUC, we are currently missing this for the environment variables based approach as well so we may be reusing kernels which used different env variables?

I guess this is true: a process can change its env vars at any time, and we will not recognize that change after the first time we look up an option. However, that is off-label use. If we are providing a kwarg to execute, then a user would probably assume it's fully supported.
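To illustrate the staleness described above, here is a minimal Python sketch. It is not nvFuser's actual code: the `DEMO_ENABLE` variable, the `demo_option` name, and the `is_option_enabled` helper are all hypothetical. The helper memoizes its first lookup, so a later change to the environment goes unnoticed.

```python
import os
from functools import lru_cache

# Hypothetical environment variable, used only for this illustration.
ENV_VAR = "DEMO_ENABLE"


@lru_cache(maxsize=None)
def is_option_enabled(name: str) -> bool:
    """Look up an option in a comma-separated env var and memoize the answer."""
    enabled = os.environ.get(ENV_VAR, "").split(",")
    return name in enabled


os.environ[ENV_VAR] = "demo_option"
first = is_option_enabled("demo_option")   # reads the environment

os.environ[ENV_VAR] = ""                   # the process changes its env var
second = is_option_enabled("demo_option")  # answered from the memoized result
```

Both `first` and `second` are true here even though the variable was cleared in between, which is the kind of silent reuse being discussed.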

I wonder if we could do something quickly like use a separate FusionCache for each set of options we receive in the frontend. If no options are provided we'd use the default. That could be done entirely in python.
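A rough Python sketch of that idea, under stated assumptions: `ExecutorCacheStub` and `OptionAwareCaches` are hypothetical stand-ins, not nvFuser classes. Each distinct set of options maps to its own cache, and the default cache is used when no options are given.

```python
from typing import Dict, FrozenSet, Iterable, Tuple

# (enabled options, disabled options), order-insensitive.
OptionKey = Tuple[FrozenSet[str], FrozenSet[str]]


class ExecutorCacheStub:
    """Hypothetical stand-in for a cache of compiled kernels."""

    def __init__(self, key: OptionKey) -> None:
        self.key = key
        self.kernels: Dict[str, object] = {}


class OptionAwareCaches:
    """Hands out one cache per distinct option set, plus a shared default."""

    def __init__(self) -> None:
        self._default = ExecutorCacheStub((frozenset(), frozenset()))
        self._by_options: Dict[OptionKey, ExecutorCacheStub] = {}

    def get(
        self,
        enable_options: Iterable[str] = (),
        disable_options: Iterable[str] = (),
    ) -> ExecutorCacheStub:
        key = (frozenset(enable_options), frozenset(disable_options))
        if not key[0] and not key[1]:
            # No options supplied: reuse the default cache.
            return self._default
        if key not in self._by_options:
            # First time this option set is seen: give it a fresh cache.
            self._by_options[key] = ExecutorCacheStub(key)
        return self._by_options[key]


caches = OptionAwareCaches()
default = caches.get()
a = caches.get(enable_options=["opt_a"])
a_again = caches.get(enable_options=["opt_a"])
b = caches.get(enable_options=["opt_a"], disable_options=["opt_b"])
```

Here `a` and `a_again` are the same object, while `b` and `default` are separate caches, so kernels compiled under one option set are never served for another.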

Priya2698 commented 3 weeks ago

> I wonder if we could do something quickly like use a separate FusionCache for each set of options we receive in the frontend. If no options are provided we'd use the default. That could be done entirely in python.

What do you mean by using separate FusionCache?

A fusion cache is initialized here: https://github.com/NVIDIA/Fuser/blob/f6975f37eab197052e7ee59bf2bc8c78c1491dbf/csrc/python_frontend/fusion_definition.cpp#L58-L67. The options are passed in the execute method so they are not available when the fusion definition is initialized.

I am still going through the FusionCache files.

jacobhinkle commented 3 weeks ago

> The options are passed in the execute method so they are not available when the fusion definition is initialized.

Hmmm. Yeah, I guess you're right. FusionCache is not the right place to do this. Instead, having a separate FusionExecutorCache might work: https://github.com/NVIDIA/Fuser/blob/5db18de5d8dafc6a8f1e76909d73799753fe7a5e/csrc/python_frontend/fusion_definition.cpp#L387-L390 Here scheds->auto_gen_schedules is a unique_ptr<FusionExecutorCache>. This is fine as a default, but when features are supplied, I think we should create a new cache and use it instead.

Priya2698 commented 3 weeks ago

!build

Priya2698 commented 3 weeks ago

!build

Priya2698 commented 1 week ago

!test

Priya2698 commented 1 week ago

!test --serde

Priya2698 commented 1 week ago

!test

Priya2698 commented 1 week ago

!test