Open drisspg opened 2 weeks ago
Adding sm90a to the base Docker image would also let the TorchBench Docker image drop its current workaround: https://github.com/pytorch/benchmark/pull/2338
The consensus from @ptrblck and @malfet was that we weren't comfortable doing a full swap. For PyTorch, at least, only one translation unit requires sm90a (https://github.com/pytorch/pytorch/pull/129402), so we went with the more targeted change.
Synced with @ptrblck: we should not merge this one, in favor of https://github.com/pytorch/pytorch/pull/129402
Summary
Sm90a is needed to enable some features found in CUTLASS. For some reason Google has the worst SEO for finding more information about this gencode. This is the best description I have found: https://github.com/NVIDIA/cccl/issues/1270
I am not sure whether we want to build for both sm90 and sm90a instead?
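For reference, building for both targets would mean passing one `-gencode` pair per architecture to nvcc. Below is a minimal sketch of that expansion. The helper name `gencode_flags` and the `"9.0;9.0a"` input format are illustrative assumptions (loosely modeled on a semicolon-separated arch list) and are not the logic the Docker build or PyTorch actually uses. It also ignores `+PTX` suffixes.

```python
def gencode_flags(arch_list: str) -> list[str]:
    """Expand an arch list such as "9.0;9.0a" into nvcc -gencode flags.

    Hypothetical helper for illustration only; not part of any build script.
    """
    flags = []
    for entry in arch_list.split(";"):
        entry = entry.strip()
        if not entry:
            continue  # tolerate empty items, e.g. a trailing ';'
        num = entry.replace(".", "")  # "9.0a" -> "90a"
        flags.append(f"-gencode=arch=compute_{num},code=sm_{num}")
    return flags


# Building for both sm90 and sm90a yields two separate -gencode entries.
print(" ".join(gencode_flags("9.0;9.0a")))
```

Emitting both entries keeps a plain sm_90 build in the binary alongside the sm_90a one, at the cost of extra compile time and binary size.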
PyTorch features requiring this change: https://github.com/pytorch/pytorch/pull/128989
cc @ptrblck @nWEIdia