pytorch / builder

Continuous builder and binary build scripts for pytorch
BSD 2-Clause "Simplified" License

Update sm90 -> sm90a #1878

Open drisspg opened 2 weeks ago

drisspg commented 2 weeks ago

Summary

Sm90a is needed to enable some features found in CUTLASS. For some reason Google has the worst SEO for finding more information about this gencode. This is the best description I have found: https://github.com/NVIDIA/cccl/issues/1270

I am not sure whether we want to build for both sm90 and sm90a instead.
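For reference, building for both would mean passing one `-gencode` entry per target to nvcc. Here is a minimal Python sketch of what those flags look like. The helper name `gencode_flags` is hypothetical and is not taken from the builder scripts; only the `-gencode=arch=...,code=...` flag format is nvcc's own.

```python
def gencode_flags(arches):
    """Turn arch names such as "sm_90" or "sm_90a" into nvcc -gencode flags."""
    flags = []
    for arch in arches:
        if not arch.startswith("sm_"):
            raise ValueError(f"expected an sm_XX name, got {arch!r}")
        suffix = arch[len("sm_"):]  # e.g. "90" or "90a"
        flags.append(f"-gencode=arch=compute_{suffix},code={arch}")
    return flags


# Hypothetical usage: target both the generic and the arch-specific variant.
both = gencode_flags(["sm_90", "sm_90a"])
print(" ".join(both))
```

The trade-off of listing both is a larger binary and longer build, since device code is generated once per target.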

PyTorch features requiring this change: https://github.com/pytorch/pytorch/pull/128989

cc @ptrblck @nWEIdia

xuzhao9 commented 1 week ago

Adding sm90a to the base Docker image will also help the TorchBench Docker image remove its current workaround: https://github.com/pytorch/benchmark/pull/2338

drisspg commented 1 week ago

The consensus here from @ptrblck and @malfet was that we weren't comfortable doing a full swap. At least for PyTorch, only one translation unit requires sm90a (https://github.com/pytorch/pytorch/pull/129402), so we went with the more targeted change instead.
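To illustrate the idea of a targeted change (this is not the actual implementation in the linked PR), here is a small Python sketch in which the default arch list stays untouched and only one source file gets sm_90a added. The file names and the default arch list are assumptions made up for the example.

```python
# Assumed default targets for every source file (illustrative only).
DEFAULT_ARCHES = ["sm_80", "sm_90"]

# Hypothetical file name: the one translation unit that needs sm_90a.
EXTRA_ARCHES = {"needs_sm90a.cu": ["sm_90a"]}


def arches_for(source):
    """Return the arch list for one source file: defaults plus per-file extras."""
    return DEFAULT_ARCHES + EXTRA_ARCHES.get(source, [])


for src in ["ordinary_kernel.cu", "needs_sm90a.cu"]:
    print(src, arches_for(src))
```

Every other file keeps building for the existing targets, so nothing changes for code that does not need the arch-specific features.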

atalman commented 1 week ago

Synced with @ptrblck: we should not merge this one, in favor of https://github.com/pytorch/pytorch/pull/129402