pytorch / nestedtensor

[Prototype] Tools for the concurrent manipulation of variably sized Tensors.
BSD 3-Clause "New" or "Revised" License

FX based fuser for Conv2d and ReLU #445

Closed: cpuhrsch closed this 3 years ago

cpuhrsch commented 3 years ago

This doesn't yet fuse all Modules, but it already covers the vast majority. To see real gains, this needs cuDNN 8.2 or later (at least for fp16) and channels-last input.
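For readers unfamiliar with FX-based fusion, here is a minimal sketch of the general idea, not the code in this PR. It traces a model with `torch.fx`, finds each `nn.ReLU` call whose only input is an `nn.Conv2d` call, and swaps the pair for a single module. `ConvReLU2d` and `fuse_conv_relu` are hypothetical names chosen for illustration, and the "fused" module here simply runs the two ops back to back; an actual fuser would presumably dispatch to a fused cuDNN kernel instead.

```python
import torch
import torch.nn as nn
import torch.fx as fx


class ConvReLU2d(nn.Module):
    """Hypothetical stand-in for a fused op: Conv2d followed by ReLU."""

    def __init__(self, conv: nn.Conv2d):
        super().__init__()
        self.conv = conv

    def forward(self, x):
        # Assumption: a real fuser would call one fused kernel here.
        return torch.relu(self.conv(x))


def fuse_conv_relu(model: nn.Module) -> fx.GraphModule:
    """Replace Conv2d -> ReLU module pairs in a traced copy of the graph."""
    gm = fx.symbolic_trace(model)
    modules = dict(gm.named_modules())

    for node in list(gm.graph.nodes):
        # Only consider nn.ReLU modules called in the graph.
        if node.op != "call_module" or not isinstance(modules[node.target], nn.ReLU):
            continue
        prev = node.args[0]
        if not isinstance(prev, fx.Node) or prev.op != "call_module":
            continue
        if not isinstance(modules[prev.target], nn.Conv2d):
            continue
        # Skip if the conv output is also consumed elsewhere.
        if len(prev.users) != 1:
            continue

        # Put the fused module in the conv's slot on the GraphModule.
        parent_name, _, attr = prev.target.rpartition(".")
        parent = gm.get_submodule(parent_name) if parent_name else gm
        setattr(parent, attr, ConvReLU2d(modules[prev.target]))

        # Route consumers of the ReLU to the (now fused) conv node.
        node.replace_all_uses_with(prev)
        gm.graph.erase_node(node)

    gm.graph.lint()
    gm.recompile()
    gm.delete_all_unused_submodules()  # drops the orphaned ReLU modules
    return gm


# Usage: a small model with channels-last input (run on CPU here;
# the speedups discussed above would require CUDA and fp16).
model = nn.Sequential(
    nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(),
    nn.Conv2d(8, 4, 1), nn.ReLU(),
).eval()
fused = fuse_conv_relu(model)
x = torch.randn(2, 3, 16, 16).contiguous(memory_format=torch.channels_last)
with torch.no_grad():
    same = torch.allclose(model(x), fused(x))
```

The single-user check keeps the rewrite safe when a conv output feeds more than one consumer, which is one reason a pattern-based pass like this may leave some Modules unfused.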