pytorch / nestedtensor

[Prototype] Tools for the concurrent manipulation of variably sized Tensors.
BSD 3-Clause "New" or "Revised" License
252 stars 28 forks

More efficient MHA - use CPU numel and improve benchmark #405

Closed cpuhrsch closed 3 years ago