Closed by khakhulin 5 years ago
Support compression of attention layer using Tucker decomposition
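For readers unfamiliar with the technique named in the title, here is a minimal sketch of a truncated Tucker decomposition computed via higher-order SVD (HOSVD) in NumPy. It is not taken from this repository. The function names (`unfold`, `mode_dot`, `tucker_hosvd`, `tucker_reconstruct`), the choice of HOSVD as the fitting method, and the idea of stacking per-head attention projection weights into a `(heads, d_model, d_head)` tensor are all assumptions made for illustration.

```python
import numpy as np


def unfold(tensor, mode):
    """Mode-n unfolding: bring axis `mode` to the front, flatten the rest."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)


def mode_dot(tensor, matrix, mode):
    """Mode-n product: multiply `tensor` by `matrix` along axis `mode`."""
    # tensordot contracts matrix columns with the chosen tensor axis and
    # puts the new axis first, so move it back to position `mode`.
    out = np.tensordot(matrix, tensor, axes=(1, mode))
    return np.moveaxis(out, 0, mode)


def tucker_hosvd(tensor, ranks):
    """Truncated HOSVD: returns a core tensor and one factor per mode."""
    factors = []
    for mode, rank in enumerate(ranks):
        # Leading left singular vectors of each unfolding form the factor.
        u, _, _ = np.linalg.svd(unfold(tensor, mode), full_matrices=False)
        factors.append(u[:, :rank])
    core = tensor
    for mode, u in enumerate(factors):
        core = mode_dot(core, u.T, mode)  # project onto the factor subspace
    return core, factors


def tucker_reconstruct(core, factors):
    """Expand the core back to the original shape."""
    out = core
    for mode, u in enumerate(factors):
        out = mode_dot(out, u, mode)
    return out


# Hypothetical example: 8 heads, model width 64, head width 16.
rng = np.random.default_rng(0)
weights = rng.standard_normal((8, 64, 16))  # stand-in for stacked projections
core, factors = tucker_hosvd(weights, ranks=(4, 32, 8))

original_params = weights.size
compressed_params = core.size + sum(f.size for f in factors)
rel_error = (np.linalg.norm(weights - tucker_reconstruct(core, factors))
             / np.linalg.norm(weights))
print(original_params, compressed_params, rel_error)
```

With the ranks chosen here the factorized form stores 3232 values instead of 8192. Random weights compress poorly, so the reconstruction error on this stand-in tensor will be large; trained weights may or may not have the low-rank structure that makes the approach worthwhile, and HOSVD is usually only a starting point before fine-tuning.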