feevos closed this issue 3 years ago
The layer in nn/layers/attention.py does not implement multi-head attention as described in the paper.
Fixed it; closing the issue.