Open vicentowang opened 2 years ago
I found that layer norm is slower than the other normalization modes at inference time.
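One possible reason, sketched below in plain Python, is that layer norm has to compute a mean and variance from every input at inference time, while a mode such as batch norm can reuse stored running statistics folded into a single scale and shift. This is only an illustration under assumptions: the report does not say which other mode was compared, so batch norm is a guess here, and the function names (`layer_norm`, `fold_batch_norm`, `batch_norm_inference`) are hypothetical helpers, not any framework's API. Pure-Python timings will not match real framework kernels.

```python
import math
import timeit


def layer_norm(x, gamma, beta, eps=1e-5):
    # Statistics come from the input itself, so they are recomputed on every call.
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x) / n
    inv_std = 1.0 / math.sqrt(var + eps)
    return [(v - mean) * inv_std * g + b for v, g, b in zip(x, gamma, beta)]


def fold_batch_norm(running_mean, running_var, gamma, beta, eps=1e-5):
    # Done once, ahead of inference: fold stored statistics into scale and shift.
    scale = [g / math.sqrt(s + eps) for g, s in zip(gamma, running_var)]
    shift = [b - m * sc for b, m, sc in zip(beta, running_mean, scale)]
    return scale, shift


def batch_norm_inference(x, scale, shift):
    # At inference only one multiply and one add per element remain.
    return [v * sc + sh for v, sc, sh in zip(x, scale, shift)]


if __name__ == "__main__":
    dim = 256  # hypothetical feature size, chosen only for this illustration
    x = [float(i % 7) for i in range(dim)]
    gamma, beta = [1.0] * dim, [0.0] * dim
    scale, shift = fold_batch_norm([3.0] * dim, [4.0] * dim, gamma, beta)

    t_ln = timeit.timeit(lambda: layer_norm(x, gamma, beta), number=1000)
    t_bn = timeit.timeit(lambda: batch_norm_inference(x, scale, shift), number=1000)
    print(f"layer norm: {t_ln:.4f}s  folded batch norm: {t_bn:.4f}s")
```

To check whether this is what happens in a real model, it would help to time the normalization layers separately from the rest of the network.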