Open jaindhairyahere opened 5 months ago
Inconsistent function description at https://github.com/google/trax/blob/master/trax/layers/attention.py#L330C1-L342C23
The docstring states that the function "Returns attention-computed per-head activations and unchanged mask.", but the function returns only the activations.
You are right. Can I raise a PR to fix it?
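For illustration, one possible shape of the fix is to make the docstring describe only what is actually returned. The sketch below is a hypothetical, pure-Python stand-in and not the trax code: the name `per_head_attention`, its argument layout, and the list-based math are all assumptions made for the example. It only shows a docstring that matches a function which uses the mask internally and returns the activations alone.

```python
import math


def per_head_attention(queries, keys, values, mask):
    """Returns attention-computed per-head activations.

    Hypothetical stand-in, not the trax implementation. The mask is consumed
    here and is not returned, so the docstring does not promise it.

    Args:
      queries: list of query vectors (lists of floats).
      keys: list of key vectors.
      values: list of value vectors.
      mask: mask[i][j] is True where query i may attend to key j.
    """
    depth = len(queries[0])
    activations = []
    for q, mask_row in zip(queries, mask):
        # Scaled dot-product scores; masked positions get a large negative value.
        scores = [
            sum(a * b for a, b in zip(q, k)) / math.sqrt(depth) if keep else -1e9
            for k, keep in zip(keys, mask_row)
        ]
        # Numerically stable softmax over the key positions.
        peak = max(scores)
        exps = [math.exp(s - peak) for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]
        # Weighted sum of the value vectors.
        activations.append(
            [sum(w * v[d] for w, v in zip(weights, values))
             for d in range(len(values[0]))]
        )
    return activations
```

The alternative would be to keep the original wording and return `(activations, mask)` instead, but that changes the return type for every caller, so a docstring-only change is likely the less invasive option.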