Description
This layer is an extension of the existing ParametricAttention layer, adding support for transformations (such as a non-linear layer) of the key representation. This brings the model closer to the paper that proposed it (Yang et al., 2016) and gave slightly better results in experiments.
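For context, here is a minimal NumPy sketch of the mechanism this extension targets, not the actual Thinc implementation: the key representation is passed through a transform (here a tanh layer, as in Yang et al., 2016) before the attention weights are computed against the learned query vector. The function and parameter names (`parametric_attention`, `W`, `b`, `v`) are illustrative only.

```python
import numpy as np

def parametric_attention(X, W, b, v):
    """Toy forward pass: attention with a transformed key representation."""
    # Key transform (Yang et al., 2016): u_t = tanh(W x_t + b)
    keys = np.tanh(X @ W.T + b)
    # Attention logits: similarity of each transformed key to the learned query v
    logits = keys @ v
    # Softmax over the tokens in the sequence
    weights = np.exp(logits - logits.max())
    weights /= weights.sum()
    # Re-weight each token's representation by its attention weight
    return X * weights[:, None]

# Example usage with random parameters
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 8))   # 5 tokens, width 8
W = rng.normal(size=(8, 8))   # key-transform weights
b = rng.normal(size=(8,))     # key-transform bias
v = rng.normal(size=(8,))     # learned query / context vector
print(parametric_attention(X, W, b, v).shape)  # (5, 8)
```

Without the key transform, the logits are computed directly from the token vectors, which is what the existing ParametricAttention layer does.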
Types of change
Feature
Checklist
[x] I confirm that I have the right to submit this contribution under the project's MIT license.
[x] I ran the tests, and all new and existing tests passed.
[x] My changes don't require a change to the documentation, or if they do, I've added all required information.