Open jlibovicky opened 7 years ago
In the upcoming API refactor (PR #509), we should distinguish between attention keys and values in the API, so that they can be cached during ensembling independently of the particular attention model implementation.
This is more or less done.
Is it? I wouldn't call it done until there are abstract methods keys and values in the BaseAttention class.
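To make the request concrete, here is a minimal sketch of what such an interface might look like. It is not the project's actual code: the method signatures, the plain-list vector type, the EncoderAttention subclass, and the cache_keys_and_values helper are all assumptions made for illustration, and the real classes would return tensors rather than Python lists.

```python
from abc import ABC, abstractmethod
from typing import Dict, List, Sequence

Vector = List[float]  # stand-in for a tensor; an assumption of this sketch


class BaseAttention(ABC):
    """Hypothetical base class: every attention model exposes keys and values."""

    @abstractmethod
    def keys(self) -> Sequence[Vector]:
        """Return the vectors that queries are scored against."""

    @abstractmethod
    def values(self) -> Sequence[Vector]:
        """Return the vectors that are combined using the attention weights."""


class EncoderAttention(BaseAttention):
    """Illustrative subclass that attends over a fixed list of encoder states."""

    def __init__(self, states: Sequence[Vector]) -> None:
        self._states = list(states)

    def keys(self) -> Sequence[Vector]:
        return self._states

    def values(self) -> Sequence[Vector]:
        # Keys and values coincide here; other models could project them separately.
        return self._states


def cache_keys_and_values(attention: BaseAttention) -> Dict[str, Sequence[Vector]]:
    """Collect keys and values using only the abstract interface (hypothetical helper)."""
    return {"keys": attention.keys(), "values": attention.values()}
```

With both methods declared abstract, BaseAttention itself cannot be instantiated, and any subclass that forgets to implement one of them fails at construction time. The caching code then never needs to know which attention implementation it is dealing with.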