A concise but complete full-attention transformer with a set of promising experimental features from various papers
4.63k stars · 395 forks
ContinuousTransformerWrapper: turning on absolute positional embedding: mirror TransformerWrapper #243
Closed
pfeatherstone closed 7 months ago
@pfeatherstone looks good! i'll fix that alibi / memory tokens issue before my time is up
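For context, the request is that `ContinuousTransformerWrapper` support the same absolute positional embedding option as `TransformerWrapper`. A minimal numpy sketch of what a learned absolute positional embedding does — add a per-position vector to the (already projected) continuous inputs — with all names here illustrative, not the library's actual API:

```python
import numpy as np

def make_abs_pos_emb(max_seq_len, dim, rng=None):
    # A learned absolute positional embedding is just a (max_seq_len, dim)
    # parameter table; randomly initialized here as a stand-in for training.
    rng = rng or np.random.default_rng(0)
    return rng.normal(scale=0.02, size=(max_seq_len, dim))

def add_abs_pos(x, pos_emb):
    # x: (batch, seq_len, dim) continuous features projected to model dim.
    # Each position t gets pos_emb[t] added, broadcast over the batch.
    seq_len = x.shape[1]
    assert seq_len <= pos_emb.shape[0], "sequence longer than max_seq_len"
    return x + pos_emb[:seq_len]

# toy usage
emb = make_abs_pos_emb(max_seq_len=8, dim=4)
x = np.zeros((2, 5, 4))
out = add_abs_pos(x, emb)
print(out.shape)  # (2, 5, 4)
```

Mirroring `TransformerWrapper` would mean gating this addition behind the same flag in both wrappers, so continuous inputs get positional information even without token embeddings.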