microsoft / unilm

Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
https://aka.ms/GeneralAI
MIT License

retnet: pseudocode in the paper is inconsistent with Equation(7) #1213

Closed fanfanfan-hff closed 1 year ago

fanfanfan-hff commented 1 year ago

Is the `cross_retention` in the pseudocode different from the Cross-Chunk term in Equation (7)?

sunyt32 commented 1 year ago

Thanks for carefully checking the consistency of Equation (7) and the pseudocode. There is an index mistake in Equation (7): the Cross-Chunk term should be $Q_{[i]}R_{i-1}\odot \xi$. We will fix this in the next version of our paper.
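To make the index fix concrete, here is a minimal NumPy sketch of chunkwise retention in which the Cross-Chunk term reads the state left by the previous chunk, $R_{i-1}$, and the state is updated only afterwards. It is an illustration under stated assumptions, not the torchscale implementation: the function names `parallel_retention` and `chunkwise_retention` are hypothetical, it handles a single head, and it omits the scaling, group normalization, and position encoding used in the paper.

```python
import numpy as np


def parallel_retention(q, k, v, gamma):
    """Reference form: (Q K^T * D) V with D[n, m] = gamma^(n - m) for n >= m."""
    n = q.shape[0]
    idx = np.arange(n)
    diff = idx[:, None] - idx[None, :]
    decay = np.where(diff >= 0, gamma ** np.maximum(diff, 0), 0.0)
    return ((q @ k.T) * decay) @ v


def chunkwise_retention(q, k, v, gamma, chunk_size):
    """Chunkwise form; the cross-chunk term uses the previous chunk's state."""
    n, d_v = q.shape[0], v.shape[1]
    assert n % chunk_size == 0, "sketch assumes the length is a multiple of chunk_size"

    idx = np.arange(chunk_size)
    diff = idx[:, None] - idx[None, :]
    inner_decay = np.where(diff >= 0, gamma ** np.maximum(diff, 0), 0.0)
    xi = gamma ** (idx[:, None] + 1)                 # xi_j = gamma^(j + 1)
    zeta = gamma ** (chunk_size - idx[:, None] - 1)  # zeta_j = gamma^(B - j - 1)

    state = np.zeros((k.shape[1], d_v))  # R_{i-1}; zero before the first chunk
    out = np.empty((n, d_v))
    for start in range(0, n, chunk_size):
        sl = slice(start, start + chunk_size)
        q_i, k_i, v_i = q[sl], k[sl], v[sl]
        inner = ((q_i @ k_i.T) * inner_decay) @ v_i  # Inner-Chunk
        cross = (q_i @ state) * xi                   # Cross-Chunk: Q_[i] R_{i-1} * xi
        out[sl] = inner + cross
        # Update the state only after it has been used: R_i from R_{i-1}.
        state = k_i.T @ (v_i * zeta) + (gamma ** chunk_size) * state
    return out


# Usage: the two forms should agree on random inputs.
rng = np.random.default_rng(0)
q, k, v = (rng.standard_normal((8, 4)) for _ in range(3))
print(np.allclose(chunkwise_retention(q, k, v, 0.9, 4), parallel_retention(q, k, v, 0.9)))
```

If the state were updated before computing `cross` (that is, using $R_i$ instead of $R_{i-1}$), the current chunk's keys and values would be counted twice, once in the Inner-Chunk term and once in the Cross-Chunk term.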

okpatil4u commented 1 year ago

When are you planning to release the code? The repo says in two days. Are you adhering to that timeline?

donglixp commented 1 year ago

> When are you planning to release the code? The repo says in two days. Are you adhering to that timeline?

@okpatil4u https://github.com/microsoft/torchscale/commit/bf65397b26469ac9c24d83a9b779b285c1ec640b