Open tera1k opened 10 months ago
https://arxiv.org/abs/2009.14794v4
Krzysztof Choromanski, Valerii Likhosherstov, David Dohan, Xingyou Song, Andreea Gane, Tamas Sarlos, Peter Hawkins, Jared Davis, Afroz Mohiuddin, Lukasz Kaiser, David Belanger, Lucy Colwell, Adrian Weller
FAVOR+: a low-rank approximation of the attention computation
0. Paper
1. What is it?
FAVOR+: a low-rank approximation of the attention computation

![image](https://github.com/usersan/papers/assets/129807140/f9e15a15-0183-42bb-8ccf-2d62af45ce13)
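
To make the low-rank idea concrete, here is a minimal NumPy sketch, not the authors' implementation. It assumes the positive random feature map for the softmax kernel described in the linked paper, and for brevity it draws i.i.d. Gaussian projections instead of the orthogonal ones FAVOR+ uses. The function names `positive_random_features`, `favor_attention`, and `softmax_attention`, and the sizes `seq_len`, `dim`, `num_features`, are hypothetical and chosen only for illustration.

```python
import numpy as np


def positive_random_features(x, w):
    """Map x (n, d) to positive features (n, m) using projections w (m, d).

    phi(x) = exp(w @ x - |x|^2 / 2) / sqrt(m), so that
    E[phi(q) . phi(k)] = exp(q . k) when the rows of w are N(0, I).
    """
    m = w.shape[0]
    half_sq_norm = 0.5 * np.sum(x ** 2, axis=-1, keepdims=True)
    return np.exp(x @ w.T - half_sq_norm) / np.sqrt(m)


def favor_attention(q, k, v, w):
    """Approximate softmax attention without forming the (n, n) matrix."""
    d = q.shape[-1]
    scale = d ** -0.25  # splits the usual 1/sqrt(d) between q and k
    q_feat = positive_random_features(q * scale, w)  # (n, m)
    k_feat = positive_random_features(k * scale, w)  # (n, m)
    kv = k_feat.T @ v                                # (m, d_v)
    normalizer = q_feat @ k_feat.sum(axis=0)         # (n,)
    return (q_feat @ kv) / normalizer[:, None]


def softmax_attention(q, k, v):
    """Exact softmax attention, used here only as a reference."""
    d = q.shape[-1]
    scores = q @ k.T / np.sqrt(d)
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ v


rng = np.random.default_rng(0)
seq_len, dim, num_features = 64, 16, 4096
q = 0.5 * rng.normal(size=(seq_len, dim))
k = 0.5 * rng.normal(size=(seq_len, dim))
v = rng.normal(size=(seq_len, dim))
w = rng.normal(size=(num_features, dim))  # i.i.d. Gaussian, not orthogonal

approx = favor_attention(q, k, v, w)
exact = softmax_attention(q, k, v)
max_abs_error = float(np.abs(approx - exact).max())
print("max abs error:", max_abs_error)
```

The key design choice is the order of multiplication: computing `k_feat.T @ v` first costs on the order of `seq_len * num_features * dim` operations, whereas the exact version builds a `seq_len x seq_len` score matrix. The approximation error in this sketch depends on `num_features` and on the norms of the queries and keys.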
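2. What makes it better than prior work?
3. What is the key to the technique or method?
4. How was its effectiveness validated?
5. Is there any discussion?
6. Which paper should be read next?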