gwleeee opened 11 months ago
https://arxiv.org/abs/2305.15542 https://github.com/bfshi/TOAST
UC Berkeley, Microsoft Research. arXiv preprint, under review
Top-Down Attention Steering
Abstract
+ Other transfer learning methods
LoRA: Low-Rank Adaptation of Large Language Models (ICLR 2022, Microsoft)
VPT: Visual Prompt Tuning (ECCV 2022)
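Of these baselines, LoRA is the easiest to illustrate: it freezes the pretrained weight and learns only a low-rank additive update. A minimal sketch (class name, rank, and scaling are illustrative, not the paper's code):

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Sketch of a LoRA-adapted linear layer: frozen base weight W
    plus a trainable low-rank update (B @ A) scaled by alpha / r."""
    def __init__(self, in_features, out_features, r=4, alpha=8):
        super().__init__()
        self.base = nn.Linear(in_features, out_features, bias=False)
        self.base.weight.requires_grad = False      # pretrained weight stays frozen
        self.A = nn.Parameter(torch.randn(r, in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(out_features, r))  # zero init: no change at start
        self.scale = alpha / r

    def forward(self, x):
        # base output + low-rank correction; only A and B receive gradients
        return self.base(x) + (x @ self.A.T @ self.B.T) * self.scale
```

Because `B` is zero-initialized, the adapted layer starts out identical to the frozen base layer, and only the small `A`/`B` matrices are trained.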
TOAST: Top-Down Attention Steering
Preliminary: Transformer With Top-Down Attention
Step (i): Bottom-up transformer (feed-forward backbone)
Step (ii): Feature selection
Step (iii): Feedback path
Step (iv): Self-attention with top-down input
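The four steps above can be sketched as a single forward pass. This is a hedged simplification: module names, the sigmoid-based feature selection, and adding the top-down signal to the block input (rather than injecting it into the self-attention value, as the paper describes) are my assumptions for readability:

```python
import torch
import torch.nn as nn

class TopDownViTSketch(nn.Module):
    """Illustrative sketch of steps (i)-(iv); shapes and modules are simplified."""
    def __init__(self, dim=64, depth=2, heads=4):
        super().__init__()
        self.blocks = nn.ModuleList(
            [nn.TransformerEncoderLayer(dim, heads, batch_first=True) for _ in range(depth)]
        )
        self.xi = nn.Parameter(torch.randn(dim))  # task-specific query for feature selection
        self.feedback = nn.ModuleList([nn.Linear(dim, dim) for _ in range(depth)])

    def forward(self, x):
        # Step (i): plain bottom-up feed-forward pass through the backbone
        h = x
        for blk in self.blocks:
            h = blk(h)
        # Step (ii): feature selection - reweight output tokens by
        # similarity to the task query (sigmoid gate is an assumption)
        gate = torch.sigmoid(h @ self.xi)          # (batch, tokens)
        td = h * gate.unsqueeze(-1)
        # Step (iii): feedback path carries the top-down signal back down
        for fb in reversed(self.feedback):
            td = fb(td)
        # Step (iv): second bottom-up pass conditioned on the top-down input
        # (summed with the input here; the paper feeds it into self-attention)
        h2 = x + td
        for blk in self.blocks:
            h2 = blk(h2)
        return h2
```

The key point the sketch shows: the backbone runs twice, and only the second pass sees the top-down signal derived from the first pass.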
Top-down Attention Steering
For TOAST, the model is trained in two stages.
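As I understand the recipe, the pretrained bottom-up backbone stays frozen throughout, and only the top-down components (feedback path, task query, head) are tuned; a first stage pre-tunes them on a general public dataset before the second stage adapts them to the downstream task. A hedged sketch of the freezing logic (the name-matching keywords are illustrative):

```python
import torch.nn as nn

def set_trainable(model, train_top_down_only=True):
    """Freeze the bottom-up backbone and leave only top-down modules
    (and the task head) trainable. Keyword matching is an assumption."""
    for name, p in model.named_parameters():
        if train_top_down_only:
            p.requires_grad = any(k in name for k in ("feedback", "xi", "head"))
        else:
            p.requires_grad = True  # full fine-tuning baseline
```

Both training stages would then call `set_trainable(model)` before optimization, differing only in the dataset used.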
Experiments
Visual classification
Language generation
Different model architectures and tasks
Parameter-Efficient TOAST
Limitations of TOAST