gwleeee / PaperReview


TOAST: Transfer Learning via Attention Steering #16

Open gwleeee opened 11 months ago

gwleeee commented 11 months ago

Paper: https://arxiv.org/abs/2305.15542
Code: https://github.com/bfshi/TOAST

UC Berkeley, Microsoft Research; arXiv preprint, under review

Top-Down Attention Steering

Abstract


+ Other transfer learning methods

LoRA: Low-Rank Adaptation of Large Language Models (ICLR 2022, Microsoft)
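
As a rough illustration of the LoRA idea: the pretrained weight W stays frozen and only a low-rank update B·A is trained, so the effective weight is W + (alpha / r)·B·A. B starts at zero, so training begins from the pretrained behavior. The sketch below uses plain Python lists; the helper names (`matmul`, `transpose`, `lora_forward`) and the toy shapes are assumptions for illustration, not code from the LoRA or TOAST repositories.

```python
def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*m)]


def lora_forward(x, w, a, b, alpha):
    """Compute x @ (W + (alpha / r) * B @ A)^T without merging the weights.

    x: (n, d_in) inputs
    w: (d_out, d_in) frozen pretrained weight
    a: (r, d_in) trainable down-projection
    b: (d_out, r) trainable up-projection (zero at initialization)
    """
    r = len(a)
    scale = alpha / r
    base = matmul(x, transpose(w))                             # frozen path
    low_rank = matmul(matmul(x, transpose(a)), transpose(b))   # trainable path
    return [[p + scale * q for p, q in zip(row_b, row_l)]
            for row_b, row_l in zip(base, low_rank)]
```

With `b` set to all zeros the output equals the frozen layer's output, which is why LoRA training starts exactly at the pretrained model.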

image

VPT: Visual Prompt Tuning (ECCV 2022)
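
A minimal sketch of the shallow VPT input, assuming the layout [CLS, prompts, patches] fed to a frozen backbone. Only the prompt tokens (and a task head, omitted here) would be trained. The function names are hypothetical and tokens are plain lists of floats.

```python
def build_vpt_input(cls_token, prompt_tokens, patch_tokens):
    """Return the sequence [CLS, prompts..., patches...] as a new list."""
    return [cls_token] + list(prompt_tokens) + list(patch_tokens)


def count_prompt_parameters(prompt_tokens):
    """Number of trainable scalars when only the prompts are learned."""
    return sum(len(p) for p in prompt_tokens)
```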

image

image

TOAST: Top-Down Attention Steering


image

Preliminary: Transformer With Top-Down Attention

Step (i): bottom-up transformer (feed forward backbone)

Step (ii): Feature selection
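
A sketch of the feature selection step, assuming it takes the form z̃_i = P · sim(z_i, ξ) · z_i, where sim is cosine similarity clamped to [0, 1], ξ is a task-specific vector and P is a task-specific linear transform. The names `xi`, `p`, and `select_features` are chosen here for illustration; the exact form should be checked against the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity of two vectors; 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0


def select_features(tokens, xi, p):
    """Reweight each output token by its clamped similarity to xi, then apply P.

    tokens: list of d-dimensional output tokens from the first forward pass
    xi:     d-dimensional task vector (assumed learnable)
    p:      (d, d) channel-wise transform (assumed learnable)
    """
    selected = []
    for z in tokens:
        weight = min(1.0, max(0.0, cosine(z, xi)))   # token-wise selection
        scaled = [weight * c for c in z]
        selected.append([sum(pij * sj for pij, sj in zip(row, scaled)) for row in p])
    return selected
```

Tokens aligned with `xi` pass through, while orthogonal or opposing tokens are suppressed to zero before entering the feedback path.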

image

Step (iii): Feedback path

Step (iv): Self-attention with top-down input
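
A sketch of the second forward pass, assuming the top-down signal is added only to the value branch: Q = X·W_Q, K = X·W_K, V = (X + X_td)·W_V. Single-head attention, row-vector convention, plain Python lists; `x_td` stands for the feedback output at this layer and all names are illustrative.

```python
import math


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    m = max(row)
    exps = [math.exp(s - m) for s in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention_with_top_down(x, x_td, w_q, w_k, w_v):
    """Self-attention where only the values see the top-down input x_td."""
    q = matmul(x, w_q)                      # queries: bottom-up tokens only
    k = matmul(x, w_k)                      # keys: bottom-up tokens only
    mixed = [[a + b for a, b in zip(r, t)] for r, t in zip(x, x_td)]
    v = matmul(mixed, w_v)                  # values: bottom-up + top-down
    d = len(q[0])
    scores = [[sum(a * b for a, b in zip(qi, kj)) / math.sqrt(d) for kj in k]
              for qi in q]
    weights = [softmax(r) for r in scores]
    return matmul(weights, v)
```

Because Q and K are unchanged in this sketch, the attention pattern itself is the same as in the bottom-up pass; the top-down input changes what is aggregated.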

Top-Down Attention Steering

TOAST trains the model in two stages
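
A sketch of the two-stage schedule, assuming stage 1 pre-tunes the top-down module on a general dataset and stage 2 tunes it on the downstream task, with the backbone frozen in both stages. The parameter naming convention (`top_down.` prefix), the function names, and the dataset labels are hypothetical.

```python
def split_parameters(param_names, top_down_prefix="top_down."):
    """Split parameter names into (trainable top-down, frozen backbone)."""
    trainable = [n for n in param_names if n.startswith(top_down_prefix)]
    frozen = [n for n in param_names if not n.startswith(top_down_prefix)]
    return trainable, frozen


def toast_schedule(param_names, general_data, downstream_data):
    """Describe both stages: same trainable set, different training data."""
    trainable, frozen = split_parameters(param_names)
    return [
        {"stage": "pre-tuning", "data": general_data,
         "trainable": trainable, "frozen": frozen},
        {"stage": "tuning", "data": downstream_data,
         "trainable": trainable, "frozen": frozen},
    ]
```

In a real framework, the `frozen` list would correspond to parameters excluded from the optimizer or marked as not requiring gradients.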

Experiments


Visual classification

image image

Language generation

image

Different model architectures and tasks

image

image

image

Parameter-Efficient TOAST

image

Limitations of TOAST

image