owenliang / mnist-dits
Diffusion Transformers (DiTs) trained on MNIST dataset
42 stars · 10 forks
Multi-head attention
#2
Open
Walterkd opened 4 months ago
Walterkd commented 4 months ago
The README says "the DiT Block uses 3-head attention" — shouldn't that be 4-head attention? train.py sets head=4.
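For context on why the head count matters: in multi-head attention the embedding is split evenly across the heads, so the embedding size must be divisible by the number of heads. Below is a minimal NumPy sketch of that split with `nhead=4` (as reported for train.py); the sizes here are hypothetical and not taken from the repo's actual config.

```python
import numpy as np

# Hypothetical sizes for illustration; the repo's actual emb_size may differ.
emb_size, nhead, seq_len = 16, 4, 7
assert emb_size % nhead == 0          # embedding must divide evenly across heads
head_dim = emb_size // nhead

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

rng = np.random.default_rng(0)
x = rng.standard_normal((seq_len, emb_size))

# Split the embedding into nhead independent heads: (seq, emb) -> (nhead, seq, head_dim)
q = k = v = x.reshape(seq_len, nhead, head_dim).transpose(1, 0, 2)

# Scaled dot-product attention, computed per head
scores = q @ k.transpose(0, 2, 1) / np.sqrt(head_dim)   # (nhead, seq, seq)
out = softmax(scores) @ v                                # (nhead, seq, head_dim)

# Concatenate the heads back together: (nhead, seq, head_dim) -> (seq, emb)
out = out.transpose(1, 0, 2).reshape(seq_len, emb_size)
print(out.shape)  # (7, 16)
```

With 3 heads and an embedding size that is a multiple of 4, the `emb_size % nhead == 0` check would fail, which is consistent with train.py actually using 4 heads.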