owenliang/mnist-dits
Diffusion Transformers (DiTs) trained on MNIST dataset
55 stars · 11 forks
Multi-head attention
#2
Open
Walterkd opened this issue 6 months ago
The README says "the DiT Block uses 3-head attention", but shouldn't that be 4-head attention? train.py sets head=4.
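For context, a minimal sketch of why the head count matters: multi-head attention splits the embedding into `num_heads` equal chunks, so the embedding size must be divisible by the head count. This is not the repo's actual code; the embedding size (64) and the identity Q/K/V projections are simplifying assumptions, while `num_heads=4` matches the `head=4` reported from train.py.

```python
import numpy as np

def softmax(z, axis=-1):
    # numerically stable softmax along the given axis
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_self_attention(x, num_heads):
    """Naive self-attention (identity Q/K/V projections for brevity),
    split into num_heads heads of emb_size // num_heads dims each."""
    seq_len, emb_size = x.shape
    assert emb_size % num_heads == 0, "emb_size must be divisible by num_heads"
    head_dim = emb_size // num_heads
    # (seq, emb) -> (heads, seq, head_dim)
    h = x.reshape(seq_len, num_heads, head_dim).transpose(1, 0, 2)
    # scaled dot-product attention, computed independently per head
    scores = softmax(h @ h.transpose(0, 2, 1) / np.sqrt(head_dim))
    out = scores @ h                               # (heads, seq, head_dim)
    # merge the heads back into a single embedding
    return out.transpose(1, 0, 2).reshape(seq_len, emb_size)

x = np.random.randn(8, 64)                         # hypothetical emb_size=64
y = multi_head_self_attention(x, num_heads=4)      # head=4, as in train.py
print(y.shape)                                     # (8, 64)
```

Note that with a typical power-of-two embedding size such as 64, `num_heads=3` would not even divide the embedding evenly, which supports reading the README's "3-head" as a typo for 4.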