deep-floyd / IF

Why do the stage-I-x/xl models use GELU instead of the SiLU used in the stage-I-M model? #117

Open Jiashi-Li opened 1 year ago

Jiashi-Li commented 1 year ago

I am curious about this detail. Can anyone explain it?
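
For reference, the two activations in question can be written out from their standard textbook definitions. This is only a minimal sketch in plain Python, not code from the IF repository; the function names `silu` and `gelu` are illustrative, and the GELU shown is the exact erf form (an assumption, since the issue does not say which variant the models use).

```python
import math


def silu(x: float) -> float:
    """SiLU (also called swish): x * sigmoid(x)."""
    # x / (1 + e^-x) is algebraically the same as x * sigmoid(x).
    # Fine for moderate inputs; very negative x would overflow math.exp.
    return x / (1.0 + math.exp(-x))


def gelu(x: float) -> float:
    """Exact GELU: x * Phi(x), where Phi is the standard normal CDF."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


# Print both activations side by side on a few sample inputs.
for x in (-3.0, -1.0, 0.0, 1.0, 3.0):
    print(f"x={x:+.1f}  silu={silu(x):+.4f}  gelu={gelu(x):+.4f}")
```

Both functions are smooth and approach the identity for large positive inputs; the main visible difference is that GELU decays to zero faster for negative inputs.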