bclarkson-code/Tricycle
Autograd to GPT-2 completely from scratch
19 swiglu implementation is incorrect
#20
Closed
bclarkson-code closed 7 months ago
bclarkson-code commented 7 months ago
Added an optionally tunable bias term to swiglu
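For readers unfamiliar with the activation, here is a minimal pure-Python sketch of SwiGLU with optional bias terms. It is not Tricycle's code: the function names (`swish`, `linear`, `swiglu`) and parameters (`W`, `V`, `b`, `c`) are hypothetical. The comment above does not say what form the bias takes, so this assumes the commonly cited formulation SwiGLU(x) = Swish(xW + b) * (xV + c), with the bias terms skipped when they are not supplied.

```python
import math


def swish(z, beta=1.0):
    """Swish activation: z * sigmoid(beta * z)."""
    return z / (1.0 + math.exp(-beta * z))


def linear(x, weights, bias=None):
    """Compute x @ weights (+ bias) for a single input vector x.

    weights is a list of rows, one row per input feature.
    """
    n_out = len(weights[0])
    out = [
        sum(x[i] * weights[i][j] for i in range(len(x)))
        for j in range(n_out)
    ]
    if bias is not None:
        out = [o + b for o, b in zip(out, bias)]
    return out


def swiglu(x, W, V, b=None, c=None):
    """Assumed form: Swish(xW + b) * (xV + c), elementwise product.

    b and c are optional; passing None drops the corresponding bias.
    """
    gate = linear(x, W, b)    # branch passed through Swish
    value = linear(x, V, c)   # branch left linear
    return [swish(g) * v for g, v in zip(gate, value)]


identity = [[1.0, 0.0], [0.0, 1.0]]
no_bias = swiglu([1.0, 2.0], identity, identity)
with_bias = swiglu([1.0, 2.0], identity, identity, c=[1.0, 1.0])
```

In a trainable layer the bias vectors would be parameters that can be frozen or updated; this sketch only covers the forward computation.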