Open attesaarela opened 1 year ago
Use nn.GELU for computing GELU. Makes the optimization run a couple percent faster.
Use nn.GELU for computing GELU. Makes the optimization run a couple percent faster.