PeiyanFlying / PackQViT


Any updates on GeLU activation ? #2

Open Abhranta opened 2 months ago

Abhranta commented 2 months ago

The default activation used here is nn.Hardswish. I cannot find any mention of the GeLU activation function, or of its quantized integer implementation described in the paper. Am I missing something here?

PeiyanFlying commented 2 months ago

Basically, nn.Hardswish has a hardware cost similar to integer GeLU, while its accuracy is better. So at this time we only release the Hardswish version, which outperforms the GeLU version.
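For context on why Hardswish is hardware-friendly: it is defined as `x * clamp(x + 3, 0, 6) / 6`, which needs only an add, a clamp, a multiply, and a constant divide, so no polynomial approximation is required (unlike integer GeLU). A minimal PyTorch sketch verifying this identity (illustrative only, not code from this repo):

```python
import torch
import torch.nn as nn

# Hardswish(x) = x * clamp(x + 3, 0, 6) / 6
# Only add / clamp / multiply / constant divide: easy to map to
# integer arithmetic, unlike GeLU which needs erf or a polynomial fit.
x = torch.linspace(-4.0, 4.0, steps=9)
hswish = nn.Hardswish()(x)
manual = x * torch.clamp(x + 3.0, 0.0, 6.0) / 6.0
print(torch.allclose(hswish, manual))  # the two forms agree
```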