Open Toan-Do opened 1 year ago
Hi @Guangxuan-Xiao, a question about GELU in the Bloom model: did you implement a W8A8B8O8LinearGelu kernel for it, or a custom GELU activation function that handles the 8-bit output of W8A8B8O8Linear?
I'm not sure how they did it, but this commit includes a GELU implementation: https://github.com/Guangxuan-Xiao/torch-int/pull/1/commits/2163a169748edff67586c2bf0158f4c7f0718fc6
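For anyone weighing the second option from the question (a custom GELU applied to the int8 output), here is a rough sketch of the idea: dequantize, apply GELU in floating point, then requantize. This is not the torch-int code and not necessarily what the linked commit does. The function names, the per-tensor symmetric scales `in_scale` and `out_scale`, and the lookup-table variant are all assumptions made for illustration.

```python
import math
from typing import Dict, List


def gelu(x: float) -> float:
    """Exact GELU: x * Phi(x), where Phi is the standard normal CDF."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def quantize_int8(x: float, scale: float) -> int:
    """Symmetric int8 quantization: round(x / scale), clamped to [-128, 127]."""
    return max(-128, min(127, round(x / scale)))


def int8_gelu(q_in: List[int], in_scale: float, out_scale: float) -> List[int]:
    """Dequantize int8 values, apply GELU in float, requantize to int8.

    in_scale and out_scale are hypothetical per-tensor scales.
    """
    return [quantize_int8(gelu(q * in_scale), out_scale) for q in q_in]


def build_gelu_lut(in_scale: float, out_scale: float) -> Dict[int, int]:
    """Precompute the int8 -> int8 mapping for all 256 possible inputs."""
    return {
        q: quantize_int8(gelu(q * in_scale), out_scale)
        for q in range(-128, 128)
    }


if __name__ == "__main__":
    lut = build_gelu_lut(in_scale=0.05, out_scale=0.05)
    print([lut[q] for q in (-100, 0, 100)])
```

Since an int8 input can only take 256 values, the lookup table gives the same result as the per-element path without evaluating `erf` at run time. A fused kernel such as the W8A8B8O8LinearGelu mentioned in the question would instead apply the activation before the linear layer's output is written back as int8.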
Thank you for your great work. I am very interested in int8 Bloom models. Could you please share the code and checkpoints for them?