Open XA23i opened 8 months ago
Interesting work. Since some salient parameters are not binarized, I am curious about the practical speedup compared to floating-point models. Do you use a GPU kernel to accelerate inference?