Open nugabom opened 1 year ago
You seem to work on multiple model compression techniques, such as pruning, quantization, and TT. You are also working on deep-learning compiler techniques. I have a question about your planned mixed-precision inference engine on arm64.