google / qkeras

QKeras: a quantization deep learning library for Tensorflow Keras
Apache License 2.0
533 stars, 102 forks

GPU Inferencing in Qkeras #97

Status: Closed (YogaVicky closed this issue 2 years ago)

YogaVicky commented 2 years ago

Hi there! I am interested in implementing the QKeras example for the MNIST CNN model given in the examples section (Link). This example quantizes the weights and activations to 4 bits (INT4) using the quantized_bits(4,0,1) quantizer for the Conv kernels and activations. Is there any way to run GPU inference by converting the model into something like a TRT engine? This approach is widely used with packages like NVIDIA-QAT, so I suppose there should be a way to do it for QKeras as well.

Thanks, Yoga
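
For readers unfamiliar with the quantizer mentioned above, here is a rough NumPy sketch of what quantized_bits(4,0,1) appears to do in the forward pass. It is not QKeras code: the function name fake_quant_4bit is hypothetical, and the sketch assumes the three positional arguments mean bits=4, integer=0, symmetric=1, that one bit is spent on the sign, and that no learned scale (alpha) is applied. Under those assumptions the values stay floating point but are snapped to a grid with step 1/8 in the range [-7/8, 7/8].

```python
import numpy as np


def fake_quant_4bit(x, bits=4, integer=0, symmetric=1):
    """Approximate forward pass of a signed fixed-point quantizer.

    Hypothetical helper for illustration only. Assumes a sign bit is kept
    and that the scale factor is fixed to 1 (no alpha scaling).
    """
    m = 2.0 ** (bits - 1)        # number of steps per unit (8 for 4 bits)
    m_i = 2.0 ** integer         # integer-part scale (1 when integer=0)
    lo = -1.0 + symmetric / m    # -7/8 when symmetric, -1 otherwise
    hi = 1.0 - 1.0 / m           # +7/8
    return m_i * np.clip(np.round(x / m_i * m) / m, lo, hi)


weights = np.array([-1.3, -0.52, -0.06, 0.0, 0.3, 0.49, 2.0])
q = fake_quant_4bit(weights)

# Integer codes that a 4-bit backend would store, with an implied scale of 1/8.
codes = (q * 8).astype(np.int8)
```

The codes array holds integers between -7 and 7, which is the information an integer inference engine would need together with the scale of 1/8. Whether a given GPU runtime accepts 4-bit weights in this form is a separate question that this sketch does not answer.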