Closed James89045 closed 10 months ago
I want to ask why the weight-quantization part of your model has this limit: `assert self.wbit <= 8 or self.wbit == 32`. If I want to quantize weights with 16 bits, can I change the limit to `assert self.wbit <= 32`? Thank you!
It is okay to change it like that, but I recommend using half-precision (float16) tensors for the 16-bit implementation.
Thank you for your interest!
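To illustrate the suggestion above, here is a minimal sketch of a uniform symmetric weight quantizer that handles the 16-bit case with a float16 cast instead of an integer grid. This is a hypothetical helper written in NumPy for clarity, not the repository's actual code; the function name `quantize_weight` and its signature are assumptions.

```python
import numpy as np

def quantize_weight(w: np.ndarray, wbit: int) -> np.ndarray:
    """Sketch of weight quantization with a relaxed bit-width check.

    wbit <= 8  -> uniform symmetric integer quantization
    wbit == 16 -> cast to float16 (the recommendation above)
    wbit == 32 -> full precision, no-op
    """
    # relaxed version of the original `assert self.wbit <= 8 or self.wbit == 32`
    assert wbit <= 8 or wbit in (16, 32), "unsupported bit-width"
    if wbit == 32:
        return w
    if wbit == 16:
        # half precision covers 16 bits without an explicit integer grid
        return w.astype(np.float16).astype(np.float32)
    # symmetric uniform quantization to a 2**wbit-level grid
    n = 2 ** (wbit - 1) - 1
    scale = np.abs(w).max() / n
    return (np.round(w / scale).clip(-n - 1, n) * scale).astype(np.float32)
```

The float16 path avoids building a 65536-level integer grid: the hardware format already provides a 16-bit representation, which is why half tensors are the simpler choice here.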