lishunde closed this issue 6 years ago
Hi @lishunde, if we use Q0.7 to represent (-0.5, 0.5), then we have only 128 quantized levels between (-0.5, 0.5): fixed-point value -64 corresponds to -0.5 and 64 corresponds to 0.5. To fully utilize the 8-bit range, use Q0.8 to represent (-0.5, 0.5), so that fixed-point -128 corresponds to -0.5 and 128 corresponds to +0.5 (to be precise, the largest 8-bit value is 127, which corresponds to about 0.496). Similarly, (-0.25, 0.25) -> Q0.9, (-0.125, 0.125) -> Q0.10, and so on. See section 4.4 in the paper for more details.
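To make the arithmetic concrete, here is a minimal Python sketch of the mapping described above. It assumes signed 8-bit storage with round-to-nearest and saturation. The helper names (`quantize_q`, `dequantize_q`, `levels_in_range`) are made up for illustration and are not taken from the repository or the paper.

```python
def quantize_q(x, frac_bits, total_bits=8):
    """Map a real value to a signed fixed-point code with `frac_bits` fractional bits.

    Assumption: round-to-nearest, then saturate to the signed `total_bits` range.
    """
    scale = 1 << frac_bits
    int_min = -(1 << (total_bits - 1))      # -128 for 8 bits
    int_max = (1 << (total_bits - 1)) - 1   # 127 for 8 bits
    return max(int_min, min(int_max, round(x * scale)))


def dequantize_q(q, frac_bits):
    """Map a fixed-point code back to the real value it represents."""
    return q / (1 << frac_bits)


def levels_in_range(bound, frac_bits, total_bits=8):
    """Count the distinct codes that represent values in [-bound, bound)."""
    scale = 1 << frac_bits
    int_min = -(1 << (total_bits - 1))
    int_max = (1 << (total_bits - 1)) - 1
    lo = max(int_min, round(-bound * scale))
    hi = min(int_max, round(bound * scale) - 1)
    return hi - lo + 1


if __name__ == "__main__":
    # Q0.7 on (-0.5, 0.5): only half of the 8-bit codes are used.
    print(quantize_q(-0.5, 7), levels_in_range(0.5, 7))
    # Q0.8 on (-0.5, 0.5): the full 8-bit range is used; +0.5 saturates to 127.
    print(quantize_q(-0.5, 8), quantize_q(0.5, 8), levels_in_range(0.5, 8))
    print(dequantize_q(127, 8))
    # Q0.9 on (-0.25, 0.25) also uses the full range.
    print(levels_in_range(0.25, 9))
```

With these assumptions, Q0.7 gives 128 usable levels on (-0.5, 0.5), while Q0.8 gives 256. The code 127 in Q0.8 dequantizes to 0.49609375, which is the 0.496 mentioned above.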
Closing the issue due to inactivity. Please reopen it if you still face the issue.