GATECH-EIC / ShiftAddNet

[NeurIPS 2020] ShiftAddNet: A Hardware-Inspired Deep Network
MIT License

round_to_fixed function #5

Closed msxiaojin closed 3 years ago

msxiaojin commented 3 years ago

Hi, thanks for the great work! I am very interested in it.

However, I am new to quantization and have some questions about the round_to_fixed function in deepshift.utils, lines 7-18.

On line 15, torch.floor(input/delta) rounds the fp32 input down to an integer (a 16-bit integer, as I understand it). In my opinion, the clamp should be applied next, restricting these integers to the range [min_val, max_val] before scaling back by delta. That is, lines 15-17 would change to the following:

    rounded = torch.floor(input/delta)
    rounded = torch.clamp(rounded, min_val, max_val)
    rounded = rounded * delta

Could you comment on the difference between these two implementations? Thanks!
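To make the comparison concrete, here is a minimal sketch of the two orderings on scalars. It uses plain Python (math.floor and a small clamp helper) as a stand-in for the elementwise torch.floor and torch.clamp, and it assumes the existing code multiplies by delta before clamping, which is how the question reads. The function names and the example values of delta, min_val and max_val are hypothetical and not taken from the repository.

```python
import math

def clamp(x, lo, hi):
    """Scalar stand-in for torch.clamp (assumption: same saturating behaviour)."""
    return max(lo, min(hi, x))

def clamp_after_rescale(x, delta, min_val, max_val):
    # Assumed existing ordering: floor to a grid step, scale back to
    # real units, then clamp. The bounds act on real-valued outputs.
    rounded = math.floor(x / delta) * delta
    return clamp(rounded, min_val, max_val)

def clamp_before_rescale(x, delta, min_val, max_val):
    # Ordering proposed in the question: clamp the integer code first,
    # then scale back. The bounds act on integer codes, so the
    # real-valued output range shrinks to [min_val*delta, max_val*delta].
    rounded = math.floor(x / delta)
    rounded = clamp(rounded, min_val, max_val)
    return rounded * delta

# Hypothetical small format so the effect is easy to see.
delta, min_val, max_val = 2.0 ** -4, -8, 7

for x in (0.3, 1.0, 10.0):
    print(x,
          clamp_after_rescale(x, delta, min_val, max_val),
          clamp_before_rescale(x, delta, min_val, max_val))
```

Under these assumptions the two versions agree for small inputs and diverge once the integer code exceeds max_val: the same min_val and max_val are interpreted in real units in one case and in units of delta in the other.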

msxiaojin commented 3 years ago

BTW, may I ask what the range of input is? If input is fp32, the range is very large. Is there an implicit assumption that input lies in [-1, 1]?
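As a rough illustration of how the range question interacts with the clamp, the sketch below computes the bounds of a signed fixed-point format and shows which inputs saturate. The parameter names fraction_bits and integer_bits, and the formulas for delta, min_val and max_val, are assumptions based on the usual signed fixed-point convention; they are not confirmed against the repository, and the sketch says nothing about which range the network's inputs actually take.

```python
import math

def fixed_point_bounds(fraction_bits, integer_bits):
    # Assumed convention: step size 2^-fraction_bits, and real-valued
    # bounds [-2^(integer_bits-1), 2^(integer_bits-1) - 1].
    delta = 2.0 ** -fraction_bits
    bound = 2.0 ** (integer_bits - 1)
    return delta, -bound, bound - 1

def quantize(x, fraction_bits, integer_bits):
    # Floor onto the grid, scale back, then saturate at the bounds.
    delta, min_val, max_val = fixed_point_bounds(fraction_bits, integer_bits)
    rounded = math.floor(x / delta) * delta
    return max(min_val, min(max_val, rounded))

# Hypothetical 16 fraction bits and 16 integer bits.
print(fixed_point_bounds(16, 16))
print(quantize(0.5, 16, 16))    # inside the range, only snapped to the grid
print(quantize(1e6, 16, 16))    # outside the range, saturates at max_val
print(quantize(-1e6, 16, 16))   # outside the range, saturates at min_val
```

With these assumed settings, inputs in [-1, 1] never reach the clamp, while arbitrary fp32 values beyond the bounds are saturated rather than wrapped.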