kssteven418 / I-BERT

[ICML'21 Oral] I-BERT: Integer-only BERT Quantization
https://arxiv.org/abs/2101.01321
MIT License

Possible bug in IntSoftmax #4

Open bdalal opened 3 years ago

bdalal commented 3 years ago

🐛 Bug


I've been trying to add the I-BERT quantization modules to DistilBERT and ran into this issue. In https://github.com/kssteven418/I-BERT/blob/45cb6da621a8c63e9329c14390b84a6a566bdf49/fairseq/quantization/utils/quant_modules.py#L658, `scaling_factor` is a plain Python float and is returned as is. I believe it should be converted to a tensor on the appropriate device before being returned, e.g. `scaling_factor = torch.tensor([1 / 2 ** self.output_bit], device=exp_int.device)`.
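A minimal sketch of the proposed fix, assuming a simplified stand-in for the tail of `IntSoftmax.forward` (the class below is hypothetical; only the names `output_bit`, `exp_int`, and `scaling_factor` come from `quant_modules.py`):

```python
import torch


class IntSoftmaxTail:
    """Hypothetical, simplified stand-in for the end of IntSoftmax.forward,
    illustrating the scaling_factor fix suggested above."""

    def __init__(self, output_bit=8):
        self.output_bit = output_bit

    def finish(self, exp_int):
        # Original code: scaling_factor = 1 / 2 ** self.output_bit
        # That is a plain Python float, so downstream code expecting a
        # tensor (e.g. anything reading .device or .dtype) breaks.
        # Proposed fix: build the tensor on the same device as exp_int.
        scaling_factor = torch.tensor(
            [1 / 2 ** self.output_bit], device=exp_int.device
        )
        return exp_int * scaling_factor, scaling_factor
```

With `output_bit=8`, the returned `scaling_factor` is a one-element tensor holding `1/256` that lives on the same device as the input, so the caller can keep treating it like the other scaling factors in the quantized pipeline.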

Please let me know your thoughts on this. Thanks!