lucidrains / vector-quantize-pytorch

Vector (and Scalar) Quantization, in Pytorch

Support half precision for VQ and FSQ #144

Closed · JunityZhan closed this 4 months ago

JunityZhan commented 4 months ago

Hi, I noticed it does not support other precisions. I made this tiny change, tried a simple example, and it works with fp16 and bf16. I noticed there is an `x = x.float()`, which I just commented out. I don't know if it is necessary; in my experiment everything works fine, and we can pass in whatever precision we like.
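Roughly, the change is just this (a simplified sketch of the idea, not the actual diff):

```python
# simplified sketch of the change (not the exact source): skip the unconditional
# upcast so fp16 / bf16 inputs keep their dtype through quantization
def forward(self, x):
    # x = x.float()   # previously every input was forced to fp32 here
    ...               # quantization then runs in whatever dtype x came in with
    return x
```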

lucidrains commented 4 months ago

@JunityZhan are you sure? researchers are telling me it doesn't perform well at all with low precision.

JunityZhan commented 4 months ago

> @JunityZhan are you sure? researchers are telling me it doesn't perform well at all with low precision.

In training, I think it is better to use fp32. In inference, however, I switch to fp16, and the lower precision does not affect the quantized result at all. For example, say you have 0.53244378278482 in fp32 and 0.5324437 in fp16 (just an example): run them through FSQ and they round to exactly the same number.
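To make that concrete, here is a tiny standalone illustration (not the library's FSQ code, just the rounding idea, with a made-up number of levels):

```python
import torch

levels = 8                                  # illustrative number of levels per dimension
half_width = (levels - 1) / 2

x32 = torch.tensor([0.53244378278482])      # value in fp32
x16 = x32.to(torch.float16)                 # same value stored in fp16 (loses trailing digits)

# snap each value to the nearest of `levels` evenly spaced points in [-1, 1];
# x16 is upcast only so the comparison isolates the fp16 storage error
q32 = torch.round(x32.clamp(-1, 1) * half_width) / half_width
q16 = torch.round(x16.float().clamp(-1, 1) * half_width) / half_width

print(q32, q16, torch.equal(q32, q16))      # both snap to the same quantized value -> True
```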

JunityZhan commented 4 months ago

btw, it fixes #145

lucidrains commented 4 months ago

@JunityZhan ok, if what you are saying is true, then i can think about it

but your fix will break your own condition, "In training, I think it is better to train it with fp32", correct?

JunityZhan commented 4 months ago

@lucidrains I am not sure I understand what you mean about my fix breaking my own condition. If I don't make that change, I cannot run inference in fp16. The change just makes sure we can choose whatever precision we like: training in fp16 or fp32 and inference in fp16 or fp32 are all allowed. BTW, even though in my tests it is better to train with fp32, it is still possible that some tasks can use fp16, bf16, or another precision, not to mention the need to run inference in fp16.

lucidrains commented 4 months ago

@JunityZhan right right. the difference is whether i always enforce f32 during training so that people (non researchers) have a greater chance of success using the library, or allow for flexibility. let me think about it

lucidrains commented 4 months ago

@JunityZhan do you want to see if setting this to False works for you?
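for example, something like this (the keyword name below is only illustrative; the exact argument name may differ):

```python
import torch
from vector_quantize_pytorch import LFQ

quantizer = LFQ(
    codebook_size = 65536,
    dim = 16,
    force_quantization_f32 = False   # illustrative keyword: turn off the forced fp32 cast
)

# half precision features go straight through without an explicit .float()
image_feats = torch.randn(1, 16, 32, 32, dtype = torch.float16)
quantized, indices, entropy_aux_loss = quantizer(image_feats)
```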

JunityZhan commented 4 months ago

> @JunityZhan do you want to see if setting this to False works for you?

I think you only made the modification for lookup free quantization, not for VQ and FSQ.

lucidrains commented 4 months ago

@JunityZhan ahh yes, you wanted to do FSQ, let me apply that as well

lucidrains commented 4 months ago

@JunityZhan ok, try it now for FSQ and if it works out, i'll do the same strat for vq
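something like this should be enough to check (README-style FSQ usage, with the fp16 input being the part under test):

```python
import torch
from vector_quantize_pytorch import FSQ

quantizer = FSQ(levels = [8, 5, 5, 5])

# feed a half precision tensor straight through, no manual upcast to fp32
x = torch.randn(1, 1024, 4, dtype = torch.float16)
xhat, indices = quantizer(x)

print(xhat.dtype, indices.dtype)   # inspect what dtypes come back for the quantized output and codes
```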

lucidrains commented 4 months ago

@JunityZhan ok, Marco reports it is working, closing