@JunityZhan are you sure? researchers are telling me it doesn't perform well at all with low precision.
In training, I think it is better to train with fp32. However, for inference I change it to fp16, and fp16 precision does not affect the quantized result at all. For example, say you have 0.53244378278482 in fp32 and 0.5324437 in fp16 (just an example): run them through FSQ and they round to exactly the same number.
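A minimal sketch of that claim (illustrative only, not the library's actual FSQ code): FSQ rounds each dimension to one of a handful of integer levels, so the tiny fp32/fp16 difference vanishes in the rounding.

```python
import torch

def fsq_code(x, levels=8):
    # stand-in for FSQ's per-dimension rounding: scale onto the level grid,
    # then round to the nearest integer code
    half = (levels - 1) / 2
    return torch.round(x * half)

x32 = torch.tensor([0.53244378278482], dtype=torch.float32)
x16 = x32.to(torch.float16)  # loses precision, becomes roughly 0.5322

print(fsq_code(x32))  # tensor([2.])
print(fsq_code(x16))  # tensor([2.], dtype=torch.float16) -- same code
```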
btw, it fixes #145
@JunityZhan ok, if what you are saying is true, then i can think about it
but your fix will break for your own condition: "In training, I think it is better to train it with fp32." correct?
@lucidrains I am not sure I understand what you mean by my fix breaking for my own condition. If I don't make that change, I cannot run inference with fp16. The change just makes sure we can choose whatever precision we like: train with fp16 or fp32, run inference with fp16 or fp32, all of it is allowed. And even though in my tests it is better to train with fp32, it is still possible that some tasks can use fp16, bf16, or another precision, not to mention the need to run inference with fp16.
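As an illustration of the flexibility being asked for, a hedged sketch using the library's FSQ module (levels and shapes follow the README-style example; whether every dtype actually passes through end to end is exactly what this change is about):

```python
import torch
from vector_quantize_pytorch import FSQ

# try the quantizer under several precisions; with the proposed change none of
# these should hit an unconditional upcast to fp32
for dtype in (torch.float32, torch.float16, torch.bfloat16):
    quantizer = FSQ(levels=[8, 5, 5, 5]).to(dtype)
    x = torch.randn(1, 1024, 4, dtype=dtype)  # last dim = len(levels)
    xhat, indices = quantizer(x)
    print(dtype, xhat.dtype, indices.shape)
```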
@JunityZhan right right. the difference is whether i always enforce f32 during training so that people (non researchers) have a greater chance of success using the library, or allow for flexibility. let me think about it
@JunityZhan do you want to see if setting this to False works for you?
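The "this" above was a link in the original thread and its target is not visible here; presumably it points at a flag that forces the quantization math to run in fp32. A minimal sketch of that pattern, with a made-up flag name (force_f32 is hypothetical, not necessarily what the library calls it):

```python
import torch
from torch import nn

class ToyQuantizer(nn.Module):
    # illustrative only; not the library's actual class or flag name
    def __init__(self, force_f32: bool = True):
        super().__init__()
        self.force_f32 = force_f32

    def forward(self, x):
        orig_dtype = x.dtype
        if self.force_f32:
            x = x.float()              # always quantize in fp32 for safety
        codes = torch.round(x)         # stand-in for the real quantization step
        return codes.to(orig_dtype)    # hand back the caller's dtype
```

With the flag left True the rounding always happens in fp32, whatever dtype comes in; with it set False an fp16 or bf16 input stays in low precision end to end, which is what the fp16 inference case above needs.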
I think you only made the modification to lookup free quantization, but not to VQ and FSQ
@JunityZhan ahh yes, you wanted to do FSQ, let me apply that as well
@JunityZhan ok, try it now for FSQ and if it works out, i'll do the same strat for vq
@JunityZhan ok, Marco reports it is working, closing
Hi, I noticed it does not support other precisions. I made this tiny change, tried a simple sample, and it works with fp16 and bf16. I noticed that there is an x = x.float(); I just commented it out. I don't know if it is necessary. In my experiment it works fine, and we can pass any precision into it.
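For context, a rough sketch of the kind of change described (not the actual diff from this PR): drop the unconditional upcast so the caller's dtype flows through.

```python
import torch

def quantize_step(x):
    # x = x.float()        # the unconditional upcast described above, now commented out
    return torch.round(x)  # stand-in for the actual quantization math; keeps x's dtype

print(quantize_step(torch.randn(4, dtype=torch.float16)).dtype)   # torch.float16
print(quantize_step(torch.randn(4, dtype=torch.bfloat16)).dtype)  # torch.bfloat16
```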