Closed xyzhang626 closed 5 months ago
@xyzhang626 ah, let us just do the quantization in float32 for now
can you let me know if 1.12.10 works?
@lucidrains I just tried it, 1.12.10 works for bf16.
Just curious, is there any specific reason for using fp32 in the quantization?
@xyzhang626 just being cautious, as in vector quantization it makes a difference. if in a residual setup, probably still matters too depending on how many residual layers
in standalone LFQ, not sure! i'd welcome any experiments showing f16 works fine, in which case i'll remove the restriction
Cool that makes sense. In my experiment, bf16 works well and fp16 seems fine until GAN loss is added.
@xyzhang626 oh interesting, do you mean the adversarial loss from the VQ-GAN VAE setup? bf16 was fine?
bf16, if i'm not mistaken, has lower precision, so have to be cautious in a residual quantization setup
Yes, in my VQGAN experiment bf16 is fine but fp16 is not. bf16 has lower precision but a larger range compared to fp16; I think most LLMs today are trained with bf16.
I have no experience with residual quantization, so it's surely reasonable to keep full precision there to be cautious. If casting to full precision in the quantization step doesn't noticeably cost anything under half-precision network training, that's totally a good choice.
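For reference, the range-vs-precision trade-off is easy to check with `torch.finfo` (a quick illustration, not from the thread):

```python
import torch

# fp16 overflows past ~65k, which is easy to hit once large losses (e.g. GAN) appear
print(torch.finfo(torch.float16).max)    # 65504.0
# bf16 keeps float32's 8-bit exponent, so its range matches fp32 (~3.4e38)
print(torch.finfo(torch.bfloat16).max)
# but bf16 only has 7 mantissa bits vs fp16's 10, so its precision is coarser
print(torch.finfo(torch.float16).eps)    # 2**-10 = 0.0009765625
print(torch.finfo(torch.bfloat16).eps)   # 2**-7  = 0.0078125
```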
@xyzhang626 thanks for clarifying! let me look into disabling autocast but only for f16
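A minimal sketch of that idea (hypothetical helper, not the library's actual code): run the quantization math in float32 with autocast disabled, then cast the result back to the caller's dtype.

```python
import torch

def quantize_in_fp32(x, codebook):
    # hypothetical sketch: do the nearest-codebook lookup in float32,
    # with autocast disabled, regardless of the surrounding precision
    orig_dtype = x.dtype
    with torch.autocast(device_type=x.device.type, enabled=False):
        x32, cb32 = x.float(), codebook.float()
        dists = torch.cdist(x32, cb32)   # (n, num_codes) pairwise distances
        indices = dists.argmin(dim=-1)   # nearest code per input vector
        quantized = cb32[indices]
    # hand the result back in the dtype the network is running in
    return quantized.to(orig_dtype), indices
```

To disable autocast only for f16, the `enabled=` flag could instead be conditioned on `x.dtype == torch.float16`.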
The current version under bf16 training will trigger the following mismatched-type error in einsum.
Adding an explicit cast solves it.
Actually this should be covered by PyTorch's autocast mechanism, but somehow autocast does not work as expected here. There might be a more elegant fix that directly triggers the autocast.
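Outside of an active autocast region, einsum does not type-promote its inputs, so mixing bf16 and fp32 raises a dtype-mismatch error; a quick repro with an explicit cast as the fix (illustrative, not the exact code from the PR):

```python
import torch

a = torch.randn(2, 3, dtype=torch.bfloat16)
b = torch.randn(3, 4, dtype=torch.float32)

# einsum requires matching dtypes when autocast is not active
try:
    torch.einsum('ij,jk->ik', a, b)
    mismatched = False
except RuntimeError:
    mismatched = True  # "expected scalar type ..." style error

# an explicit cast makes the dtypes agree and the op succeeds
out = torch.einsum('ij,jk->ik', a.to(b.dtype), b)
```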