Closed by YoucanBaby 6 months ago
xformers should work (I tested with torch 2.0), but I haven't tested 8_bit_adam.
Dear @xiaohu2015 ,
I see that your training code does not contain the line `unet.to(accelerator.device, dtype=weight_dtype)`. Why do you omit it?
When I add `unet.to(accelerator.device, dtype=weight_dtype)`, I get the error: `ValueError: Attempting to unscale FP16 gradients.`
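For context, this error is the usual reason the line is omitted: `GradScaler` refuses to unscale gradients that are stored in fp16. The common pattern in mixed-precision training scripts is to cast only the *frozen* models to `weight_dtype` and keep the trainable model in fp32, letting autocast handle precision in the forward pass. A minimal sketch (the `vae`/`text_encoder`/`unet` modules below are stand-ins, not the real models):

```python
import torch

weight_dtype = torch.float16

# Frozen models can safely be cast to half precision to save memory.
vae = torch.nn.Linear(4, 4).requires_grad_(False)
text_encoder = torch.nn.Linear(4, 4).requires_grad_(False)
vae.to(dtype=weight_dtype)
text_encoder.to(dtype=weight_dtype)

# The trainable unet stays in fp32: GradScaler cannot unscale fp16
# gradients, which is exactly the ValueError seen above. Autocast
# performs the fp16 computation in the forward pass instead.
unet = torch.nn.Linear(4, 4)  # trainable; default dtype is float32
```

This is why adding `unet.to(..., dtype=weight_dtype)` triggers the error: it moves the trainable parameters (and hence their gradients) to fp16.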
Sincerely, Yifang
Dear authors,
Why not use xformers and 8_bit_adam to speed up training?
Do they cause performance degradation?
Sincerely, Yifang
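For anyone else wondering, both optimizations are opt-in and are typically enabled like this in diffusers-style training scripts (a hedged sketch; `unet` is a stand-in here, and both features are guarded so the code degrades gracefully when the libraries are absent):

```python
import torch

unet = torch.nn.Linear(8, 8)  # stand-in for the real UNet2DConditionModel

# xformers: diffusers models expose this method to switch to
# memory-efficient attention; a plain nn.Module does not have it.
if hasattr(unet, "enable_xformers_memory_efficient_attention"):
    unet.enable_xformers_memory_efficient_attention()

# 8-bit Adam: use bitsandbytes if it is installed, otherwise fall
# back to the standard fp32 AdamW optimizer.
try:
    import bitsandbytes as bnb
    optimizer_cls = bnb.optim.AdamW8bit
except ImportError:
    optimizer_cls = torch.optim.AdamW

optimizer = optimizer_cls(unet.parameters(), lr=1e-4)
```

Neither should change model quality in a meaningful way; they mainly reduce memory use and speed up training.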