Closed · Liujingxiu23 closed this issue 1 year ago
@Liujingxiu23 this complex valued discriminator is going to be the end of me lol
so i believe the complex network needs to be done at full precision - i think i'll just force an autocast here. will get it done later today
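Forcing the complex conv to full precision could look something like the sketch below (a hedged illustration, not the repo's actual code: the class name, shapes, and parameter layout are assumptions). The idea is to disable autocast locally and upcast the input, so the complex conv always runs in float32 even inside an fp16 autocast region.

```python
import torch
import torch.nn.functional as F
from torch import nn

class ComplexConv2d(nn.Module):
    def __init__(self, dim_in, dim_out, kernel_size, stride=1, padding=0):
        super().__init__()
        # real and imaginary parts stored stacked in the trailing dim,
        # viewed as a single complex tensor in forward
        self.weight = nn.Parameter(torch.randn(dim_out, dim_in, kernel_size, kernel_size, 2))
        self.bias = nn.Parameter(torch.zeros(dim_out, 2))
        self.stride = stride
        self.padding = padding

    def forward(self, x):
        # disable autocast so the complex conv always runs at full precision
        with torch.autocast(device_type=x.device.type, enabled=False):
            weight, bias = map(torch.view_as_complex, (self.weight, self.bias))
            x = x.to(weight.dtype)  # ComplexHalf input -> ComplexFloat
            return F.conv2d(x, weight, bias, stride=self.stride, padding=self.padding)
```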
@Liujingxiu23 could you try 0.23.7 and see if that addresses the issue?
@lucidrains Thank you! It works. Also, my training run that does not use fp16 has been going for several days, and the synthesized audio is only just intelligible.
hmm let's move this to discussion, as the original issue has been solved
loss = nan when using fp16
Seconding: training SoundStream with fp16 mixed precision is still an outstanding issue on 0.25.5. I see real values for all losses except multi_spectral_recon_loss, which is inf at step 0 and quickly becomes nan.
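A tiny illustration of the suspected cause (this is an assumption, not verified against the training code): fp16's maximum representable value is 65504, so squared spectrogram magnitudes easily overflow to inf, and subsequent arithmetic on inf produces nan.

```python
import torch

x = torch.tensor([300.0], dtype=torch.float16)
print(torch.finfo(torch.float16).max)  # 65504.0
print(x * x)          # 90000 > 65504, so the fp16 result overflows to inf
print(x * x - x * x)  # inf - inf = nan
```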
Run command:

```
accelerate launch --multi_gpu --mixed_precision=fp16 --gpu_ids=0,1 train.py
```

Error info:

```
Input type (CUDAComplexHalfType) and weight type (CUDAComplexFloatType) should be the same
```

Relevant code:

```python
def forward(self, x):
    weight, bias = map(torch.view_as_complex, (self.weight, self.bias))
    return F.conv2d(x, weight, bias, stride = self.stride, padding = self.padding)
```
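One possible workaround (a sketch; `complex_conv2d_fp32` is a hypothetical helper, not a function from the repo): upcast the incoming ComplexHalf activation to match the ComplexFloat weights before convolving, so the input and weight dtypes agree under fp16 autocast.

```python
import torch
import torch.nn.functional as F

def complex_conv2d_fp32(x, weight_ri, bias_ri, stride=1, padding=0):
    # weight_ri / bias_ri store real and imaginary parts in the trailing dim
    weight, bias = map(torch.view_as_complex, (weight_ri, bias_ri))
    x = x.to(weight.dtype)  # ComplexHalf -> ComplexFloat, avoids the dtype mismatch
    return F.conv2d(x, weight, bias, stride=stride, padding=padding)
```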