Closed by sarperkilic 2 days ago
Could you test pull request #133?
I will test it now, but I also encounter this problem when I don't validate.
I changed the condition that enters the validation code like this, so the network is no longer validated after the first iteration:
if global_step % cfg.val.validation_steps == 0:  # or global_step == 1:
In the first iteration the network produces results as expected, and in the second iteration it gives me NaN.
It works now, thanks @xumingw
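For anyone reading along, here is a minimal sketch of which steps that condition selects once the `global_step == 1` clause is commented out. The helper name `should_validate` and the value `validation_steps = 100` are assumptions for illustration only; they are not taken from the project's config.

```python
def should_validate(global_step: int, validation_steps: int) -> bool:
    # Same modulo check as the condition quoted above,
    # with the `global_step == 1` clause disabled.
    return global_step % validation_steps == 0


validation_steps = 100  # assumed value, not from the thread

# Collect every step in 1..300 that would trigger validation.
triggered = [s for s in range(1, 301) if should_validate(s, validation_steps)]
print(triggered)
```

With this check step 1 no longer triggers validation, so a NaN that still shows up in the second iteration would have to come from somewhere other than the validation pass. Note that a step counter starting at 0 would still trigger validation at step 0, since `0 % n == 0`.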
Hi,
I started training stage-1.
In the first iteration everything is fine, but after the first iteration each model outputs only NaN values. What could be the reason?
This gives me NaN:
face_emb = self.imageproj(face_emb)
This gives me NaN too:
self.reference_unet(
    ref_image_latents,
    ref_timesteps,
    encoder_hidden_states=face_emb,
    return_dict=False,
)
In the first iteration, each model works fine.
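When outputs are fine in the first iteration and NaN from the second one on, one possible cause (an assumption, not something confirmed in this thread) is that the first optimizer step wrote NaN or Inf into the weights. A rough way to check is to scan the parameters right after `optimizer.step()`. The sketch below uses a small `nn.Linear` as a stand-in for the real modules, and `first_non_finite` is a hypothetical helper, not part of the project.

```python
import torch
from torch import nn


def first_non_finite(named_tensors):
    """Return the name of the first tensor containing NaN or Inf, else None."""
    for name, tensor in named_tensors:
        if tensor is not None and not torch.isfinite(tensor).all():
            return name
    return None


# Stand-in module for illustration; not the project's code.
model = nn.Linear(4, 2)
assert first_non_finite(model.named_parameters()) is None

# Simulate an update that corrupted one weight entry.
with torch.no_grad():
    model.weight[0, 0] = float("nan")

# The scan now reports the corrupted parameter by name.
bad_param = first_non_finite(model.named_parameters())

# A single NaN weight is enough to make the forward output NaN.
out = model(torch.ones(1, 4))
```

Running the same scan over `self.imageproj.named_parameters()` and `self.reference_unet.named_parameters()` after the first step would show whether the weights themselves are already non-finite before the second forward pass.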