fudan-generative-vision / hallo

Hallo: Hierarchical Audio-Driven Visual Synthesis for Portrait Image Animation
https://fudan-generative-vision.github.io/hallo/
MIT License

I got Nan values during the training stage-1 #135

Closed sarperkilic closed 2 days ago

sarperkilic commented 2 days ago

Hi,

I started training stage-1.

The first iteration runs fine, but after it every model outputs only NaN values. What could be the reason?

This gives me NaN:

face_emb = self.imageproj(face_emb)

This also gives me NaN:

self.reference_unet(
    ref_image_latents,
    ref_timesteps,
    encoder_hidden_states=face_emb,
    return_dict=False,
)

In the first iteration, each model works fine.
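
One way to narrow down a report like this is to record the output of each stage in order and find the first one that contains NaN or inf. The following is only a minimal, framework-free sketch: `first_non_finite` and the sample values are hypothetical and not part of the hallo code base, and the stage names simply mirror the calls quoted above. With PyTorch tensors, the equivalent per-tensor check would be `torch.isfinite(t).all()`.

```python
import math


def first_non_finite(stages):
    """Return the name of the first stage whose values contain NaN or inf.

    `stages` is an ordered list of (name, values) pairs, where `values`
    is a flat iterable of floats. Returns None if every stage is finite.
    """
    for name, values in stages:
        if any(not math.isfinite(v) for v in values):
            return name
    return None


# Hypothetical flattened outputs recorded during the second iteration.
stages = [
    ("face_emb_input", [0.12, -0.5, 0.33]),
    ("imageproj", [float("nan"), 0.1, 0.2]),
    ("reference_unet", [float("nan"), float("nan"), float("nan")]),
]

print(first_non_finite(stages))  # prints "imageproj"
```

If the input to a layer is finite but its output is not, one plausible reading is that the layer's weights became non-finite during the preceding optimizer step, so checking the parameters right after the first update may be the next thing to try.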

xumingw commented 2 days ago

Could you test pull request #133?

sarperkilic commented 2 days ago

I will test it now, but I also encounter this problem when I don't run validation.

I changed the condition that guards the validation code as shown below, so the network is no longer validated after the first iteration:

if global_step % cfg.val.validation_steps == 0:  # or global_step == 1:

In the first iteration the network produces the expected result, and in the second iteration it gives me NaN.
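
The effect of commenting out `or global_step == 1` can be sketched as a small predicate. This is an illustration only: `should_validate` and its `validate_on_first_step` flag are hypothetical names, and the interval of 500 is an assumed stand-in for `cfg.val.validation_steps`, whose real value is not given in this thread.

```python
def should_validate(global_step, validation_steps, validate_on_first_step=False):
    """Decide whether to run validation at this training step.

    With validate_on_first_step=True this mirrors the condition
    `global_step % validation_steps == 0 or global_step == 1`;
    with False it mirrors the version where the second clause is commented out.
    """
    if validate_on_first_step and global_step == 1:
        return True
    return global_step % validation_steps == 0


# Assumed interval, standing in for cfg.val.validation_steps.
VALIDATION_STEPS = 500

for step in (1, 2, 500):
    print(step, should_validate(step, VALIDATION_STEPS),
          should_validate(step, VALIDATION_STEPS, validate_on_first_step=True))
```

With the flag off, step 1 no longer triggers validation, which matches the modified condition above. Since the NaN values still appear at step 2 in that setup, validation at step 1 is not required to reproduce the problem.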

sarperkilic commented 2 days ago

it works now, thanks @xumingw