openai / consistency_models

Official repo for consistency models.
MIT License

Why use teacher model in Consistency training #53

Closed: CellNw closed this issue 5 months ago

CellNw commented 7 months ago

In launch.sh, the section "Consistency training on class-conditional ImageNet-64, and LSUN 256" contains this command:

mpiexec -n 8 python cm_train.py --training_mode consistency_training ... --teacher_model_path /path/to/edm_bedroom256_ema.pt ...

So why is a teacher model used in consistency training? In my understanding, consistency training trains the model in isolation, without a pretrained teacher. Am I misunderstanding something? Here is the Consistency Training (CT) algorithm from the paper: [image]

dome272 commented 7 months ago

It was probably just left in the command by mistake. Looking at the code, teacher_model is not used when the training mode is consistency_training: https://github.com/openai/consistency_models/blob/e32b69ee436d518377db86fb2127a3972d0d8716/cm/train_util.py#L488

Only the model and the target model are passed to the loss calculation: https://github.com/openai/consistency_models/blob/e32b69ee436d518377db86fb2127a3972d0d8716/cm/karras_diffusion.py#L106 And I believe target_model is just the EMA of model (?)
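For readers unfamiliar with the term, "target_model is the EMA of model" would mean roughly the following. This is a minimal sketch on plain Python floats, not the repository's code: `ema_update`, `rate`, and the sample values are hypothetical names chosen for illustration.

```python
def ema_update(target_params, source_params, rate):
    """Move each target parameter toward the matching online parameter, in place.

    Hypothetical helper: `rate` close to 1.0 means the target changes slowly.
    """
    if len(target_params) != len(source_params):
        raise ValueError("parameter lists differ in length")
    for i, (tgt, src) in enumerate(zip(target_params, source_params)):
        target_params[i] = rate * tgt + (1.0 - rate) * src
    return target_params


online = [1.0, -2.0, 0.5]           # stand-in for the online model's weights
target = list(online)               # target starts as a copy of the online model
online = [p + 0.1 for p in online]  # pretend one optimizer step moved the weights
ema_update(target, online, rate=0.9)
```

With `rate=0.9` the target moves one tenth of the way toward the new online weights, so it lags behind the model it is averaging.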

CellNw commented 6 months ago

Thanks for the reply. I checked the code you mentioned: https://github.com/openai/consistency_models/blob/e32b69ee436d518377db86fb2127a3972d0d8716/cm/karras_diffusion.py#L106

Both euler_solver and heun_solver reference teacher_model. heun_solver: https://github.com/openai/consistency_models/blob/e32b69ee436d518377db86fb2127a3972d0d8716/cm/karras_diffusion.py#L148 euler_solver: https://github.com/openai/consistency_models/blob/e32b69ee436d518377db86fb2127a3972d0d8716/cm/karras_diffusion.py#L168

Those solvers are used to generate 'x_t2': https://github.com/openai/consistency_models/blob/e32b69ee436d518377db86fb2127a3972d0d8716/cm/karras_diffusion.py#L193-L196

Then 'x_t2' is passed to target_model, and the result goes into the loss calculation: https://github.com/openai/consistency_models/blob/e32b69ee436d518377db86fb2127a3972d0d8716/cm/karras_diffusion.py#L199
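One way to see why 'x_t2' need not depend on a teacher: if the solver substitutes the clean sample x0 for the teacher's denoised output when no teacher is given (an assumption about the linked code, worth checking at the pinned commit), a single Euler step reproduces the CT construction from the paper. The sketch below uses plain Python lists; `euler_step` and all values are hypothetical, not the repository's functions.

```python
import math
import random


def euler_step(x_t, t, next_t, denoised):
    """One Euler step of the probability-flow ODE dx/dt = (x - denoised) / t."""
    return [x + (x - d) / t * (next_t - t) for x, d in zip(x_t, denoised)]


random.seed(0)
x0 = [0.3, -1.2, 0.8]                          # clean sample (stand-in for an image)
noise = [random.gauss(0.0, 1.0) for _ in x0]   # one shared noise draw
t, t2 = 2.0, 1.5                               # adjacent noise levels, t > t2

x_t = [x + t * z for x, z in zip(x0, noise)]
# No teacher: use the clean sample itself as the "denoised" estimate.
x_t2 = euler_step(x_t, t, t2, denoised=x0)

# The step should land on the same sample noised at the lower level t2.
expected = [x + t2 * z for x, z in zip(x0, noise)]
same = all(math.isclose(a, b, abs_tol=1e-12) for a, b in zip(x_t2, expected))
```

Under that assumption the teacher drops out entirely: the step lands on `x0 + t2 * noise`, which is the pair of noisy points the CT algorithm builds from a single noise draw.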

ljw919 commented 3 months ago

Thank you. So does the consistency training mode need the teacher model path?

ljw919 commented 3 months ago

I also hit a "shape mismatch" error during the EMA update of the target model. Do you know why?

CellNw commented 3 months ago

Thank you. So does the consistency training mode need the teacher model path?

No, consistency training mode doesn't need the teacher model.

szh404 commented 2 weeks ago

I also hit a "shape mismatch" error during the EMA update of the target model. Do you know why?

Have you solved the problem? I also hit this error.