Open davidz123 opened 3 weeks ago
Hi. Thank you for your interest in our work! Unfortunately, due to JD.com's policies and the use of internal APIs during the training process, we are unable to upload the original training code. However, you may refer to the ReFL training code available at https://github.com/THUDM/ImageReward/blob/main/ImageReward/ReFL.py. Our training approach is broadly similar, with the additional requirement of implementing classifier-free guidance, as detailed in the following link: https://github.com/huggingface/diffusers/blob/c977966502b70f4758c83ee5a855b48398042b03/src/diffusers/pipelines/controlnet/pipeline_controlnet_inpaint.py#L1433:
```python
noise_pred_uncond, noise_pred_text = noise_pred.chunk(2)
noise_pred = noise_pred_uncond + guidance_scale * (noise_pred_text - noise_pred_uncond)
```
Additionally, our $L_{CC}$ can be formulated as follows:
```python
l_cc = ((noise_pred_text - noise_pred_uncond) - (teacher_noise_pred_text - teacher_noise_pred_uncond)) ** 2
```
This requires maintaining a frozen teacher model, which produces `teacher_noise_pred_text` and `teacher_noise_pred_uncond`. If you have any further questions, please feel free to contact me at dzb99@hust.edu.cn.
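For readers who want to try this out, the two snippets above can be put together roughly as follows. This is only a minimal sketch, not the original training code: the function names `cfg_combine` and `cc_loss` are hypothetical, the random tensors stand in for real UNet outputs, and the mean reduction over the squared difference is an assumption (the formula above gives only the elementwise square).

```python
import torch
import torch.nn.functional as F


def cfg_combine(noise_pred: torch.Tensor, guidance_scale: float) -> torch.Tensor:
    """Classifier-free guidance on a batch stacked as [uncond, text]."""
    noise_pred_uncond, noise_pred_text = noise_pred.chunk(2)
    return noise_pred_uncond + guidance_scale * (noise_pred_text - noise_pred_uncond)


def cc_loss(noise_pred: torch.Tensor, teacher_noise_pred: torch.Tensor) -> torch.Tensor:
    """L_CC: match the student's (text - uncond) gap to the teacher's gap."""
    noise_pred_uncond, noise_pred_text = noise_pred.chunk(2)
    # The teacher is frozen, so its prediction is detached from the graph.
    teacher_uncond, teacher_text = teacher_noise_pred.detach().chunk(2)
    # Assumption: reduce the squared difference with a mean.
    return F.mse_loss(noise_pred_text - noise_pred_uncond, teacher_text - teacher_uncond)


# Stand-in tensors: 2 unconditional + 2 text-conditioned latent predictions.
torch.manual_seed(0)
student_pred = torch.randn(4, 4, 8, 8, requires_grad=True)
teacher_pred = torch.randn(4, 4, 8, 8)

guided = cfg_combine(student_pred, guidance_scale=7.5)
l_cc = cc_loss(student_pred, teacher_pred)
l_cc.backward()  # gradients reach the student only
```

In a real setup the teacher would be a frozen copy of the pretrained model, run under `torch.no_grad()`. How `l_cc` is weighted against the reward loss is not stated here.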
Great job! I would like to ask if there are any plans to release the training code.