Closed huayecaibcc closed 2 months ago
Sorry, I checked the code again and found that the two guidance parameters are only used together in the x-flux code. The true_gs parameter does not exist in BFL's flux code. I'll leave this question up for anyone who has the same doubt. In the BFL code, this guidance is effectively the CFG scale, but during distillation it is turned into an embedding so that the student model learns to reproduce the teacher model's output as the CFG scale is varied. As a result, once distillation is complete, inference with flux-dev does not need two forward passes per step (conditional and unconditional) to get the result, which speeds up the whole inference process. At the same time, one x-flux commit changed true_gs from 4 to 1, which is presumably for this reason.
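To make the difference concrete, here is a minimal sketch in plain Python. CountingModel, true_cfg_step, and distilled_step are hypothetical names, not code from BFL flux or x-flux, and I am assuming true_gs scales the standard CFG combination uncond + scale * (cond - uncond). The toy model is built so that both paths give the same answer; a real distilled model only approximates its teacher.

```python
from typing import List

Vector = List[float]


class CountingModel:
    """Hypothetical stand-in for the denoiser; counts forward passes."""

    def __init__(self) -> None:
        self.calls = 0

    def __call__(self, x: Vector, cond: Vector, guidance: float) -> Vector:
        self.calls += 1
        # Toy behaviour: the conditioning shifts x, scaled by the guidance input.
        return [xi + guidance * ci for xi, ci in zip(x, cond)]


def true_cfg_step(model: CountingModel, x: Vector, cond: Vector,
                  neg_cond: Vector, true_gs: float) -> Vector:
    """Classic CFG (assumed form): two forward passes per denoising step."""
    pred = model(x, cond, 1.0)          # conditional pass
    neg_pred = model(x, neg_cond, 1.0)  # unconditional / negative pass
    return [n + true_gs * (p - n) for p, n in zip(pred, neg_pred)]


def distilled_step(model: CountingModel, x: Vector, cond: Vector,
                   guidance: float) -> Vector:
    """Guidance-distilled: the scale is a model input, one forward pass."""
    return model(x, cond, guidance)
```

With true_gs set to 1 the CFG combination reduces to the conditional prediction alone, so the second pass contributes nothing, which fits the x-flux change mentioned above.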
Hi @huayecaibcc, I am also confused about guidance_vec. Why does adding it achieve the effect of distillation? As far as I can see, it is only added to the timestep embedding.
I read the code about flux and found that there are two guidance parameters during model inference: one is guidance_vec and the other is true_gs. true_gs is used for denoising, which is the famous CFG, which I understand. But guidance_vec, called guidance in the model forward function, seems to control the time step embedding. My question is what is the role of this guidance. I don’t seem to find a clear reference, and it’s hard for me to understand how this parameter works during training. If anyone can answer, I’d be grateful!
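For reference, the path described above can be sketched roughly like this. It is a toy in plain Python, not the actual flux code: sinusoidal_embedding, toy_projection, and conditioning_vector are hypothetical names, the toy projection stands in for a learned MLP, and I am assuming the guidance scalar gets the same kind of sinusoidal embedding as the timestep before the two are summed.

```python
import math
from typing import List, Optional


def sinusoidal_embedding(value: float, dim: int = 8,
                         max_period: float = 10000.0) -> List[float]:
    """Embed a scalar as cos/sin features at geometrically spaced frequencies."""
    half = dim // 2
    freqs = [math.exp(-math.log(max_period) * i / half) for i in range(half)]
    return [math.cos(value * f) for f in freqs] + [math.sin(value * f) for f in freqs]


def toy_projection(emb: List[float], scale: float) -> List[float]:
    """Hypothetical stand-in for a learned MLP (here just a fixed scaling)."""
    return [scale * e for e in emb]


def conditioning_vector(timestep: float,
                        guidance: Optional[float] = None) -> List[float]:
    """Build the vector that conditions the model on timestep (and guidance)."""
    vec = toy_projection(sinusoidal_embedding(timestep), scale=1.0)
    if guidance is not None:
        # The guidance scalar is embedded separately and summed into the same
        # vector, so it reaches the network through the same route as time.
        g = toy_projection(sinusoidal_embedding(guidance), scale=0.5)
        vec = [v + gi for v, gi in zip(vec, g)]
    return vec
```

In this picture the guidance value does not rescale the prediction after the fact. It only changes the conditioning vector, and what the network does with that vector is whatever it learned during training.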