Closed zhangyongshun closed 2 weeks ago
Hi, I noticed that in the NextDiT model, learn_sigma is set to True by default, which doubles out_channels, and then only half of the output channels are returned at the end of forward. How does this help training? Is there any documentation for it?
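To make the question concrete, here is a minimal sketch of the behaviour described above. It is not the actual NextDiT code: the class `TinyFinalLayer`, its parameters, and the use of NumPy in place of PyTorch are all assumptions made for illustration, and the split is done on the last axis only for simplicity.

```python
import numpy as np


class TinyFinalLayer:
    """Hypothetical stand-in for a DiT-style output head (not the NextDiT code)."""

    def __init__(self, hidden_dim, in_channels, learn_sigma=True, seed=0):
        self.in_channels = in_channels
        self.learn_sigma = learn_sigma
        # With learn_sigma enabled, the head predicts twice as many channels.
        self.out_channels = in_channels * 2 if learn_sigma else in_channels
        rng = np.random.default_rng(seed)
        self.weight = rng.standard_normal((hidden_dim, self.out_channels))

    def forward(self, h):
        # h: (batch, tokens, hidden_dim) -> (batch, tokens, out_channels)
        out = h @ self.weight
        if self.learn_sigma:
            # Keep the first half of the channels and discard the second half.
            out, _ = np.split(out, 2, axis=-1)
        return out


layer = TinyFinalLayer(hidden_dim=8, in_channels=4)
x = np.zeros((2, 16, 8))
y = layer.forward(x)
print(layer.out_channels, y.shape)  # 8 (2, 16, 4)
```

In this sketch the projection produces 8 channels but the caller only ever sees 4, which is the part I would like to understand.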