Closed Jeff-Fudan closed 4 months ago
Yes, I've only trained the model on the TikTok dataset. No other data. Can you first set strict=True in your load_stat_dict function and check whether the pretrained weights from stage 1 match the model architecture?
As for optimizer_state_latest.th, I don't have it because I lost access to the Bytedance server after I left. But let me see if I can do anything.
Hello, what I meant was that the results from the first stage are inconsistent. I followed your code exactly, with a batch size of 64, but the first stage of training results in many artifacts. However, the effects are quite good when using the model_state-10000.th weights you provided.
Do the training log output images contain artifacts as well? I've never come across such an issue when running on my end. Are you sure the env is correct, and that with strict=True set in load_stat_dict there's no error when loading the weights?
I've set up my environment according to your 'env' configuration. The only difference is that I'm using PyTorch-2.1. I haven't made any changes to your code, and I'm using the TikTok data you provided. However, the results I'm getting are significantly different from those using your weights. May I kindly ask if these weights were produced using this exact set of code? I appreciate your patience and guidance.
I'm 100% sure the code is correct, since I can reproduce the results on my end. Please try again with the env I provided. And please confirm once more: with strict=True set in load_stat_dict, is there no error when loading the weights? With strict=False it won't give you an error even if the weights are wrong, and the model will be initialized with the default Stable Diffusion weights instead. Please make sure there's no error with strict=True.
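For anyone following along, here is a minimal sketch of the check being asked for. It assumes load_stat_dict in this thread wraps PyTorch's `load_state_dict`, which with strict=True raises an error on missing or unexpected parameter names, and with strict=False skips them silently. The helper `diff_state_dict_keys` and the key names below are hypothetical, made up for illustration; they are not from the repo.

```python
from typing import Iterable, List, Tuple


def diff_state_dict_keys(
    model_keys: Iterable[str], checkpoint_keys: Iterable[str]
) -> Tuple[List[str], List[str]]:
    """Compare parameter names of a model and a checkpoint.

    Returns (missing, unexpected):
      missing    - names the model expects but the checkpoint lacks
      unexpected - names in the checkpoint that the model does not have
    These are the two kinds of name mismatch a strict load complains about.
    """
    model_set = set(model_keys)
    ckpt_set = set(checkpoint_keys)
    missing = sorted(model_set - ckpt_set)
    unexpected = sorted(ckpt_set - model_set)
    return missing, unexpected


# Hypothetical key names. In practice they could come from
# model.state_dict().keys() and torch.load(path, map_location="cpu").keys().
model_keys = ["unet.conv_in.weight", "unet.conv_in.bias", "unet.conv_out.weight"]
checkpoint_keys = [
    "module.unet.conv_in.weight",  # stray "module." prefix, e.g. from a wrapper
    "module.unet.conv_in.bias",
    "unet.conv_out.weight",
]

missing, unexpected = diff_state_dict_keys(model_keys, checkpoint_keys)
print("missing:", missing)
print("unexpected:", unexpected)
```

If either list is non-empty, a load with strict=False would leave those parameters at whatever values the model was initialized with, which may explain results that differ from the released weights.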
Hello, thank you for your affirmative response. What I meant was that the results from the first stage don't match up. The 'setting strict=True in load_stat_dict' you mentioned should be something to pay attention to in the second stage, right? In the first stage, we're just loading the weights of the controlNet, aren't we? Would the version of Torch really have such a significant impact on the results? It seems like this is the only difference left.
No, this argument is also used when loading weights during inference. Just set it to True and see if there's an error. We aren't using ControlNet for appearance control pretraining; that stage uses a weight copy of the entire SD-UNet. ControlNet is for stage 2 only.
Alright, I'm very interested in your work! It's a fantastic project. However, the weights I trained should be the same as yours, using the same inference script and loading method. But the weights I trained have many artifacts, while the weights you provided are excellent. Could we possibly connect on WeChat? I would be very grateful! My WeChat ID is: iHenrenzhenderen.
Issue solved :)