vislearn / ControlNet-XS

I am going to train this model, and I find the loss is about 0.20+ at the beginning of training. #21

Closed shoutOutYangJie closed 6 months ago

shoutOutYangJie commented 6 months ago

I have trained the original ControlNet, where the loss is below 0.15 at the beginning of training thanks to the zero-conv modules. ControlNet-XS still uses zero-conv modules, but my initial loss is about 0.20. Is this normal? Could you describe your training loss, or show it to me, please?

Sipirius commented 6 months ago

The loss can vary depending on how you train. For instance, if you choose to also learn the timestep embedding by setting the parameter learn_embedding: true, your loss will still start high, regardless of the zero convolutions. In this case, the model might take a few thousand steps to start producing proper results again.

cheers

shoutOutYangJie commented 6 months ago

Could you share some training logs for reference?

shoutOutYangJie commented 6 months ago

@Sipirius I have one last question. If the control branch is trained from scratch with randomly initialized weights, why does ControlNet-XS need zero-conv blocks for feature fusion from the base branch to the control branch? They do not seem necessary when training from scratch. Have you run any experiments without zero convs for the control branch?

Sipirius commented 6 months ago

It is beneficial to use zero convolutions so that the control model does not degrade the generation capabilities of the network at the start of training. This way, the control model can focus on enhancing the generation right away, rather than first having to learn not to destroy it.
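
To illustrate the point about zero convolutions, here is a minimal NumPy sketch. It is not the ControlNet-XS code: the shapes, the `conv1x1` helper, and the additive fusion are assumptions made for the example. It only shows that a 1x1 convolution initialized to zero leaves the base features untouched at step 0, while an ordinary random initialization perturbs them immediately.

```python
import numpy as np

rng = np.random.default_rng(0)

def conv1x1(x, weight, bias):
    """1x1 convolution: x is (C_in, H, W), weight is (C_out, C_in), bias is (C_out,)."""
    return np.einsum("oc,chw->ohw", weight, x) + bias[:, None, None]

# Hypothetical feature maps: 4 channels on an 8x8 grid.
base_features = rng.normal(size=(4, 8, 8))     # stands in for the pretrained base branch
control_features = rng.normal(size=(4, 8, 8))  # stands in for the untrained control branch

# Zero convolution: weight and bias both start at zero.
zero_w, zero_b = np.zeros((4, 4)), np.zeros(4)
fused_zero = base_features + conv1x1(control_features, zero_w, zero_b)

# The same connection with an ordinary random initialization (assumed scale).
rand_w, rand_b = rng.normal(scale=0.5, size=(4, 4)), np.zeros(4)
fused_rand = base_features + conv1x1(control_features, rand_w, rand_b)

print(np.abs(fused_zero - base_features).max())  # 0.0: base features are unchanged
print(np.abs(fused_rand - base_features).max())  # nonzero: base features are perturbed
```

The zero-initialized weights still receive gradients, because the gradient with respect to the weights depends on `control_features` rather than on the weights themselves, so the connection moves away from zero after the first update.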