Closed by listener17 7 months ago
Hello!
Depthwise (DW) convolutions not only reduced the parameter count, but also made the model easier to optimize. After replacing most of the convolutions with DW ones, I could use higher learning rates without the model collapsing. Surprisingly, the perceived quality was higher too!
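To give a rough feel for the parameter savings, here is a minimal sketch that counts the parameters of a standard convolution versus a depthwise one followed by a 1x1 pointwise convolution. The channel count (256) and kernel size (7) are made-up numbers for illustration, not the values from the actual model, and the pointwise layer is an assumption on my side since the post above does not say how channels are mixed after the DW layers.

```python
def conv_params(c_in, c_out, kernel_elems, groups=1, bias=True):
    """Parameter count of a (grouped) convolution layer.

    kernel_elems is the number of taps in one kernel,
    e.g. 7 for a 1D kernel of size 7, or 9 for a 3x3 kernel.
    """
    if c_in % groups or c_out % groups:
        raise ValueError("channels must be divisible by groups")
    weights = c_out * (c_in // groups) * kernel_elems
    return weights + (c_out if bias else 0)


# Hypothetical layer sizes, chosen only for illustration.
channels, kernel = 256, 7

standard = conv_params(channels, channels, kernel)
# Depthwise: one filter per channel, i.e. groups == channels.
depthwise = conv_params(channels, channels, kernel, groups=channels)
# Pointwise 1x1 convolution to mix channels (assumed, see note above).
pointwise = conv_params(channels, channels, 1)
separable = depthwise + pointwise

print("standard :", standard)
print("depthwise:", depthwise)
print("separable:", separable)
print("ratio    :", round(standard / separable, 2))
```

With these assumed sizes the depthwise layer alone is a tiny fraction of the standard layer, and most of the remaining cost sits in the pointwise mixing.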
Injecting noise into the decoder also improved my models. I believe it's helpful to introduce some stochasticity.
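For anyone unsure what noise injection means here, below is a minimal sketch of one common variant: adding scaled Gaussian noise to a feature vector before it goes into the next decoder layer. The function name `inject_noise`, the scale `sigma`, and the choice of additive Gaussian noise are all assumptions for illustration; the post above does not specify where or how the noise is injected in the real model.

```python
import random


def inject_noise(features, sigma, rng=None):
    """Return a copy of `features` with additive Gaussian noise.

    sigma scales unit-variance noise; sigma=0 leaves the features unchanged.
    `rng` is an optional random.Random instance for reproducibility.
    """
    if rng is None:
        rng = random.Random()
    return [x + sigma * rng.gauss(0.0, 1.0) for x in features]


# Hypothetical decoder activations, just to show the call.
activations = [0.5, -1.2, 3.0, 0.0]
noisy = inject_noise(activations, sigma=0.1, rng=random.Random(0))
print(noisy)
```

In a real network `sigma` could be fixed or learned per channel; which of these works better is something to check experimentally.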
Hi!
Interesting work. But I don't understand the following changes: