yang-jin-hai / SAIN

[AAAI-2023] Self-Asymmetric Invertible Network for Compression-Aware Image Rescaling
Apache License 2.0

question about paper #3

Closed Feynman1999 closed 1 year ago

Feynman1999 commented 1 year ago

Hello, very interesting work. I have a question: for the compression simulator $g$, if we remove it and replace $g^{-1}$ with a regular neural network for de-compression, will the entire model still work properly? (with the $\lambda_3$ and $\lambda_5$ losses removed correspondingly)

yang-jin-hai commented 1 year ago

Thank you for your interest. If $g$ is removed and $g^{-1}$ is replaced with a regular neural network, I think the entire model will still work properly. But the performance may degrade if the compression restoration network is not specifically designed.

In fact, $g^{-1}$ in our work can also be seen as a bi-directionally optimized compression restoration network: we also take its inverse $(g^{-1})^{-1}=g$ to simulate the compression distortion and apply the corresponding loss functions. Since a bi-directionally optimized INN is better than a mono-directionally optimized SR network, I believe our $g^{-1}$ will work better than usual networks.

Besides, in this work we don't force the distribution of the HF (high-frequency) split output of $f$ (and also the HF split input of $f^{-1}$); this component is generated by $g^{-1}$ from the sampled GMM signal and the compressed image. If $g^{-1}$ is replaced with a regular network, the HF component may need some further guidance.
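
To make the bi-directional point concrete, here is a minimal PyTorch sketch of an additive coupling block, the basic building block of INNs: the same weights run forward to play the role of $g$ (simulating distortion) and in reverse to play the role of $g^{-1}$ (restoration), exactly. This is only an illustration under assumed shapes and names (`AdditiveCoupling` is hypothetical), not the actual SAIN module.

```python
import torch
import torch.nn as nn

class AdditiveCoupling(nn.Module):
    """Toy invertible block (hypothetical; not the actual SAIN module).

    forward(x)               -> role of g:    simulate distortion
    forward(y, reverse=True) -> role of g^-1: restore, exactly
    """
    def __init__(self, channels: int):
        super().__init__()
        half = channels // 2
        self.t = nn.Sequential(
            nn.Conv2d(half, half, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(half, half, 3, padding=1),
        )

    def forward(self, x: torch.Tensor, reverse: bool = False) -> torch.Tensor:
        x1, x2 = x.chunk(2, dim=1)
        if not reverse:
            x2 = x2 + self.t(x1)   # y2 = x2 + t(x1)
        else:
            x2 = x2 - self.t(x1)   # x2 = y2 - t(y1), exact inverse since y1 = x1
        return torch.cat([x1, x2], dim=1)

# The same weights serve both directions: (g^-1)^-1 = g, so optimizing
# one direction also trains the other.
block = AdditiveCoupling(4)
x = torch.randn(1, 4, 16, 16)
y = block(x)                     # g direction
x_rec = block(y, reverse=True)   # g^-1 direction
assert torch.allclose(x, x_rec, atol=1e-5)
```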

Feynman1999 commented 1 year ago

I understand it better now, thank you. Additionally, the paper states that the $\lambda_5 \cdot L_{rel}$ term can use other compression methods, such as WebP. But during training, differentiable JPEG compression was used; wouldn't this cause conflicts between $g$ and $g^{-1}$?

yang-jin-hai commented 1 year ago

Yes, it will cause a conflict. Using differentiable JPEG to train the WebP model is sub-optimal, but it is hard to implement a differentiable version of every compression algorithm. Some reasons why our model still achieves satisfying performance on WebP:

  1. WebP is somewhat similar to JPEG. In our experiments on some other compression formats, I found the model performs well at the beginning but gradually drops; on WebP, however, the performance is as steady as on JPEG.
  2. DiffJPEG is also not exactly the same as real JPEG; it works more like a "gradient bridge" (see the sketch after this list). In some works, the authors directly set the gradient to 1 for non-differentiable operations, and that also works well.
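
To illustrate the "gradient bridge" idea from point 2, here is a minimal PyTorch sketch of a straight-through estimator: the forward pass applies a non-differentiable round trip (8-bit quantization stands in for a real codec such as WebP), and the backward pass passes the gradient through as the identity. This is an assumed, simplified stand-in (`StraightThroughCodec` is hypothetical), not the DiffJPEG implementation used in the paper.

```python
import torch

class StraightThroughCodec(torch.autograd.Function):
    """Hypothetical gradient bridge: non-differentiable codec in forward,
    identity (gradient = 1) in backward."""

    @staticmethod
    def forward(ctx, x: torch.Tensor) -> torch.Tensor:
        # Stand-in for a real encode/decode round trip (e.g. WebP):
        # 8-bit quantization, non-differentiable on its own.
        return torch.round(x.clamp(0.0, 1.0) * 255.0) / 255.0

    @staticmethod
    def backward(ctx, grad_output: torch.Tensor) -> torch.Tensor:
        # Treat the codec as the identity for gradient purposes.
        return grad_output

x = torch.rand(1, 3, 8, 8, requires_grad=True)
y = StraightThroughCodec.apply(x)
y.sum().backward()
print(x.grad.sum())  # gradients flowed through the quantization step unchanged
```
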
Feynman1999 commented 1 year ago

Thank you for your reply! It helps me a lot.