limuhit / 360-Image-Compression


Training with VMSE for multiple epochs results in a significant increase in loss #5

Open tianyalei opened 6 months ago

tianyalei commented 6 months ago

Dear author,

Thank you for your excellent work and open-source code. I have a question for you. I'm using your MultiProject operation to generate 14 viewport windows, using VMSE as the distortion metric, and training models with compressai. At high bitrates (around 0.8 bpp), training starts normally and converges steadily, with the loss decreasing consistently. But after around 200 epochs, the loss suddenly becomes very large and stays there. When I train with plain MSE, everything is normal. Could this be caused by MultiProject? With VMSE, the loss seems to blow up after a number of iterations, and pre-training doesn't seem to help either.

limuhit commented 6 months ago

It could be due to the MultiProject operation. Would clipping the gradient with "torch.nn.utils.clip_grad_norm_(params, clip)", with clip in the range (0.006, 0.06), help? I clipped the gradient and did not run into the mentioned issue.
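A minimal sketch of where this clipping goes, assuming a standard PyTorch training step: clip the global gradient norm after `backward()` and before `optimizer.step()`. The model and loss below are toy placeholders, not the repository's code; only the `torch.nn.utils.clip_grad_norm_` call (note the exact API name, with trailing underscore) mirrors the suggestion.

```python
import torch

# Toy stand-ins for the actual compression model and rate-distortion loss.
model = torch.nn.Linear(4, 2)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

x = torch.randn(8, 4)
loss = model(x).pow(2).mean()  # stand-in for the VMSE rate-distortion loss
loss.backward()

# Clip the total gradient norm to a value in the suggested (0.006, 0.06) range.
torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=0.06)
optimizer.step()

# After clipping, the global gradient norm is at most max_norm.
global_norm = torch.norm(
    torch.stack([p.grad.norm() for p in model.parameters()])
)
assert global_norm.item() <= 0.06 + 1e-6
```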

tianyalei commented 6 months ago

Hello, I have tried gradient clipping, pre-training, and adjusting the learning rate, but the final loss still becomes very large. I was able to train minnen2018 in compressai with MSE normally, but adding the MultiProject operation leads to this issue.

limuhit commented 6 months ago

Could you visualize the produced viewport images to check whether the MultiProject operation works well? In addition, make sure the training data are ERP images with an aspect ratio of w:h = 2:1. By the way, I did not test minnen2018, but I tried cheng20, where it works.
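The aspect-ratio requirement above is easy to verify before training. A small sketch, assuming batches are tensors of shape (N, C, H, W); the helper name is illustrative, not part of the repository:

```python
def is_valid_erp_shape(batch_shape):
    """Return True if a (N, C, H, W) batch has the 2:1 ERP aspect ratio."""
    n, c, h, w = batch_shape
    return w == 2 * h

# A proper ERP image is twice as wide as it is tall.
assert is_valid_erp_shape((8, 3, 256, 512))
# A square crop breaks the sphere-to-viewport projection geometry.
assert not is_valid_erp_shape((8, 3, 256, 256))
```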

limuhit commented 6 months ago

Or you could try modifying "lic360_operator/MultiProject.py": change line 21 to "return outputs[0], None". Then, recompile the project.
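A hedged sketch of what that change does, assuming MultiProject is implemented as a `torch.autograd.Function` whose `backward` originally returned gradients for two inputs: returning `None` for the second input stops gradient flow through it, which can cut off an unstable gradient path. The class name, argument names, and the no-op forward below are illustrative placeholders, not the repository's actual code.

```python
import torch

class MultiProjectSketch(torch.autograd.Function):
    @staticmethod
    def forward(ctx, image, grid):
        # Placeholder: a copy standing in for the actual
        # sphere-to-viewport resampling of the ERP image.
        return image.clone()

    @staticmethod
    def backward(ctx, grad_out):
        # The suggested edit returns None for the second input,
        # so no gradient is propagated to it.
        return grad_out, None

# Usage: the image still receives gradients, the grid does not.
img = torch.randn(2, 3, 4, 8, requires_grad=True)
grid = torch.randn(2, 4, 8, 2, requires_grad=True)
MultiProjectSketch.apply(img, grid).sum().backward()
assert img.grad is not None
assert grid.grad is None
```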

tianyalei commented 6 months ago

Hello, thank you very much for your response. Could you please share the learning rate, gradient-clipping value, and number of epochs used for training cheng20? Additionally, could you describe the training process and whether any training tricks were employed?