asdcaszc opened 2 months ago
Hi, can you share the logs and the stats?
@pierrefdz
The following is my training log from the first stage, which trains only the watermark encoder/decoder.
Averaged train stats: loss_w: 0.001087 (0.003130) loss_i: 0.056029 (0.055853) loss: 0.001087 (0.003130) psnr_avg: 25.439396 (25.453824) lr: 0.000063 (0.000062) bit_acc_avg: 0.999349 (0.998982) word_acc_avg: 0.968750 (0.960388) norm_avg: 131.814392 (130.939567)
The following is the evaluation log from the second stage, which finetunes the Stable Diffusion decoder.
Averaged eval stats: iteration: 7.000000 (7.500000) psnr: 27.567417 (27.551602) bit_acc_none: 0.996419 (0.996265) word_acc_none: 0.828125 (0.840820) bit_acc_crop_01: 0.897461 (0.897453) word_acc_crop_01: 0.000000 (0.009766) bit_acc_crop_05: 0.926432 (0.925977) word_acc_crop_05: 0.025000 (0.027930) bit_acc_rot_25: 0.918945 (0.920203) word_acc_rot_25: 0.015625 (0.017188) bit_acc_rot_90: 0.752604 (0.753377) word_acc_rot_90: 0.000000 (0.000000) bit_acc_resize_03: 0.614909 (0.615145) word_acc_resize_03: 0.000000 (0.000000) bit_acc_resize_07: 0.930990 (0.931938) word_acc_resize_07: 0.062500 (0.056641) bit_acc_brightness_1p5: 0.976888 (0.977311) word_acc_brightness_1p5: 0.437500 (0.430664) bit_acc_brightness_2: 0.930339 (0.930847) word_acc_brightness_2: 0.062500 (0.068945) bit_acc_jpeg_80: 0.826497 (0.831376) word_acc_jpeg_80: 0.000000 (0.000000) bit_acc_jpeg_50: 0.734375 (0.737142) word_acc_jpeg_50: 0.000000 (0.000000)
Why does the evaluation perform so badly in the second stage, even though the first-stage results are very similar to yours?
These are approximately the stats that are expected. For instance, crop 0.1 is a strong augmentation, so obtaining a bit accuracy of ≈0.9 is already very good (and enough for detection, since 90% bit accuracy very rarely happens by chance). The JPEG compression is a bit different though. If I remember correctly, in the paper we were at ≈0.85 bit accuracy for JPEG 50. Could you check that the stats for JPEG 50 in the first stage are at ≈1.00 bit accuracy?
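The "very rarely happens by chance" claim can be quantified with a binomial tail probability: each bit of a random image matches with probability 0.5, so the chance of reaching a given bit accuracy drops exponentially with message length. A minimal sketch, assuming a 48-bit message (the length I believe the paper uses; adjust `k` for your setup):

```python
from math import ceil, comb

def chance_pvalue(bit_acc: float, k: int) -> float:
    """Probability of matching at least bit_acc * k bits out of k
    purely by chance, with each bit independently correct at p = 0.5."""
    m = ceil(bit_acc * k)  # minimum number of matching bits
    return sum(comb(k, i) for i in range(m, k + 1)) / 2**k

# With k = 48 bits, 90% bit accuracy by chance is roughly a 1-in-a-billion event,
# so bit_acc ≈ 0.9 is still a very strong detection signal.
print(chance_pvalue(0.90, 48))
```

This is why a bit accuracy that looks "low" (e.g. 0.9 under crop 0.1) can still give a near-zero false-positive rate at detection time.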
@pierrefdz
Yes, the bit accuracy under JPEG is ≈1.00. The following are the first-stage evaluation stats when I use a deeper network and more distortions during training. However, the bit accuracy under JPEG in the second stage is still very bad, and I don't know why.
Averaged eval stats: loss_w: 0.000067 (0.000135) loss_i: 0.056830 (0.056814) loss: 0.000067 (0.000135) psnr_avg: 25.373438 (25.377700) bit_acc_avg: 1.000000 (0.999977) word_acc_avg: 1.000000 (0.999043) norm_avg: 134.984619 (135.126015) bit_acc_none: 1.000000 (0.999977) bit_acc_crop_01: 0.997070 (0.996766) bit_acc_crop_05: 1.000000 (0.999932) bit_acc_resize_05: 1.000000 (0.999917) bit_acc_rot_25: 1.000000 (0.999937) bit_acc_rot_90: 1.000000 (0.999912) bit_acc_blur: 1.000000 (0.999911) bit_acc_jpeg_50: 0.990885 (0.990176)
Hi,
When I trained a new model with hidden/main.py, the bit accuracy and robustness were similar to yours. Why can't my LDM decoder perform as well as yours when I finetune the watermarked LDM decoder with finetune_ldm_decoder.py?