Googolxx / STF

PyTorch implementation of the paper "The Devil Is in the Details: Window-based Attention for Image Compression".
Apache License 2.0

Getting weird results with STF, need help! #6

Closed jjeremy40 closed 2 years ago

jjeremy40 commented 2 years ago

Hi! Thank you for the great work!

I'm having some difficulty reproducing the results on other datasets, though, and I could really use your help!

I trained STF (the transformer version) on the Waymo and BDD100K open-source datasets (images from the front camera of an autonomous vehicle, 180,000 images in total), with almost the same lambda values [0.0009, 0.0018, 0.0035, 0.0067, 0.013, 0.025, 0.0483], using the MSE loss, for 200 epochs each. Judging from the Weights & Biases app, the training results look fine (see graph below). Don't they?
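For context, those lambda values trade off distortion against rate in the usual CompressAI-style objective, L = λ · 255² · MSE + bpp, which STF's training script builds on. A minimal sketch of that loss, assuming the model returns a CompressAI-style dict with `x_hat` and `likelihoods` (the exact loss class in the repo may differ in detail):

```python
import math

import torch
import torch.nn as nn


class RateDistortionLoss(nn.Module):
    """Sketch of a rate-distortion loss: L = lambda * 255^2 * MSE + bpp."""

    def __init__(self, lmbda=0.013):
        super().__init__()
        self.mse = nn.MSELoss()
        self.lmbda = lmbda

    def forward(self, output, target):
        N, _, H, W = target.size()
        num_pixels = N * H * W
        # Rate term: total -log2(likelihood) of all latents, per pixel.
        bpp = sum(
            torch.log(l).sum() / (-math.log(2) * num_pixels)
            for l in output["likelihoods"].values()
        )
        # Distortion term, scaled to the 0-255 pixel range.
        mse = self.mse(output["x_hat"], target)
        return self.lmbda * 255 ** 2 * mse + bpp
```

Lower lambdas (e.g. 0.0009) push the optimizer toward lower bpp at the cost of reconstruction quality, which is why the low-rate models are the ones most sensitive to a train/test mismatch.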

(screenshot: training curves from Weights & Biases)

But when I test my new weights on Waymo, I get weird results (see the graph below).

I have run these tests multiple times and I always get the same strange results.

(screenshot: rate-distortion results on Waymo)

Do you have any idea where the bug could come from? Is it the training datasets? Why does the cropping change the results?

I would be extremely grateful for your help!

Googolxx commented 2 years ago

Hi! The loss curves look fine during training, but you get weird results on 1920x1280 images at test time, especially at low bpp. I haven't tried a specific data domain like Waymo or BDD100K, but intuitively I'd guess there is a domain gap between training (256x256 crops) and testing (1920x1280 in Waymo), since all the images are taken from a fixed camera angle. How about resizing the original images to a lower resolution during training? And have you tested on other datasets?

Googolxx commented 2 years ago

To add to that: the training data is randomly cropped to 256x256, and the validation data is center-cropped to 256x256.

jjeremy40 commented 2 years ago

Thanks for answering !

I have also tested on the Kodak dataset (but there are only 24 images...):

By "trying to resize the original images to a low resolution while training", do you mean resizing the Waymo and BDD100K images to, let's say, 480x320 for Waymo and 640x360 for BDD100K, instead of RandomCrop(256) during training?

Googolxx commented 2 years ago

For example, apply transforms like

```python
train_transforms = transforms.Compose(
    [
        transforms.Resize([960, 640]),
        transforms.RandomCrop(256),
        transforms.ToTensor(),
    ]
)
```

instead of https://github.com/Googolxx/STF/blob/0e2480434542af1ac9566e903c57ca0d9d9bf954/train.py#L277-L279

jjeremy40 commented 2 years ago

OK, I'll try that. Thanks!

ld-xy commented 2 years ago

Can you try converting the model to ONNX? Thanks