VinAIResearch / HyperCUT

HyperCUT: Video Sequence from a Single Blurry Image using Unsupervised Ordering (CVPR'23)
GNU Affero General Public License v3.0

Am I using the inference.py the right way? #2

Open HenrySomeCode opened 1 year ago

HenrySomeCode commented 1 year ago

First, I really appreciate how quickly Phạm Băng Đăn replied, rewrote the readme.md, and updated the inference code. I first ran the inference command (maybe this applies only the blur2vid model, e.g. the Jin et al. or Purohit et al. models, and not the whole blur2vid + HyperCUT pipeline):

python inference.py --backbone Jin \
                    --target_frames 1 2 3 4 5 6 7 \
                    --pretrained_path path/to/pretrained_Blur2Vid.pth \
                    --blur_path path/to/blurry_image

I ran into this error: ModuleNotFoundError: No module named 'models.backbones.jin_et_al'. So I downloaded the Jin et al. repo: https://github.com/MeiguangJin/Learning-to-Extract-a-Video-Sequence-from-a-Single-Motion-Blurred-Image/tree/master and placed it in models/backbones/jin_et_al. I also downloaded the pre-trained Jin et al. models: https://www.dropbox.com/sh/r0n9x6uz1ke8iuy/AADJBQBf9E2UMzG4Gt2Az-Qza?dl=0 and put the 'models' folder inside models/backbones/jin_et_al, like this: image

Then I changed jin_backbone.py slightly, like this: image

After that I tested an image and got what I would call a poor result:

Test image: image_4_blurry

Results: deblur_0 deblur_1 deblur_2 deblur_3 deblur_4 deblur_5 deblur_6

This is the command I used: python inference.py --backbone Jin --target_frames 1 2 3 4 5 6 7 --pretrained_path models/backbones/jin_et_al/models/center_v3.pth --blur_path custom_dataset/image_4_blurry.png

In another attempt I used this image: 0054 and this command (with Hand.pth instead of center_v3.pth): python inference.py --backbone Jin --target_frames 1 2 3 4 5 6 7 --pretrained_path pretrained_models/Hand.pth --blur_path 0054.png

I ran into this error: RuntimeError: shape '[1, 1, 112, 4, 112, 4]' is invalid for input of size 202500. So I resized the image from 448x448 to 460x460, since I figured that the image size must somehow be divisible by both 5 and 4, but the result was still bad:

deblur_0 deblur_1 deblur_2 deblur_3 deblur_4 deblur_5 deblur_6
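
A note on the numbers in the RuntimeError above: the target shape [1, 1, 112, 4, 112, 4] holds 448 x 448 = 200704 elements, while 202500 = 450 x 450, so the tensor being reshaped appears to be 450 pixels on a side rather than 448. The sketch below only reproduces that arithmetic and rounds a side length up to a multiple of both 4 and 5. The divisibility requirement is the guess stated above, not something confirmed by the HyperCUT code, and `nearest_valid_size` is a hypothetical helper that is not part of the repository.

```python
import math


def nearest_valid_size(side: int, factors=(4, 5)) -> int:
    """Round `side` up to the nearest multiple of every value in `factors`.

    Assumption: the backbone needs sides divisible by 4 and 5 (a guess
    from this thread, not verified against the model code).
    """
    step = math.lcm(*factors)        # 20 for (4, 5); needs Python 3.9+
    return -(-side // step) * step   # integer ceiling division


# Element count of the reshape target named in the error message.
target_elems = math.prod([1, 1, 112, 4, 112, 4])

# Side length of a square single-channel tensor with 202500 elements.
input_side = math.isqrt(202500)

print("target elements:", target_elems)
print("input side:", input_side)
print("suggested side:", nearest_valid_size(input_side))
```

Under that assumption, both 448 and 450 round up to 460, which matches the size chosen above.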

zero1778 commented 1 year ago

The pretrained model we offer is tailored to the HyperCUT model only. If you intend to employ a different backbone combination, you'll need to retrain the model using your own dataset. It's worth highlighting that the previous method, as explained in the paper, didn't produce satisfactory outcomes, even when concentrating solely on reconstructing data rather than resolving order-ambiguity. To tackle these issues, we've put forward a solution aimed at overcoming these limitations.

Furthermore, if you have any queries regarding the environment or the code, feel free to ask right here. However, if your concerns pertain to quality or other statistical issues, please reach out to me via email.

HenrySomeCode commented 1 year ago

No, I don't want to use a different backbone combination. Your combination is the Jin et al. or Purohit et al. model + HyperCUT, am I right? If so, that is the combination I want to use. My aim for now is just to convert this blurred image: image into an image sequence/video like the one you showed: image. Even though I used the inference command you suggested, I still get a bad result: deblur_0 deblur_1 deblur_2 deblur_3 deblur_4 deblur_5 deblur_6

I believe I have done something wrong here, so I opened this issue hoping to get some advice.

P.S. The reason I rewrote jin_backbone.py from this: image

to this:

image

is that I hit this error: No module named 'models.backbones.jin_et_al.center_esti_model'
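
To narrow down import errors like the two quoted in this thread, it may help to check which dotted module paths resolve before running inference.py. This is a minimal sketch using only the standard library. `module_available` is a hypothetical helper, the module names are copied from the error messages above, and it assumes the script is run from the repository root so that `models` is on `sys.path`.

```python
import importlib.util


def module_available(dotted_name: str) -> bool:
    """Return True if `dotted_name` can be located on the current sys.path."""
    try:
        return importlib.util.find_spec(dotted_name) is not None
    except ModuleNotFoundError:
        # Raised when a parent package (e.g. `models.backbones`) is missing.
        return False


# Module names taken from the error messages quoted in this issue.
for name in (
    "models.backbones.jin_et_al",
    "models.backbones.jin_et_al.center_esti_model",
):
    status = "found" if module_available(name) else "missing"
    print(f"{name} -> {status}")
```

If the first name is found but the second is missing, the package directory exists and only the `center_esti_model` file (or its location inside the downloaded repo) is the problem.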