KAIST-VICLab / FMA-Net

[CVPR 2024 Oral] Official repository of FMA-Net
https://kaist-viclab.github.io/fmanet-site/
MIT License

Colab version + Can't reproduce the results on a custom video #13

Closed DenisSergeevitch closed 2 months ago

DenisSergeevitch commented 4 months ago

Hello, and thank you for sharing the FMA-Net code. I have been waiting for the model for a while, as I personally love applying ML tools to old historical videos.

Colab

I have made this Colab, which reduces VRAM usage via mixed precision, for anyone who wants to try FMA-Net.
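For reference, the memory trick boils down to wrapping inference in autocast. Below is a minimal sketch of the idea, not the notebook's exact code; `FMANet`, the checkpoint filename, and the `load_frames` helper are hypothetical stand-ins for the repo's actual model class, weights, and frame loader:

```python
# Minimal sketch: fp16 autocast inference to roughly halve VRAM usage.
# FMANet, the checkpoint path, and load_frames are hypothetical stand-ins
# for the repo's actual model class, weights, and frame loader.
import torch

model = FMANet().cuda().eval()
model.load_state_dict(torch.load("fma_net_reds4.pth"))

lr_seq = load_frames("/content/frames")  # -> float tensor [1, T, C, H, W] in [0, 1]
with torch.no_grad(), torch.autocast(device_type="cuda", dtype=torch.float16):
    sr_seq = model(lr_seq.cuda())        # activations stay fp16 inside autocast
sr_seq = sr_seq.float().clamp(0, 1)      # back to fp32 before saving frames
```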

Issues

The problem I encountered is that with the default model, the results are almost as blurry as if the frames had simply been resized bicubically:

Here is a demo with a slightly blurry source video after x4 upscaling:

https://github.com/KAIST-VICLab/FMA-Net/assets/2140110/8df656f4-a810-4c6b-b711-7d409075e708

(the left side was resized x4 bicubically)

Here is another example with a more damaged video and the processed x4 version. The FMA-Net model made the results blurrier after processing:

https://github.com/KAIST-VICLab/FMA-Net/assets/2140110/0c9dbf0d-9ec1-4ac9-a58e-4ac983bc5e52

Frame-by-frame comparison: https://imgsli.com/MjY0MTQ3

I believe I am doing something wrong. Could you please point me in the right direction? I think these could be my issues:

1. Should I retrain the model for the blur kernel of "old videos"?
2. The model is made for reducing motion blur, and I am trying to use it for something it was not made for (general deblurring).
3. My config could be wrong:


```
seed = 1234

[training]
dataset_path = ./dataset/REDS4
save_dir = ./
log_dir = log_dir

gpu = 0
nThreads = 8
batch_size = 8
lr = 0.0002
num_epochs = 400

finetuning = False
need_patch = True
save_train_img = True
patch_size = 256
scale = 4

stage = 2
num_seq = 3

lr_warping_loss_weight = 0.1
hr_warping_loss_weight = 0.1
flow_loss_weight = 0.0001
D_TA_loss_weight = 0.1
R_TA_loss_weight = 0.1
Net_D_weight = 0.1

[network]
in_channels = 3
dim = 90
ds_kernel_size = 20
us_kernel_size = 5
num_RDB = 12
growth_rate = 18
num_dense_layer = 4
num_flow = 9
num_FRMA = 4
num_transformer_block = 2
num_heads = 6
LayerNorm_type = WithBias
ffn_expansion_factor = 2.66
bias = False

[validation]
val_period = 5

[test]
custom_path = /content/frames
```

Thank you for your work!
ahmadmustafaanis commented 4 months ago

F

gvonkreisler commented 4 months ago

F...F... doesn't work at all...

GeunhyukYouk commented 4 months ago

Hello,

First of all, thank you for creating such a nice demo!

To identify the issue, we tested the REDS4 020 sequence using the Colab demo you provided. Our code can be found here.

The test compared the VSRDB results for 1) the original image sequence and 2) the image sequence obtained by decompressing a losslessly (qp=0) compressed video with ffmpeg. The second approach (left side of the image below) produced significantly blurrier, artifact-laden images.

[Comparison image: left = sequence round-tripped through ffmpeg, right = original sequence]

This appears to be due to noise introduced during the compression and extraction process with ffmpeg, suggesting that we may need to adjust the ffmpeg options and re-run the experiment.
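If anyone wants to repeat the round-trip, something along these lines should keep it lossless (a sketch; the paths and framerate are illustrative). Note that `-qp 0` makes x264 mathematically lossless only at the encoded pixel format, and ffmpeg's default conversion to `yuv420p` subsamples chroma, so forcing `yuv444p` avoids that extra loss:

```python
# Sketch of a lossless ffmpeg encode/decode round-trip (illustrative paths).
# -qp 0 gives lossless x264; yuv444p avoids the chroma subsampling that the
# default yuv420p conversion from RGB PNGs would introduce.
import os
import subprocess

subprocess.run(
    ["ffmpeg", "-framerate", "24", "-i", "frames/%08d.png",
     "-c:v", "libx264", "-qp", "0", "-pix_fmt", "yuv444p", "lossless.mp4"],
    check=True,
)
os.makedirs("decoded", exist_ok=True)  # ffmpeg won't create the output dir
subprocess.run(["ffmpeg", "-i", "lossless.mp4", "decoded/%08d.png"], check=True)
```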

Furthermore, here are some possible reasons why FMA-Net might not be working well with your "old videos" setting:

  1. FMA-Net has been trained only on the motion blur in the REDS dataset, so it may not work well on other types of blur (or even on footage with different color characteristics).
  2. Unlike large models trained on millions of images and videos, FMA-Net has been trained on only a few hundred video sequences. It is therefore less generalizable than large models and performs well only on data reasonably close to its training distribution.

To ensure good performance on your videos, it seems necessary to retrain or finetune FMA-Net specifically for old videos.
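As a starting point, a finetuning run could reuse your config above with only a few keys changed; the sketch below uses placeholder, untested values: point `dataset_path` at LR/HR pairs built from your old footage, set `finetuning = True` to resume from the pretrained REDS weights, and use a smaller learning rate and fewer epochs than a from-scratch run:

```
[training]
dataset_path = ./dataset/old_videos
finetuning = True
lr = 0.00005
num_epochs = 100
```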

GeunhyukYouk commented 2 months ago

I will close this issue as there has been no further discussion. Please re-open the issue if there are additional comments.