CVMI-Lab / VideoDemoireing

(CVPR 2022) Video Demoireing with Relation-Based Temporal Consistency
Apache License 2.0

About temporal_loss: it only uses 2 frames. Would it be better to compute it across many frames? #4

Open zhanghongyong123456 opened 2 years ago

zhanghongyong123456 commented 2 years ago

I looked at the basic loss computation, and it uses only two frames. Most real videos run at 30 fps, so how effective is a two-frame calculation? Is it necessary to compute the loss over multiple consecutive frames? (screenshot attached)

daipengwa commented 2 years ago

I agree with you. Maybe you can apply some long-term constraints over more frames. In our experiments, using two frames already improved temporal consistency.

zhanghongyong123456 commented 2 years ago

I only have a rough idea, and I don't fully understand this consistency loss (especially the Multi-Scale Region-Level Relation Loss). Could you outline how to implement multi-frame temporal consistency? For example, if each sample has 10 frames, what should I do? Thank you very much.

zhanghongyong123456 commented 2 years ago

While debugging the code, I noticed that it differs slightly from the paper. (two screenshots attached)

daipengwa commented 2 years ago

First, the motivation for the multi-scale design is that the distance between the eye and the screen is not fixed (e.g., the screen covers only a small area of your field of view when you stand far away, and vice versa).

Second, I think you can choose frame pairs with random steps (t, t+?), or propagate the relation: (t, t+2) = (t, t+1) + (t+1, t+2)? I am not sure.

Third, thanks for pointing this out. The lambda should be applied to the first term. You can freely change these hyperparameters.
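
The random-step suggestion in the second point could be sketched roughly as follows. This is only an illustration, not code from the repository: the function name `sample_frame_pairs` and its parameters (`max_step`, `num_pairs`, `seed`) are hypothetical, and it only picks index pairs (t, t+step); the temporal loss itself would still be evaluated on each pair as in the two-frame case.

```python
import random


def sample_frame_pairs(num_frames, max_step, num_pairs, seed=None):
    """Return (t, t + step) index pairs with a random step in [1, max_step].

    Hypothetical helper for illustration; not part of the repository.
    """
    if num_frames < 2:
        raise ValueError("need at least two frames")
    rng = random.Random(seed)
    # The step can never exceed the clip length minus one.
    largest_step = min(max_step, num_frames - 1)
    pairs = []
    for _ in range(num_pairs):
        step = rng.randint(1, largest_step)          # inclusive on both ends
        t = rng.randint(0, num_frames - 1 - step)    # keeps t + step in range
        pairs.append((t, t + step))
    return pairs


# Example: draw 4 pairs from a 10-frame sample, at most 3 frames apart.
pairs = sample_frame_pairs(num_frames=10, max_step=3, num_pairs=4, seed=0)
print(pairs)
```

With `max_step=1` this reduces to ordinary consecutive pairs, so the two-frame setting is a special case.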

zhanghongyong123456 commented 2 years ago

OK. For the second point, I will try both (t, t+n) and (t, t+2) = (t, t+1) + (t+1, t+2). Thanks for the idea.

zhanghongyong123456 commented 2 years ago

Hi, I would like to ask for your guidance. Thank you very much.

Regarding the second point: I use temporal_loss for video matting, but the test results are not good. This is my config: temporal_loss_mode = 1, weight_t = 50.

  1. Is my design correct? I first compute the differences between consecutive images in the sequence, add them up, and finally apply the L1 loss once to the sum.
     mode == 0: ![image](https://user-images.githubusercontent.com/48466610/195967187-d91f16b1-3602-4582-9a3f-79d9285e2a59.png)
     mode == 1: ![image](https://user-images.githubusercontent.com/48466610/195967007-88d586f5-858a-4c23-8128-5de3e352578b.png)
  2. When the output is alpha, is mode 0 (basic relation-based loss) better than mode 1 (multi-scale relation-based loss)? The alpha output is just a black-and-white image, so multi-scale may not be needed.

daipengwa commented 2 years ago
  1. In your design, if img_count = 40, the sum ends up as gt_error = img(39) - img(0). You skip all the frames in between and keep only the long-term change between frames 0 and 39.
  2. I suggest you begin with the basic loss, then add the other designs and check whether the quality improves.
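
The first point can be checked numerically. The sketch below is a minimal illustration, assuming the loss compares frame-to-frame differences of the prediction with those of the ground truth (which is what the gt_error remark implies). It is written with NumPy rather than the repository's code, and the function names `summed_difference_loss` and `per_pair_difference_loss` are made up for this example.

```python
import numpy as np


def summed_difference_loss(pred, gt):
    """L1 on the SUM of signed consecutive differences (the design in question)."""
    # Summing signed differences telescopes to frame[-1] - frame[0].
    pred_sum = np.diff(pred, axis=0).sum(axis=0)
    gt_sum = np.diff(gt, axis=0).sum(axis=0)
    return float(np.abs(pred_sum - gt_sum).mean())


def per_pair_difference_loss(pred, gt):
    """L1 on every consecutive difference, averaged over pairs and pixels."""
    return float(np.abs(np.diff(pred, axis=0) - np.diff(gt, axis=0)).mean())


rng = np.random.default_rng(0)
gt = rng.random((40, 8, 8))            # 40 frames of 8x8 "images"
pred = gt.copy()
# Add flicker to the intermediate frames only; frames 0 and 39 stay exact.
pred[1:-1] += 0.1 * rng.standard_normal((38, 8, 8))

# The summed differences equal the last frame minus the first frame.
print(np.allclose(np.diff(gt, axis=0).sum(axis=0), gt[-1] - gt[0]))
# The summed design barely sees the flicker; the per-pair design does.
print(summed_difference_loss(pred, gt))
print(per_pair_difference_loss(pred, gt))
```

In this toy setup the summed variant stays near zero even though every intermediate frame flickers, while the per-pair variant penalizes it. Whether per-pair averaging is the right multi-frame extension for a given task is a separate question that would need experiments.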
zhanghongyong123456 commented 2 years ago

OK, sorry, I made a mistake. I see now. If there are multiple frames, it should be like this, right? (screenshot attached)

onlyinheaven commented 9 months ago

Your discussion is very interesting. I am currently experimenting with something similar. I would like to know whether you eventually figured out how to implement the temporal loss across multiple images.