About temporal_loss: it only needs 2 frames. Would it be better to calculate it across many frames?

zhanghongyong123456 commented 2 years ago

I see that the basic loss calculation takes only two frames. Most of our actual videos are 30 fps, so how well does a two-frame calculation work? Is it necessary to add multiple consecutive frames to the calculation?

daipengwa commented 2 years ago

I agree with you; maybe you can apply some long-term constraints over more frames. In our experiments, using two frames already improved temporal consistency.
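For readers following along, here is a minimal NumPy sketch of a two-frame relation-style loss. It assumes the loss is the L1 distance between the predicted frame-to-frame change and the ground-truth frame-to-frame change; the function name `basic_relation_loss` and the array shapes are hypothetical and are not taken from the repository.

```python
import numpy as np

def basic_relation_loss(out_a, out_b, gt_a, gt_b):
    """L1 distance between predicted and ground-truth frame-to-frame change.

    Assumption: all inputs are float arrays of identical shape, e.g. (C, H, W).
    """
    out_diff = out_b - out_a   # change between the two predicted frames
    gt_diff = gt_b - gt_a      # change between the two ground-truth frames
    return float(np.mean(np.abs(out_diff - gt_diff)))

rng = np.random.default_rng(0)
gt_a = rng.random((3, 8, 8))
gt_b = rng.random((3, 8, 8))

# The same constant offset on both frames keeps the temporal change intact.
loss_consistent = basic_relation_loss(gt_a + 0.1, gt_b + 0.1, gt_a, gt_b)

# Offsets of opposite sign (flicker) alter the temporal change.
loss_flicker = basic_relation_loss(gt_a + 0.1, gt_b - 0.1, gt_a, gt_b)
```

A loss of this form ignores errors that are identical in both frames and only penalizes errors that change over time, so it is usually combined with a per-frame reconstruction loss.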

zhanghongyong123456 commented 2 years ago

I only have a simple idea, and I don't fully understand this consistency loss (especially the Multi-Scale Region-Level Relation Loss). Could you give a general idea of how to implement multi-frame temporal consistency? For example, if a sample has 10 frames, what should I do? Thank you very much.

zhanghongyong123456 commented 2 years ago

I debugged the code and noticed that it is a little different from the paper.

daipengwa commented 2 years ago

First, the motivation for the multi-scale design is that the distance between the eye and the screen is not fixed (e.g., the screen covers only a small area of your field of view when you stand far away, and vice versa).

Second, I think you can choose frames with random steps (t, t+?), or propagate it: (t, t+2) = (t, t+1) + (t+1, t+2)? I am not sure.

Third, thanks for pointing this out; the lambda should be put on the first part. You can freely change these hyperparameters.
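The "random steps" suggestion in the second point could be sketched as below. This is only an illustration: `sample_frame_pairs`, `max_step`, and `num_pairs` are hypothetical names, and the thread does not say how the steps should be distributed, so a uniform choice is assumed.

```python
import random

def sample_frame_pairs(num_frames, max_step, num_pairs, seed=None):
    """Return (t, t + step) index pairs with a random step in [1, max_step].

    Assumption: steps and start indices are drawn uniformly.
    """
    if num_frames < 2:
        raise ValueError("need at least two frames")
    rng = random.Random(seed)
    largest_step = min(max_step, num_frames - 1)
    pairs = []
    for _ in range(num_pairs):
        step = rng.randint(1, largest_step)          # inclusive on both ends
        t = rng.randint(0, num_frames - 1 - step)    # keep t + step in range
        pairs.append((t, t + step))
    return pairs

# Example: four pairs from a 10-frame clip, at most 3 frames apart.
pairs = sample_frame_pairs(num_frames=10, max_step=3, num_pairs=4, seed=0)
```

Each sampled pair can then be fed to whatever two-frame temporal loss is already in use, so short-term and longer-term changes are both seen during training.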

zhanghongyong123456 commented 2 years ago

OK. For the second point, I will try using (t, t+n) or (t, t+2) = (t, t+1) + (t+1, t+2). Thanks for your idea.

zhanghongyong123456 commented 2 years ago

Hi, I would like to get your guidance, thank you very much

Regarding the second point: I use temporal_loss for video matting, but the test results are not good. This is my config (temporal_loss_mode = 1, weight_t = 50).

Is my design correct? First calculate the differences between the images in sequence, add them up, and finally perform the L1 loss calculation once on the sums.

<0> mode == 0 ![image](https://user-images.githubusercontent.com/48466610/195967187-d91f16b1-3602-4582-9a3f-79d9285e2a59.png)

<1> mode == 1 ![image](https://user-images.githubusercontent.com/48466610/195967007-88d586f5-858a-4c23-8128-5de3e352578b.png)

For output = alpha, is mode 0 (basic relation-based loss) better than mode 1 (multi-scale relation-based loss)? The alpha output is just a black-and-white image, so a multi-scale image may not be needed.
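To make the difference between the two modes concrete, here is a rough sketch of one possible reading of "multi-scale region-level": the same temporal-difference L1 loss, computed on region averages at several region sizes. This is an assumption for illustration, not the repository's implementation; `region_mean`, `relation_loss`, and the `scales` parameter are hypothetical names.

```python
import numpy as np

def region_mean(x, k):
    """Average non-overlapping k x k regions of an (H, W) array.

    Assumption: H and W are divisible by k.
    """
    h, w = x.shape
    return x.reshape(h // k, k, w // k, k).mean(axis=(1, 3))

def relation_loss(out_a, out_b, gt_a, gt_b, scales=(1,)):
    """Mean over scales of the L1 gap between predicted and true temporal change."""
    losses = []
    for k in scales:
        out_diff = region_mean(out_b, k) - region_mean(out_a, k)
        gt_diff = region_mean(gt_b, k) - region_mean(gt_a, k)
        losses.append(np.mean(np.abs(out_diff - gt_diff)))
    return float(np.mean(losses))

rng = np.random.default_rng(0)
gt_a = rng.random((8, 8))
gt_b = rng.random((8, 8))

# Checkerboard of +1 / -1: a pixel-level error that cancels inside any 2x2 region.
checker = (np.indices((8, 8)).sum(axis=0) % 2) * 2 - 1
out_a = gt_a + 0.1 * checker
out_b = gt_b - 0.1 * checker

basic = relation_loss(out_a, out_b, gt_a, gt_b, scales=(1,))        # pixel level only
multi = relation_loss(out_a, out_b, gt_a, gt_b, scales=(1, 2, 4))   # pixel + regions
```

In this toy case the flicker is visible at the pixel level but averages out in larger regions, so the multi-scale value is lower. Whether that behavior helps for an alpha matte is an empirical question that the sketch does not settle.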

daipengwa commented 2 years ago

For your design, if img_count = 40, then in the end gt_error = img(39) - img(0). You skip all the frames in between (0 to 39) and keep only the long-term change between frames 0 and 39.
I suggest you begin with the basic one, then add other designs to see whether the quality improves.
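The cancellation described here can be checked numerically. The sketch below uses random arrays as stand-in frames; `summed_diff_loss` mimics the "add the differences, then take L1 once" design, while `per_pair_loss` is one possible alternative (an assumption, not the author's code) that compares every consecutive pair separately.

```python
import numpy as np

rng = np.random.default_rng(0)
gt = rng.random((40, 4, 4))   # 40 hypothetical ground-truth frames

# Summing consecutive differences telescopes to (last frame - first frame).
summed = (gt[1:] - gt[:-1]).sum(axis=0)
telescopes = bool(np.allclose(summed, gt[-1] - gt[0]))

def summed_diff_loss(out, gt):
    """L1 on the summed differences: only frames 0 and N-1 matter."""
    out_sum = (out[1:] - out[:-1]).sum(axis=0)
    gt_sum = (gt[1:] - gt[:-1]).sum(axis=0)
    return float(np.mean(np.abs(out_sum - gt_sum)))

def per_pair_loss(out, gt):
    """L1 on each consecutive difference, averaged over all pairs."""
    out_diffs = out[1:] - out[:-1]
    gt_diffs = gt[1:] - gt[:-1]
    return float(np.mean(np.abs(out_diffs - gt_diffs)))

# Prediction that is perfect except for a flicker in the middle frame.
out = gt.copy()
out[20] += 0.5

loss_summed = summed_diff_loss(out, gt)   # flicker at frame 20 is invisible
loss_pairs = per_pair_loss(out, gt)       # flicker is penalized
```

Because the intermediate frames cancel in the summed version, a mid-sequence flicker contributes nothing to that loss, whereas the per-pair version still penalizes it.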

zhanghongyong123456 commented 2 years ago

OK, sorry, I made a mistake. I see now. If there are multiple frames, it should be like this, right?

onlyinheaven commented 9 months ago

Your discussion is very interesting. I am currently also experimenting with similar things. I would like to know if you have figured out how to implement temporal loss between multiple images in the end.

CVMI-Lab / VideoDemoireing, issue #4