TJUMMG / DS-Net

A question about DWG #1

Open clelouch opened 3 years ago

clelouch commented 3 years ago

Thanks for your code and paper. In your implementation, the DWG is proposed to evaluate the reliability of the spatial and temporal features, and it assigns different weights to the features at different stages. However, it seems reasonable to me that the spatial (or temporal) features at different stages should share the same weight, since they are extracted from the same image. Have you compared these two strategies?
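To make the two strategies concrete, here is a minimal sketch of what I mean. It is not taken from the DS-Net code: the function names, the toy scalar "features", and the weight values are all hypothetical, and how the DWG actually produces its weights is not modeled here.

```python
# Hypothetical illustration only: names and numbers are assumptions,
# not the DS-Net implementation. Scalars stand in for feature maps.

def fuse_per_stage(spatial, temporal, w_spatial, w_temporal):
    """Strategy A: every stage has its own (spatial, temporal) weight pair."""
    return [ws * s + wt * t
            for s, t, ws, wt in zip(spatial, temporal, w_spatial, w_temporal)]

def fuse_shared(spatial, temporal, w_spatial, w_temporal):
    """Strategy B: a single weight pair is shared by all stages."""
    return [w_spatial * s + w_temporal * t
            for s, t in zip(spatial, temporal)]

# Toy responses for three stages.
spatial = [1.0, 2.0, 3.0]
temporal = [3.0, 2.0, 1.0]

per_stage = fuse_per_stage(spatial, temporal,
                           [0.75, 0.5, 0.25], [0.25, 0.5, 0.75])
shared = fuse_shared(spatial, temporal, 0.5, 0.5)

print(per_stage)  # [1.5, 2.0, 1.5]
print(shared)     # [2.0, 2.0, 2.0]
```

Strategy A is how I understand the current design; Strategy B is the alternative I am asking about.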

clelouch commented 3 years ago

In Sec. 3.3 (Cross attentive aggregation), you argue that "As mentioned above, it is highly likely that one single saliency cue (spatial or temporal) is obviously much better than the other. In this case, indiscriminate aggregation of the poor saliency branch will heavily degrade the overall performance even if the other saliency cue performs excellently." However, in your implementation, you first compute `course_pre` and only then apply the non-linear cross thresholding, which does not seem reasonable.