A question about DWG

In Sec. 3.3 (Cross attentive aggregation), you argue that "As mentioned above, it is highly likely that one single saliency cue (spatial or temporal) is obviously much better than the other. In this case, indiscriminate aggregation of the poor saliency branch will heavily degrade the overall performance even if the other saliency cue performs excellently." However, in your implementation, you first compute course_pre and only then apply the non-linear cross thresholding, which does not seem reasonable to me.

TJUMMG / DS-Net

A question about DWG #1