Open nanmehta opened 3 years ago
Thanks for your interest. Q1: The main difference between our cola-net and "cross scale attention network for super-resolution" is that we propose a collaborative attention mechanism for image restoration. Furthermore, "cross scale attention network for super-resolution" uses Conv and DeConv to construct long-range correlations and update features. Q2: I do not fully understand your confusion. In our patch-wise non-local module, we calculate the relationships in units of feature patches, and the code implementation is consistent with that.
If you have any questions, please let me know.
Thanks for your quick reply. I am confused because your code uses extract_image_patches as well as the fold and unfold operations. You mentioned that the unfold operation is used to extract the sliding local patches from the transformed feature maps, so what is the role of the extract_image_patches function then?
Extracting patches and unfold are the same thing.
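To make the equivalence concrete, here is a minimal pure-Python sketch of what a sliding-window patch extraction computes. It is an illustration, not the repository's code: the helper names `same_pad` and `extract_patches` are hypothetical, the input is a single-channel 2-D list, and the 'same' padding is assumed to follow the TensorFlow convention (any odd leftover padding goes to the bottom and right). An unfold operation such as `torch.nn.functional.unfold` returns this kind of flattened patch as the columns of its output, one column per window position.

```python
def same_pad(size, k, s):
    """Total zero padding along one axis so that ceil(size / s) windows fit."""
    out = -(-size // s)  # ceiling division
    return max((out - 1) * s + k - size, 0)


def extract_patches(img, k, s):
    """Slide a k x k window with stride s over a 2-D list, using 'same' padding.

    Returns one flattened (row-major) patch per window position.
    """
    h, w = len(img), len(img[0])
    ph, pw = same_pad(h, k, s), same_pad(w, k, s)
    top, left = ph // 2, pw // 2  # assumed convention: extra padding goes bottom/right
    padded = [[0] * (w + pw) for _ in range(h + ph)]
    for i in range(h):
        for j in range(w):
            padded[i + top][j + left] = img[i][j]
    patches = []
    for i in range(0, h + ph - k + 1, s):
        for j in range(0, w + pw - k + 1, s):
            patches.append([padded[i + di][j + dj]
                            for di in range(k) for dj in range(k)])
    return patches


# A 4 x 4 toy "feature map" holding the values 1..16.
img = [[4 * r + c + 1 for c in range(4)] for r in range(4)]
print(len(extract_patches(img, 3, 1)))  # 16 patches with stride 1
print(len(extract_patches(img, 3, 2)))  # 4 patches with stride 2
```

Under this reading, a helper like extract_image_patches would amount to padding followed by an unfold, so the two names would refer to the same operation and differ only in how the padding and stride are supplied.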
Yes sir, then why have you used extract_image_patches in the next line, like patch_28, paddings_28 = extract_image_patches(b1, ksizes=[self.ksize, self.ksize], strides=[self.stride_1, self.stride_1], rates=[1, 1], padding='same'), and similarly for patch_112 and patch_112_2? Why haven't you used the unfold operation here? Actually sir, I am getting quite confused. Does it make any difference if we use them interchangeably?
It is because we want to reduce the computational complexity. A smaller stride gives better performance, but it also greatly increases the computational cost. Thus we apply this strategy to make a trade-off.
Yes, it will make a difference if they are used interchangeably. It changes the number of queries and their long-range correlations.
Okay sir, thanks a lot. Could you please point me to where in the code the heat maps are generated?
Sir, thank you for releasing the code of this wonderful paper. Can you please explain the difference between cola-net and the cross scale attention network for super-resolution (https://openaccess.thecvf.com/content_CVPR_2020/papers/Mei_Image_Super-Resolution_With_Cross-Scale_Non-Local_Attention_and_Exhaustive_Self-Exemplars_Mining_CVPR_2020_paper.pdf)? I understand that they also use a down-sampling operation when performing the correlation operations, while your code uses the fold and unfold operations. I have one more doubt. You said "Rather than directly reshaping Q, K and calculating the relationship in pixel level as [21], we propose to utilize the unfold operation to extract sliding local patches from these transformed feature maps with stride s and patch-size of Wp × Hp." But sir, in the code I have seen that you use a simple extract_patches function to extract the patches from Q, K and V. Please help me out!
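The stride trade-off described above can be put into rough numbers. The sketch below is only an illustration under stated assumptions: a hypothetical 64 x 64 feature map, 'same' padding (so the patch count per axis is the ceiling of size / stride), and hypothetical helper names `num_patches` and `correlation_entries`. None of these values are taken from the repository.

```python
import math


def num_patches(h, w, stride):
    """Number of sliding windows over an h x w map with 'same' padding."""
    return math.ceil(h / stride) * math.ceil(w / stride)


def correlation_entries(h, w, query_stride, key_stride):
    """Size of the query-by-key correlation matrix for the given strides."""
    return num_patches(h, w, query_stride) * num_patches(h, w, key_stride)


h = w = 64  # hypothetical feature-map size, chosen for illustration only
for q_stride, k_stride in [(1, 1), (4, 1), (4, 4)]:
    print(q_stride, k_stride,
          num_patches(h, w, q_stride),
          correlation_entries(h, w, q_stride, k_stride))
```

In this toy setting, changing the query stride from 1 to 4 cuts the number of queries from 4096 to 256 and shrinks the correlation matrix by the same factor of 16. That matches the point that swapping the strides changes both the number of queries and the correlations each one computes.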