Closed by yasar-rehman 3 years ago
Take a 1D array as an example. Suppose the input is
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
and we want to split it into 4-element patches. We only want the first and the third patch, so the result is supposed to be:
[[0, 1, 2, 3],
[8, 9, 10, 11]]
idx_pat holds the indices of the elements within a single patch, relative to the start of the patch. In this case, it would be
idx_pat = [0, 1, 2, 3]
idx_loc holds the start index of each patch we want. In this case, it would be
idx_loc = [0, 8]
Therefore, to get the final indices:
idx = idx_loc + idx_pat
Since idx_loc has 2 entries and idx_pat has 4, you need to use .view() and .expand() to bring both to shape (2, 4) before adding them. But this is the basic idea.
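Here is a minimal sketch of the 1D example above. It is written in plain Python rather than PyTorch so that it runs without dependencies. The nested comprehension plays the role of the broadcast sum: with tensors, the equivalent would be viewing idx_loc as shape (2, 1) and idx_pat as shape (1, 4), expanding both to (2, 4), and adding. The names x, idx and patches are only for illustration.

```python
# Input: [0, 1, ..., 11]
x = list(range(12))

idx_pat = [0, 1, 2, 3]   # offsets inside one 4-element patch
idx_loc = [0, 8]         # start index of the first and third patch

# "Broadcast" sum: every location is combined with every patch offset.
idx = [[loc + off for off in idx_pat] for loc in idx_loc]

# Gather the selected elements from the flat input.
patches = [[x[i] for i in row] for row in idx]

print(idx)      # [[0, 1, 2, 3], [8, 9, 10, 11]]
print(patches)  # [[0, 1, 2, 3], [8, 9, 10, 11]]
```

Because x[i] equals i in this example, the gathered patches look the same as the indices. With real data only idx would be reused.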
In the actual code, it is a lot more complicated, because you need to consider 2D inputs with multiple channels, and the patches can overlap. Getting the indices right is quite tricky. I spent a lot of time figuring it out. It involved a lot of drawing on paper as well as trial and error.
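To give a feel for the 2D case, here is a rough sketch in plain Python. It is not the repository's code. It assumes a contiguous input of shape [B, C, H, W] flattened row by row, square patches of size O, and a stride S between neighbouring patch positions. The function name patch_indices and the locs argument (a list of (b, y, x) patch positions) are hypothetical.

```python
def patch_indices(locs, C, H, W, O, S):
    """Flat indices of O x O patches in a contiguous [B, C, H, W] input.

    locs is a list of (b, y, x): batch index and patch-grid coordinates.
    The result is a nested list of shape [n][C][O][O].
    """
    # Offsets inside a single patch: channel, row and column contributions.
    idx_pat = [[[c * H * W + i * W + j for j in range(O)]
                for i in range(O)]
               for c in range(C)]
    # Flat index of the top-left element (channel 0) of every patch.
    # Assumption: the batch stride is C * H * W for this memory layout.
    idx_loc = [b * C * H * W + y * S * W + x * S for (b, y, x) in locs]
    # Every location is combined with every in-patch offset.
    return [[[[loc + off for off in row] for row in plane]
             for plane in idx_pat]
            for loc in idx_loc]


# Two 2 x 2 patches from a 2-channel 4 x 4 input, stride 2 (no overlap).
idx = patch_indices([(0, 0, 0), (0, 1, 1)], C=2, H=4, W=4, O=2, S=2)
print(idx[1][0])  # [[10, 11], [14, 15]]
print(idx[1][1])  # [[26, 27], [30, 31]]
```

With S smaller than O the patches overlap, and the same arithmetic still applies because each patch's indices are computed independently.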
Hi Peter,
Thank you for the response and detailed explanation. This clarifies many things.
Cheers!
Hi Peter,
Thank you for sharing the source code, which is well written and self-explanatory. However, could you explain lines 278-279? I have pasted them below for your convenience. I understand that they output the indices of the patch locations in the original input x, but could you explain the logic behind them?
idx_pat = (c * H * W).view(C, 1, 1).expand([C, O, O]) + (o * W).view(1, O, 1).expand([C, O, O]) + o.view(1, 1, O).expand([C, O, O])
idx_loc = b * W * H + y * W * S + x * S
idx = idx_loc.view(-1, 1, 1, 1).expand([n, C, O, O]) + idx_pat.view(1, C, O, O).expand([n, C, O, O])
Kind regards,