Closed: xiaotongtongxue closed this issue 4 months ago
Hi @xiaotongtongxue! Thanks! The shape of the clsmap seems correct. You got (4,2,8,8) because your image size was 256x256px, and not 512x512px as in the paper. Given an input of shape [B,C,H,W], the localization head should produce a heatmap of [B,1,H/2,W/2] and a clsmap of [B,S,H/32,W/32], where S is the number of species. Hope this helps!
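The shape rule above can be checked with plain arithmetic. The following is a minimal sketch, not part of the library: the helper name `expected_output_shapes` is hypothetical, and it only encodes the ratios stated in this comment (heatmap at 1/2 resolution, clsmap at 1/32 resolution).

```python
def expected_output_shapes(input_shape, num_classes):
    """Return the (heatmap, clsmap) shapes expected for an input of shape [B, C, H, W].

    Hypothetical helper: it only applies the ratios quoted in the thread
    (heatmap = H/2 x W/2, clsmap = H/32 x W/32) and does not call the model.
    """
    batch, _channels, height, width = input_shape
    heatmap = (batch, 1, height // 2, width // 2)
    clsmap = (batch, num_classes, height // 32, width // 32)
    return heatmap, clsmap


# 256x256 input, 2 classes (background + one species), batch size 4
print(expected_output_shapes((4, 3, 256, 256), num_classes=2))
# 512x512 input, as in the paper
print(expected_output_shapes((4, 3, 512, 512), num_classes=2))
```

With a 256x256 input this gives a clsmap of (4, 2, 8, 8), which is the shape reported in the question.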
Yes, as you say, my image size is 256x256px (batch size 4, 2 classes: background and xx), and the shape of the clsmap is (4, 2, 8, 8). However, the shape of target[1] is (4, 16, 16), which doesn't match the clsmap. The error message is: RuntimeError: input and target batch or spatial sizes don't match: target [4, 16, 16], input [4, 2, 8, 8]. The CSV file for training is generated in the standard format (see the following picture), so why does this happen? Thank you once again for your help. Your support means a lot to me.
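For reference, the mismatch in that error can be reproduced without running the model. This is a minimal sketch, assuming the loss expects class-index targets of shape [B, H, W] for logits of shape [B, S, H, W] (the convention used by PyTorch's cross-entropy loss). The function name `target_matches_clsmap` is hypothetical.

```python
def target_matches_clsmap(clsmap_shape, target_shape):
    """Check that a class-index target [B, H, W] fits logits of shape [B, S, H, W].

    Hypothetical helper: it compares the batch and spatial sizes only,
    which is the condition the RuntimeError in this thread complains about.
    """
    batch, _num_classes, height, width = clsmap_shape
    return tuple(target_shape) == (batch, height, width)


# Shapes reported in the thread: the target is 16x16 but the clsmap is 8x8
print(target_matches_clsmap((4, 2, 8, 8), (4, 16, 16)))  # False
# A target downsampled by 32 from a 256x256 image would fit
print(target_matches_clsmap((4, 2, 8, 8), (4, 8, 8)))    # True
```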
@Alexandre-Delplanque Well! I have successfully run your code after changing int(patch_size//16) to int(patch_size//8) in the PointsToMask function. My image size is 256 × 256px, but I still do not understand why I need to change 16 to 8. Any help you can offer would be invaluable. Thanks again!
Hi @xiaotongtongxue,
Thanks! The classification head always produces a clsmap 32 times smaller than your input, so down_ratio must equal 32 whatever the patch size: 512/16 = 256/8 = 32. This is why you needed to change down_ratio = int(patch_size//16) to down_ratio = int(patch_size//8).
It would be better to write down_ratio=32 when you instantiate the PointsToMask class:
PointsToMask(radius=2, num_classes=2, squeeze=True, down_ratio=32)
I hope this clarifies your concern?
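The arithmetic behind this advice can be sketched as follows. This is an illustration only: the helper names are hypothetical, and it assumes the target mask side equals patch_size // down_ratio, which is inferred from the shapes reported in this thread rather than taken from the PointsToMask source.

```python
CLSMAP_DOWNSAMPLING = 32  # the classification head output is 32x smaller than the input


def target_side(patch_size, down_ratio):
    """Side length of the target mask (assumed to be patch_size // down_ratio)."""
    return patch_size // down_ratio


def clsmap_side(patch_size):
    """Side length of the clsmap produced by the classification head."""
    return patch_size // CLSMAP_DOWNSAMPLING


for patch_size in (512, 256):
    derived = int(patch_size // 16)  # down_ratio derived from the patch size
    fixed = 32                       # down_ratio passed explicitly
    print(
        patch_size,
        "derived:", target_side(patch_size, derived) == clsmap_side(patch_size),
        "fixed:", target_side(patch_size, fixed) == clsmap_side(patch_size),
    )
```

Under that assumption, the derived ratio only matches the clsmap at 512px, while the fixed value of 32 matches at both sizes.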
@Alexandre-Delplanque , Thank you, and I get it from your detailed response. This is truly a remarkable job!
My pleasure! I'm closing the issue.
Excellent work! But I have an issue: when I input a tensor of shape (4, 3, 256, 256) into herdnet, the heatmap is right (4, 1, 128, 128), but the clsmap looks wrong (4, 2, 8, 8). I expected (4, 1, 16, 16). What could I do to address this?