Open TrepangCat opened 10 months ago
I just noticed this as well and came here to see whether someone had already raised the issue. I will probably switch this to stride=(2, 2, 2), since I expect the (1, 2, 2) stride to bias the network toward treating the first axis differently from the other two, which is not what I want in a shift-equivariant network like a CNN.
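To make the difference concrete, here is a minimal sketch of how the two strides change the shape of a volume. It uses the standard convolution output-size formula in plain Python rather than the repository's own Downsample() code. The 32x32x32 input, the kernel size of 3, and the padding of 1 are assumptions chosen for illustration, not values taken from the repo, and `conv_out_size` and `downsample_shape` are hypothetical helper names.

```python
def conv_out_size(size, kernel, stride, padding):
    """Output length along one axis of a convolution (dilation 1)."""
    return (size + 2 * padding - kernel) // stride + 1


def downsample_shape(shape, stride, kernel=3, padding=1):
    """Apply the per-axis formula to a 3D volume shape.

    kernel=3 and padding=1 are assumed here for illustration only.
    """
    return tuple(
        conv_out_size(n, kernel, s, padding) for n, s in zip(shape, stride)
    )


volume = (32, 32, 32)  # hypothetical voxel grid, not taken from the repo

anisotropic = downsample_shape(volume, stride=(1, 2, 2))
isotropic = downsample_shape(volume, stride=(2, 2, 2))

print(anisotropic)  # (32, 16, 16): the first axis keeps full resolution
print(isotropic)    # (16, 16, 16): all three axes are halved
```

With (1, 2, 2) the first axis is never reduced, so after several downsampling stages the feature volume becomes strongly elongated along that axis, whereas (2, 2, 2) treats all three axes the same way.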
I appreciate your great work! When I tried to run your text2shape model, I noticed that Downsample() and Upsample() use stride=(1, 2, 2). Could you explain why you chose stride=(1, 2, 2) rather than stride=(2, 2, 2)?