Open TrepangCat opened 1 year ago
I just realized that as well and came here to see if someone already raised the issue. I will probably switch this to (2, 2, 2)
since I expect the (1, 2, 2)
stride to bias the network towards not treating the axes all the same, which is not what I want in a shift-equivariant network like a CNN.
Same here. Is there anyone who knows why SDFusion sets stride to (1, 2, 2) and why it works?
Same here. Is there anyone who knows why SDFusion sets stride to (1, 2, 2) and why it works?
@weitunglin I am 95% sure that the code is simply buggy and this is undesired. There is no reason why they would want to have a non-cube-like latent.
Appreciate your great work! When I try to run your text2shape model, I notice that stride=(1, 2, 2) in Downsample() and Upsample(). Could you tell me more reasons why you chose stride=(1, 2, 2)? Why not stride=(2, 2, 2)?