I have a question about the number of up-projection blocks.
In your paper, you stated that:
"The depth basis generator is a standard decoder that takes the output of C6 as input and stacks five up-projection blocks to generate 128 basis depth maps, and each of the basis depth maps is half the resolution of the input image."
You also stated that:
"C1 to C6 are layers with {1,2,4,8,16,32} strides"
If there are five up-projection blocks, and each one doubles the resolution, the up-projection would proceed like this:
s=32 -> block 1 -> s=16 -> block 2 -> s=8 -> block 3 -> s=4 -> block 4 -> s=2 -> block 5 -> s=1
Since the final stride went back to 1 after five blocks, how could those basis depth maps have half the resolution of the input image?
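To make the arithmetic concrete, here is a minimal sketch of the stride bookkeeping. It assumes every up-projection block upsamples by exactly 2x, which is my reading of the paper and not something I could confirm from the code; the function name `strides_after_blocks` is made up for illustration.

```python
def strides_after_blocks(start_stride: int, num_blocks: int, factor: int = 2) -> list:
    """Return the stride before the first block and after each block.

    Assumes each up-projection block upsamples by `factor` (2x here),
    so the stride is divided by `factor` at every step.
    """
    strides = [start_stride]
    for _ in range(num_blocks):
        strides.append(strides[-1] // factor)
    return strides


# Starting from C6 (stride 32, as quoted from the paper).
five_blocks = strides_after_blocks(32, 5)
four_blocks = strides_after_blocks(32, 4)

print(five_blocks)  # [32, 16, 8, 4, 2, 1]
print(four_blocks)  # [32, 16, 8, 4, 2]
```

Under that assumption, half resolution (s=2) is reached after four blocks, not five, which is the mismatch I am asking about.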
I am sorry, but I could not find the answer in your code.
I would appreciate it if you could take the time to help me understand this.
Thank you for your great work and kind sharing.