Confusion on the number of up-projection blocks

Thank you for your great work and kind sharing.

I have a question about the number of up-projection blocks. In your paper, you stated that:

The depth basis generator is a standard decoder that takes the output of C6 as input and stacks five up-projection blocks to generate 128 basis depth maps, and each of the basis depth maps is half the resolution of the input image.

Also, you stated that:

C1 to C6 are layers with {1,2,4,8,16,32} strides

If there are five up-projection blocks and each one doubles the resolution, the strides would proceed as follows: s=32 -> block 1 -> s=16 -> block 2 -> s=8 -> block 3 -> s=4 -> block 4 -> s=2 -> block 5 -> s=1
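To make the arithmetic concrete, here is a minimal sketch of the stride bookkeeping. It assumes every up-projection block upsamples by exactly 2x, which is my reading and not something I could confirm from the paper or the code; the function name `strides_after_blocks` is hypothetical and does not come from the repository.

```python
def strides_after_blocks(start_stride, num_blocks, factor=2):
    """Return the stride before the first block and after each block.

    Assumption: every up-projection block upsamples by `factor` (2x here),
    so the stride relative to the input image is divided by `factor`.
    """
    strides = [start_stride]
    for _ in range(num_blocks):
        strides.append(strides[-1] // factor)
    return strides


# Starting from C6 (stride 32) with five blocks, as described in the paper:
print(strides_after_blocks(32, 5))  # [32, 16, 8, 4, 2, 1]

# Four blocks from the same starting point would stop at stride 2:
print(strides_after_blocks(32, 4))  # [32, 16, 8, 4, 2]
```

Under this assumption, half resolution (stride 2) is reached after four blocks from stride 32, while five blocks would give full resolution.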

Since the final stride returns to 1 after five blocks, how can the basis depth maps be half the resolution of the input image? I am sorry, but I could not find the answer in your code. I would appreciate it if you could spare some time to help me understand this.
