Depth map size ratios for depth ControlNet

Hello, and congratulations on this research. I have been having a very good time experimenting with this pipeline. However, I have a question.

I loaded a mesh and rendered six depth maps to arrange in a 3x2 grid similar to the one shown in the paper. Because the rendered depth maps usually differ in size, I padded each one to match the size of the largest of the six images. However, this produces a grid in which the apparent size of the object varies quite noticeably from tile to tile; please see the image below. The padded areas are in black, and the depth maps rendered at -20 degree elevation are considerably smaller than those rendered at 30 degree elevation.

(Image: 0000_cropped_depth_grid)
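A minimal sketch of the padding approach described above, to make the setup concrete. It assumes each depth map is a 2-D NumPy array with a zero (black) background, that the maps are centered when padded, and that the grid is filled row by row with 3 rows and 2 columns. The names `pad_to_size` and `padded_grid` are hypothetical helpers, not part of the zero123plus code.

```python
import numpy as np

def pad_to_size(depth, height, width):
    """Center a 2-D depth map on a black (zero) canvas of the given size."""
    h, w = depth.shape
    top = (height - h) // 2
    left = (width - w) // 2
    canvas = np.zeros((height, width), dtype=depth.dtype)
    canvas[top:top + h, left:left + w] = depth
    return canvas

def padded_grid(depth_maps, rows=3, cols=2):
    """Pad every map to the largest height and width, then tile row by row.

    Assumption: 3 rows x 2 columns; swap the defaults if your layout differs.
    """
    assert len(depth_maps) == rows * cols
    height = max(d.shape[0] for d in depth_maps)
    width = max(d.shape[1] for d in depth_maps)
    tiles = [pad_to_size(d, height, width) for d in depth_maps]
    grid_rows = [np.hstack(tiles[r * cols:(r + 1) * cols]) for r in range(rows)]
    return np.vstack(grid_rows)
```

Because padding never rescales the content, a map that was rendered small stays small inside its tile, which is the size mismatch visible in the grid.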

In contrast, the depth maps shown in the Zero123++ paper all appear very similar in size to one another. This made me wonder how the size of each depth map was determined during training. Were the depth maps simply resized to match the size of the largest depth map (as opposed to filling the empty space with padding)?
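For comparison, here is a sketch of the resizing alternative I am asking about. This is only my guess at what such preprocessing could look like, not the authors' confirmed method. It assumes a zero-valued background, a square 320-pixel tile, a 10% margin, and nearest-neighbor resampling; `fit_to_tile` and all of its parameters are hypothetical.

```python
import numpy as np

def fit_to_tile(depth, tile=320, margin=0.1):
    """Crop to the non-zero bounding box, scale uniformly, center on a tile.

    Assumptions: background pixels are exactly 0, and the tile size and
    margin are illustrative values rather than the ones used in training.
    """
    ys, xs = np.nonzero(depth)
    if ys.size == 0:
        # Empty render: return an all-black tile.
        return np.zeros((tile, tile), dtype=depth.dtype)
    crop = depth[ys.min():ys.max() + 1, xs.min():xs.max() + 1]
    h, w = crop.shape

    # One scale factor for both axes, so the aspect ratio is preserved.
    scale = tile * (1.0 - 2.0 * margin) / max(h, w)
    new_h = max(1, int(round(h * scale)))
    new_w = max(1, int(round(w * scale)))

    # Nearest-neighbor resampling via index lookup (no extra dependencies).
    rows = np.minimum((np.arange(new_h) / scale).astype(int), h - 1)
    cols = np.minimum((np.arange(new_w) / scale).astype(int), w - 1)
    resized = crop[rows[:, None], cols[None, :]]

    out = np.zeros((tile, tile), dtype=depth.dtype)
    top = (tile - new_h) // 2
    left = (tile - new_w) // 2
    out[top:top + new_h, left:left + new_w] = resized
    return out
```

With this normalization the longer side of every object's bounding box spans the same number of pixels, so all six tiles would look similar in size. The trade-off is that relative scale between views is lost, which is why I would like to know what was actually done during training.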

Thanks in advance.

SUDO-AI-3D / zero123plus