colmap / l3mhet / 3DGS on a dataset with rotated cameras

I'm testing on a dataset that consists of 21 cameras, arranged in a cylinder-shaped layout surrounding the subject. All have the same lens, but different orientations: some are landscape-oriented, some are portrait-oriented, and some are portrait and upside-down. The image files themselves are all stored as standard landscape, so they share the same HxW dimensions, but the content of the portrait images appears sideways.

The default run_colmap.py setup fails to produce a good result on this input. I also tried rotating the images so that they are all right-side-up, but the aspect ratios are mixed: some HxW, some WxH. Here are the originals (top), and "corrected" (bottom), for two cameras:

As a test, I ran colmap gui Automatic Reconstruction on the images. The standard landscape set failed badly. The mixed-ratio "corrected" set was better, but still not great (10 out of 21 cameras solved, but many incorrectly). I tried again in Agisoft Metashape, which threw resolution/aspect warnings on the "corrected" set, but solved for 19/21 cameras:

What is the best way to handle input data like this? Thoughts:

Using the original (top) images, is it possible to get colmap (or metashape, etc.) to correctly solve the cameras, that is, to give matrices that specify a 90deg rotation for the portrait cameras? (Possibly by specifying orientation in the EXIF data, or other camera input intrinsics.) And will l3mhet / 3DGS run well with this input?
If I can instead get a camera solve on the "corrected" mixed-ratio (bottom) images, will l3mhet / 3DGS even accept mixed HxW and WxH image input?
I could take the "corrected" images, and embed the portrait-mode ones in a landscape-ratio HxW image with black bars, so that all are HxW aspect ratio. But I'm assuming this would cause incorrect solves for the portrait cams... unless there's a per-camera crop window parameter I can feed in.

Hi, thanks for using our project! Your case is indeed unusual. I would suggest first manually rotating all images so that they are right-side-up, as you mentioned, before any further processing. As far as I know, both COLMAP and EasyVolcap can handle such mixed aspect ratios correctly.
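To make the rotation step concrete, here is a minimal sketch using NumPy. The per-camera table of quarter turns and the camera names are hypothetical; you would fill them in by hand for your 21 cameras, and load/save the images with whatever image library you already use.

```python
import numpy as np

# Hypothetical table: number of clockwise quarter turns (90 deg each)
# needed to bring each camera's stored image right-side-up.
QUARTER_TURNS_CW = {"cam00": 0, "cam01": 1, "cam02": 3, "cam03": 2}


def make_upright(image: np.ndarray, quarter_turns_cw: int) -> np.ndarray:
    """Rotate an H x W (x C) image clockwise by quarter_turns_cw * 90 degrees."""
    # np.rot90 rotates counterclockwise for positive k, so negate k.
    rotated = np.rot90(image, k=-quarter_turns_cw)
    # rot90 returns a view; copy so the result is contiguous for saving.
    return np.ascontiguousarray(rotated)


# A landscape 1080 x 1920 frame becomes 1920 x 1080 after one quarter turn.
frame = np.zeros((1080, 1920, 3), dtype=np.uint8)
print(make_upright(frame, QUARTER_TURNS_CW["cam01"]).shape)
```

An odd number of quarter turns swaps height and width, which is where the mixed HxW / WxH set comes from.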

For COLMAP, since your lenses are all the same, another thing to try is to first use only the images that share one orientation (all landscapes or all portraits, assuming there are enough of them) to estimate a shared set of intrinsic parameters, and then tell COLMAP to use these intrinsics as input, manually adjusting them for the other orientations. Another piece of software to try is RealityCapture; internally, we find it more robust and faster than the alternatives.
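As a rough illustration of what "adjusting the intrinsics for the other orientations" means, here is a sketch for a plain pinhole model. It assumes continuous pixel coordinates with the image spanning [0, W] x [0, H] (which, as far as I know, matches COLMAP's convention of pixel centers at +0.5), and it ignores distortion: radial terms should be unaffected by a 90deg rotation, but tangential terms would need separate handling. The `Pinhole` class, the helper names, and the example numbers are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pinhole:
    """Pinhole intrinsics for an image of size width x height, in pixels."""
    width: float
    height: float
    fx: float
    fy: float
    cx: float
    cy: float


def rotate_intrinsics(cam: Pinhole, quarter_turns_cw: int) -> Pinhole:
    """Intrinsics after rotating the image clockwise by quarter_turns_cw * 90 deg."""
    for _ in range(quarter_turns_cw % 4):
        # One clockwise quarter turn maps pixel (u, v) to (height - v, u):
        # width/height and fx/fy swap, and the principal point moves with
        # the pixels.
        cam = Pinhole(
            width=cam.height,
            height=cam.width,
            fx=cam.fy,
            fy=cam.fx,
            cx=cam.height - cam.cy,
            cy=cam.cx,
        )
    return cam


def colmap_pinhole_params(cam: Pinhole) -> str:
    """Comma-separated fx,fy,cx,cy, the parameter order of COLMAP's PINHOLE model."""
    return f"{cam.fx},{cam.fy},{cam.cx},{cam.cy}"


# Hypothetical shared intrinsics estimated from the landscape cameras only.
shared = Pinhole(width=1920.0, height=1080.0, fx=1500.0, fy=1500.0, cx=970.0, cy=530.0)

for turns in (0, 1, 2, 3):
    print(turns, colmap_pinhole_params(rotate_intrinsics(shared, turns)))
```

The quarter-turn count for each camera should be the same one used to rotate its image upright. Negative counts work as counterclockwise turns, since `-1 % 4` is 3 in Python.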

As for l3mhet and 3DGS support, EasyVolcap should be able to handle such input correctly, since we make no assumptions about the input image dimensions.

I wouldn't try padding the images with zeros (black bars), since it might confuse the solvers in COLMAP or other calibration software.

zju3dv / EasyVolcap, issue #31