Orientation of photos (without exif hint) doesn't give Camera Reconstruction

skinkie commented 5 years ago

We have been using a Samsung J3 phone camera, resulting in 8MP photos. Each photo contains EXIF information concerning the orientation of the camera (landscape, portrait). Later, in a separate project, we used a Nikon Coolpix S3700 20.1MP camera, which did not write EXIF hints regarding how the camera was held. A mixed dataset of 384 images was submitted to Meshroom, which resulted in only 13 cameras being reconstructed. Feature Extraction used the SIFT Describer Type with Describer Preset high. After manually orienting all images, all except two images were reconstructed.

I have not seen a reference before stating that the camera orientation cannot be recovered from a collection of similar (rotated) describers, and it does not make sense to me. But if this is something that should not be overlooked, I am happy to add this to the wiki. Additionally, I would be interested to know why this cannot be recovered.

fabiencastan commented 5 years ago

I think it's not related to the change in metadata, as we don't read the EXIF orientation information at all.

The pipeline is quite repeatable, except for one sensitive aspect in the SfM pipeline: the selection of the initial image pair. The selection contains randomness in the robust 2-view estimation. So if you have multiple candidates with a similar score, from one run to another the reconstruction may start in a completely different place of the scene, and then the results may change a lot. For instance, you can start in one part of the scene with a few images correctly connected together but poorly connected to the rest of the scene, which may explain your very different results between 2 runs.

One thing you can do is to make a SfM with minInputTrackLength set to 3 or 4 to keep only the robust matches and then add another SfM with the standard parameters (you can chain multiple SfM, so the second one will try again to localize the cameras not found by the first one).
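
To illustrate why a higher minInputTrackLength keeps only the more robust matches, here is a minimal sketch. It assumes a track is a set of (image, feature) observations linked by pairwise matches, and that the parameter is a minimum on the number of distinct images observing a track. The helper names `build_tracks` and `filter_tracks` are hypothetical and this is not the AliceVision implementation.

```python
from collections import defaultdict


def build_tracks(matches):
    """Group pairwise feature matches into tracks using a small union-find.

    `matches` is a list of ((image_id, feature_id), (image_id, feature_id)) pairs.
    Returns a list of sets of (image_id, feature_id) observations.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in matches:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b

    tracks = defaultdict(set)
    for observation in list(parent):
        tracks[find(observation)].add(observation)
    return list(tracks.values())


def filter_tracks(tracks, min_input_track_length=2):
    """Keep tracks seen in at least `min_input_track_length` distinct images (assumed semantics)."""
    return [
        track
        for track in tracks
        if len({image_id for image_id, _ in track}) >= min_input_track_length
    ]


# Toy data: one feature followed through images A-D, another seen only in A and B.
matches = [
    (("A", 1), ("B", 7)),
    (("B", 7), ("C", 3)),
    (("C", 3), ("D", 9)),
    (("A", 2), ("B", 8)),
]
tracks = build_tracks(matches)
robust_tracks = filter_tracks(tracks, min_input_track_length=3)
```

With the toy data, the two-image track is dropped at a threshold of 3 while the four-image track survives. Fewer but longer tracks give the first SfM a more reliable skeleton, which a second SfM with the default setting can then try to extend.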

skinkie commented 5 years ago

@fabiencastan Thanks for your reply. It makes sense that the initialPair selection is very important. I am currently testing the effects on a dataset of 770 images, currently with describer preset low. Using the default setting of minInputTrackLength, only 4 images are reconstructed. I can agree that increasing it (up to 5) gives better matching, where 11 images are reconstructed. I am trying to reproduce this effect with a normal describer preset.

fabiencastan commented 5 years ago

I would recommend keeping the describer preset at normal and changing parameters like minInputTrackLength on the SfM if the SfM time is really a problem.

skinkie commented 5 years ago

The describer preset for the 770 photos is normal, and minInputTrackLength has been varied between 2 and 6.

minInputTrackLength=2, 42
minInputTrackLength=3, 43
minInputTrackLength=4, 104
minInputTrackLength=5, 42
minInputTrackLength=6, 24

fabiencastan commented 5 years ago

When you do the SfM with minInputTrackLength=4, have you tried to add another SfM after with the default parameters?

skinkie commented 5 years ago

As in chaining multiple SfM? With Output of the First as Input for the Second?

hargrovecompany commented 5 years ago

Interesting....I'll try mine again, but I'm pretty sure that I had my orientation all over the place and Meshroom figured it out.

Skinkie - just out of curiosity, did you have a good white balance and color match between the two cameras?

skinkie commented 5 years ago

@hargrovecompany I did not mix cameras. The white balance was set to fixed on the Coolpix. If anyone is interested in the dataset (even GPS annotated), I am happy to contribute.

hargrovecompany commented 5 years ago

ahhhhh....sorry about that. i am jumping through hoops right now trying to get good results from shots from two types of cameras....

skinkie commented 5 years ago

@hargrovecompany I have also tried that; the augment images option works quite well for me. Last evening we shot an extra set of photos of a simpler model. Same camera, 10x more ISO; filling up with the augmentation process was very successful. So I think what @fabiencastan means is: get the ballpark of the matching right, then try a second phase with augmentation. But if it is possible with only the SfM node, I am also interested in how.

hargrovecompany commented 5 years ago

I appreciate your advice! I just want to clarify...when you say "augment images" are you talking about processing color/white balance? I just set up "argyll" so that I can do automated color correction using a color card and calibration....I'm wondering if that is the type of augmentation you're talking about?

skinkie commented 5 years ago

@hargrovecompany When you drag images into the GUI, two options appear. I mean the bottom option. Today we saw again that pre-rotating still appears to help. We will investigate that further.

skinkie commented 5 years ago

@fabiencastan Our last test is the following. We added all photos (1617) to a new project, with Describer Preset normal and minInputTrackLength=4. This results in a reconstruction of 274 images. Chaining that to a normal SfM gives an even poorer result.

Original (minInputTrackLength=4):

input images: 1617
cameras calibrated: 274
poses: 274
landmarks: 20547

Chained (minInputTrackLength=2):

input images: 1617
cameras calibrated: 130
poses: 130
landmarks: 25924

Hence I assume that we have not implemented your suggestion to chain correctly, since it does not improve the quality.

Our batch-based project seems to have better overall results, if the KPI is the resulting point cloud. But I fail to understand why the results for the batch with all images are poor, since the same image set was used and the reconstruction itself is pretty much all over the place; it is certainly not a small area being reconstructed.
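
To put the two runs side by side, a quick sketch that turns the statistics reported above into ratios. The dictionary layout and the `summarize` helper are only illustrative; the numbers are copied from the two SfM reports.

```python
# Statistics copied from the two SfM reports above.
runs = {
    "original (minInputTrackLength=4)": {"input_images": 1617, "poses": 274, "landmarks": 20547},
    "chained (minInputTrackLength=2)": {"input_images": 1617, "poses": 130, "landmarks": 25924},
}


def summarize(stats):
    """Return the fraction of input images with a pose and the landmarks per posed camera."""
    return {
        "pose_ratio": stats["poses"] / stats["input_images"],
        "landmarks_per_pose": stats["landmarks"] / stats["poses"],
    }


for name, stats in runs.items():
    summary = summarize(stats)
    print(
        f"{name}: {summary['pose_ratio']:.1%} of images posed, "
        f"{summary['landmarks_per_pose']:.1f} landmarks per pose"
    )
```

The chained run poses roughly 8% of the input images against roughly 17% for the first run, even though it reports more landmarks.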

Result for adding all images at once:

(see bug #493, SfM2 only appears after reloading, and is not complementary)

Result for batch adding images:

skinkie commented 5 years ago

@fabiencastan @natowi Can you disable this stale thing? It does not make sense.

alicevision / Meshroom

Orientation of photos (without exif hint) doesn't give Camera Reconstruction #484