alicevision / Meshroom

3D Reconstruction Software
http://alicevision.org

Manually associating points between image pairs #494

Open ion1 opened 5 years ago

ion1 commented 5 years ago

I would like to reconstruct the relative camera poses from certain photographs but I have trouble making progress with automatic feature extraction and matching.

Hugin has a convenient UI for manually marking common points on pairs of images (with automatic subpixel precision fine-tuning on roughly marked points) but it only does panorama stitching. Is it possible to have something like this with Meshroom? Thank you!

Hugin Control Point UI

skinkie commented 5 years ago

@ion1 It is already on the wishlist, from what I read: https://github.com/alicevision/meshroom/issues/450#issuecomment-485497671

With respect to Hugin: I would hardly consider that interface convenient. Have you ever tried the 'line based' notation? And wouldn't you expect a tool to rotate the image for you to compare it? ;-)

From my perspective, a fast way for a human to orient the images (for example based on points at the horizon) could significantly reduce the search space. If this is done in concert with FeatureExtraction, that might be a very nice method.

ion1 commented 5 years ago

There seem to be some commercial products with this feature. It would be very nice to have it in an open source tool. These videos demonstrate the UI in the products:

skinkie commented 5 years ago

@ion1 the iWitness thing is like Hugin; it is rather uninformed. I would expect a tool where a single click already zooms in to the relevant portion of the screen. PhotoModeler looks to be some sort of assisted 3D modeller, which is pretty cool conceptually speaking, but it also runs entirely on human input. Defining a user input flow, or having a human confirm that the recovered direction is accurate, might reduce the search space and therefore the candidates that are proposed. I'll try to come up with such an input flow and its benefits.

french-paragon commented 4 years ago

Agisoft PhotoScan (now Metashape) has a landmark feature which is really useful for manually specifying points in images to improve alignment.

natowi commented 4 years ago

Here is a short draft description of the FeatureExtraction file format as a starting point for implementation: https://gist.github.com/natowi/ad9ce6d9f912e089bba9b69ec91d7115
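For illustration, a minimal Python sketch for reading and extending these files, assuming the plain-text .feat layout described in the gist (one feature per line: x y scale orientation). Note that the binary .desc file would also need a matching descriptor entry; how to compute one for a hand-placed point is an open question.

```python
# Minimal sketch, assuming the textual .feat layout from the gist:
# one feature per line, "x y scale orientation", whitespace-separated.

def read_feat(path):
    """Return a list of (x, y, scale, orientation) tuples."""
    features = []
    with open(path) as f:
        for line in f:
            if line.strip():
                x, y, scale, orientation = map(float, line.split())
                features.append((x, y, scale, orientation))
    return features

def append_feature(path, x, y, scale=1.0, orientation=0.0):
    """Append one manually placed feature and return its id (its line index)."""
    feature_id = len(read_feat(path))
    with open(path, "a") as f:
        f.write(f"{x} {y} {scale} {orientation}\n")
    return feature_id

# Hypothetical usage on a FeatureExtraction output file:
# feat_id = append_feature("FeatureExtraction/<viewId>.sift.feat", 512.3, 244.8)
```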

skinkie commented 4 years ago

I think this might be another good starting point for a GUI approach. https://fspy.io/

natowi commented 4 years ago

@skinkie How so?

skinkie commented 4 years ago

@natowi fspy uses a single image to find the camera position, assuming a simple lens, for example by drawing straight lines towards infinity. Compared with the straight-lines approach Hugin takes by pairing points, I would find a multi-scene fspy more intuitive for the job.
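The underlying math for this is a standard result: with square pixels, zero skew, and the principal point assumed at the image centre, the vanishing points v1 and v2 of two orthogonal scene directions satisfy (v1 - p) · (v2 - p) + f² = 0. A rough numpy sketch (the input values are made up):

```python
import numpy as np

def focal_from_vanishing_points(v1, v2, principal_point):
    """Estimate focal length in pixels from two orthogonal vanishing points.

    Assumes square pixels, zero skew and a known principal point p; then
    (v1 - p) . (v2 - p) + f^2 = 0 holds for directions orthogonal in the scene.
    """
    p = np.asarray(principal_point, dtype=float)
    d = float(np.dot(np.asarray(v1, dtype=float) - p,
                     np.asarray(v2, dtype=float) - p))
    if d >= 0.0:
        raise ValueError("vanishing points not consistent with orthogonal directions")
    return (-d) ** 0.5

# Made-up example: a 1920x1080 image with vanishing points recovered from
# two bundles of user-drawn parallel lines, as in fspy's UI.
f = focal_from_vanishing_points((3200.0, 540.0), (-250.0, 560.0), (960.0, 540.0))
print(f"estimated focal length: {f:.1f} px")
```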

natowi commented 4 years ago

insight3d also allows manual matching: http://insight3d.sourceforge.net/insight3d_tutorial.pdf

skinkie commented 4 years ago

@natowi I have spent the last few hours trying to get insight3d to compile. I now have a working GTK-2 application with a legacy OpenCV 2.4. Let's see how this works. Sadly, there are issues with locks.

skinkie commented 4 years ago

@natowi with a lot of support from @mikkelee I was able to get a version of Insight3D working with GTK+3 and OpenCV 3. It still has rough edges, but I also want to report that this software has much the same problem as Meshroom.

If you take a look at this screenshot, you can see that it can show associated image pairs, for example which images are related. As you can see in the preview, it has found an invalid relationship. This piece of software is likewise not tailored to invalidating automatically found solutions.

(screenshot: Insight3D's matrix of associated image pairs)

I think we need to go in a direction where multiple images can be roughly painted with a colour, something like a simple palette. The software would try to find both a topological solution (the relationship between images) and a geometric solution (only finding matches within the painted areas; I think this could be fully assisted, so it could already propose candidates). I think this would be quite similar to adding CCTags.
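As a sketch of the geometric half of that idea (Pillow and the helper name are my own choices, and the textual .feat layout from natowi's gist is assumed), restricting candidates to a painted region could be as simple as:

```python
# Keep only the features that fall inside a user-painted mask image
# (grayscale, non-zero = painted), aligned pixel-for-pixel with the view.
import numpy as np
from PIL import Image

def features_in_mask(feat_path, mask_path):
    """Return (feature_id, x, y) for features inside the painted area."""
    mask = np.array(Image.open(mask_path).convert("L")) > 0
    height, width = mask.shape
    kept = []
    with open(feat_path) as f:
        for feature_id, line in enumerate(f):
            x, y = map(float, line.split()[:2])
            col, row = int(round(x)), int(round(y))
            if 0 <= row < height and 0 <= col < width and mask[row, col]:
                kept.append((feature_id, x, y))
    return kept
```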

Using a topological viewer, I would like to mark the relationship between two images as valid or invalid (globally) to prevent the problems with similar patterns, but also be able to invalidate specific matches. I think the thumbnail approach of Insight3D would be a very fast way to represent this: one could then click on the matrix of thumbnails to disable invalid matches.
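The topological half could then boil down to rewriting the candidate pair list between ImageMatching and FeatureMatching. A sketch, assuming the pair-list format is one view id per line followed by its candidate match ids (this is my reading of imageMatches.txt; verify against your own output):

```python
def remove_invalid_pairs(pairs_path, invalid_pairs, out_path):
    """Drop user-invalidated pairs from a candidate pair list.

    `invalid_pairs` is a set of frozenset({viewIdA, viewIdB}) built from
    clicks on the thumbnail matrix.
    """
    with open(pairs_path) as fin, open(out_path, "w") as fout:
        for line in fin:
            ids = line.split()
            if not ids:
                continue
            ref, candidates = ids[0], ids[1:]
            kept = [c for c in candidates
                    if frozenset({ref, c}) not in invalid_pairs]
            if kept:
                fout.write(" ".join([ref] + kept) + "\n")

# Hypothetical usage with two made-up view ids:
# remove_invalid_pairs("ImageMatching/imageMatches.txt",
#                      {frozenset({"1203948471", "559847263"})},
#                      "imageMatches.filtered.txt")
```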

In addition, I heard at a photogrammetry presentation on aerial photos that the Förstner operator might resolve the kind of points humans pick better than SIFT features do.

I understand that the cool thing about Insight3D is actually the manual drawing of polygons in 3D. I haven't gotten the application to work up to that point.

julianrendell commented 4 years ago

Pix4D looks to have a nice implementation of this called “manual tie points.” It’s briefly covered in this video at around the 29:30 mark: https://youtu.be/8HuOvf4rKaw

Looks like you can select a detected feature and it brings up a gallery view of all images with that feature. You can then select an image in the gallery, zoom in, and adjust the point. Seems very intuitive.

skinkie commented 4 years ago

Around 33:19 it also shows the concept of polylines, similar to Hugin, allowing for project orientation. This is in contrast to Insight3D, which works more like tracing blotting paper.

There is an issue with this manual step: what we as humans perceive as a good feature (for example a corner or a black well) might never become a (SIFT) feature. A more human-like point detector is the Förstner interest point operator: https://www.diva-portal.org/smash/get/diva2:825802/FULLTEXT01.pdf
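For reference, the Förstner operator scores every pixel from the local structure tensor M of the image gradients: w = det(M)/trace(M) measures strength and q = 4·det(M)/trace(M)² measures roundness. A rough numpy/scipy sketch (window size and thresholds picked arbitrarily):

```python
import numpy as np
from scipy import ndimage

def foerstner_points(image, window=5, w_factor=1.0, q_min=0.5):
    """Rough sketch of the Foerstner interest operator.

    Builds the windowed structure tensor from image gradients and keeps
    pixels where w = det/trace (strength) exceeds a multiple of its mean
    and q = 4*det/trace^2 (roundness, in [0, 1]) is high enough.
    """
    img = np.asarray(image, dtype=float)
    gy, gx = np.gradient(img)
    sxx = ndimage.uniform_filter(gx * gx, window)
    syy = ndimage.uniform_filter(gy * gy, window)
    sxy = ndimage.uniform_filter(gx * gy, window)
    det = sxx * syy - sxy * sxy
    trace = sxx + syy + 1e-12           # guard against flat regions
    w = det / trace
    q = 4.0 * det / (trace * trace)
    candidates = (w > w_factor * w.mean()) & (q > q_min)
    return np.argwhere(candidates)      # (row, col) interest points
```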

@julianrendell does the software also have a feature to explicitly state that an image is not correlated with another, for example in the case of a repeating pattern?

Assisted organizing is something I would love to see in Meshroom. I think the entire strategy should be rethought based on the amount of information available. This could even work asynchronously with other parts of the software: for example, a wizard asking what kind of model is expected could run in parallel with feature extraction. Should feature extraction always happen first in a high-detail mode, or is it possible to start with a lower number of features to align the data, reducing the computing time?

julianrendell commented 4 years ago

@skinkie sorry, no idea; I found that video while looking up hints for doing photogrammetry of rooms; I don't have access to the software. I've mostly experimented with going around objects; I wasn't sure if the algorithm would work "inside out."

My 2c would be to leverage the power of the pipeline: build alternate/augmented feature matching as a separate app that generates files in the right places, with the right formats, so the rest of the Meshroom pipeline can pick them up. To me that would be a real winner, especially if Meshroom can then drive a render farm (something I'd like to set up in the future).

Then if the augmentation/alternative matching works well, it could be integrated. But as a separate app there's a lot more freedom to play with the UI, approaches, etc.

I haven't really looked at the Meshroom code, but given how modular the overall system is, perhaps it's not too much work to rip out just the basic UI frame + image gallery into a shell for implementing matching techniques, e.g. augmented matching, background removal, etc.

skinkie commented 4 years ago

@julianrendell I mostly agree. An "external app", or even a function inside Meshroom (Meshroom is just a frontend to the AliceVision/OpenMVG pipeline), could produce data that can be exchanged with the current pipeline; image pairs and feature pairs are good examples, but they are currently separate steps in the pipeline.

After implementing some "easy" features, I stumbled upon a steep learning curve with PySide2/QML. I think it is all doable, similar to the steep learning curve with EGL, but the work should probably be separated into UX design and implementation, so as not to drown in implementation details.

julianrendell commented 4 years ago

@skinkie you're miles ahead of me. I can solve problems, but I have very little domain knowledge in either computer vision or Python UI. Until we're asked to move the conversation, I think this is probably as good a place as any to bounce ideas around.

I'm guessing you'd want to run the external app after feature matching but before SfM in the default pipeline.

And that the app might want to tweak the data from the Feature Extraction, Image Matching, and Feature Matching steps, and have access to the original and corrected images from CameraInit.

So step 1 might be to (find the) document(ation) for the contents of these output directories, and the file formats. This is probably obvious for someone who knows where to look in the AliceVision library code.

Step 2, define a basic list of manual/augmented edits we'd like to have, and find an easy one :-)

Step 3, implement some VERY simple test image sets and some basic CLI scripts that show removing a feature, moving a feature in an image, changing ties, etc. (a toy sketch follows at the end of this comment)

Step 4 could be to copy https://github.com/alicevision/meshroom/tree/develop/meshroom/ui and just get an empty shell running with qmlAlembic and QtOIIO included. It looks like a big chunk of the UI interaction is done via QML (which uses JavaScript?). Or it might be easier to start from scratch...

My thinking is that laying the groundwork (steps 1-3) can be done by people without a lot of experience... step 4 may need someone with some QML & Python experience, but perhaps only as a mentor rather than a dev lead.
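To make step 3 concrete, a toy CLI for the "moving a feature" case might look like the sketch below; the script name, its flags, and the textual .feat layout (from natowi's gist) are all assumptions:

```python
#!/usr/bin/env python3
# move_feature.py -- toy sketch: nudge one feature in a textual .feat file.
# Assumes the layout "x y scale orientation", one feature per line.
import argparse

parser = argparse.ArgumentParser(description="Move feature FEATURE_ID by (dx, dy).")
parser.add_argument("feat_file")
parser.add_argument("feature_id", type=int)
parser.add_argument("dx", type=float)
parser.add_argument("dy", type=float)
args = parser.parse_args()

with open(args.feat_file) as f:
    lines = f.read().splitlines()

x, y, scale, orientation = map(float, lines[args.feature_id].split())
lines[args.feature_id] = f"{x + args.dx} {y + args.dy} {scale} {orientation}"

with open(args.feat_file, "w") as f:
    f.write("\n".join(lines) + "\n")
```

Run it against a copy of the FeatureExtraction output, then re-run the downstream nodes to see the effect.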

natowi commented 4 years ago

I have thought about this for some time: https://github.com/alicevision/meshroom/issues/494#issuecomment-571235873 The two things that require some thought are computing descriptors from features and implementing a way to sync-link all images in the gallery for setting markers. The rest is basic GUI design. This feature crosses over into other feature requests, so other potential use cases need to be taken into consideration.

But I think it would be a good starting point to put together a draft on how this feature could work.

julianrendell commented 4 years ago

@natowi I missed that gem. Subscribed to the Gist. I'm many, many, many steps behind you ;-).

Seems that step 0, before any of these great ideas can really get started, is to create some (Python) libraries (or interfaces using existing libraries) that allow reading and writing the data structures you've discovered.

If done with simple code and comments to disambiguate, it'd serve as both example and documentation.

Maybe it's time to create a meshroom-data-utils repo? A top-level directory for each of the (normal) process directories with a readme based on your gist, and a src directory for learning experiments in accessing the data programmatically?

campagnola commented 3 years ago

Has anyone made progress on this? I would love to start working on it, but don't want to duplicate any effort. I'd especially like to hear from the Meshroom project maintainers whether they are interested in receiving a PR related to this (and if so, what features they have in mind).

natowi commented 3 years ago

@campagnola As far as I know nobody has started to work on this so far. Your contribution would be highly welcome. Best talk to @fabiencastan for details.

skinkie commented 3 years ago

@campagnola to try to understand QML, I made some basic changes to the interface to allow multiple viewers. From my perspective, there are several things that should happen when tackling this issue:

  1. Test whether the manual addition of features would work in the first place (what is the minimum number, what angles are required; this should prevent the most frustrating outcome: a user enters points, but SfM still does not work, and there is no clue why not)
  2. The I/O of those manual additions, for example a node that takes a list of observation pairs (see the sketch after this list)
  3. A discussion with @fabiencastan et al. about an iterative SfM that would work between image pairs and could assess the quality of matches
  4. A functional design
  5. An interface design
  6. Understanding the QML stuff
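For point 2, just to make it concrete: the I/O could be as simple as a JSON file of observation pairs; the schema below is entirely hypothetical.

```python
# Hypothetical interchange format for point 2: a JSON list of manual
# observation pairs that a node could take as input. Field names are made up.
import json

def load_observation_pairs(path):
    """Yield (viewId_A, (xA, yA), viewId_B, (xB, yB)) tuples."""
    with open(path) as f:
        data = json.load(f)
    for obs in data["observations"]:
        a, b = obs["views"]
        yield a["viewId"], tuple(a["xy"]), b["viewId"], tuple(b["xy"])

# Example file contents:
# {"observations": [
#   {"views": [{"viewId": "1203948471", "xy": [512.3, 244.8]},
#              {"viewId": "559847263",  "xy": [498.1, 250.2]}]}
# ]}
```
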
fabiencastan commented 3 years ago

@campagnola Yes, we would be interested in seeing contributions on this topic. Could you contact me at fabien.castan@mikrosimage.com to set up a conference call to discuss it?

cnlohr commented 2 years ago

Is there any hope of adding this feature, even a manual one? I.e., I could easily produce a file that specifies image IDs and coordinate pairs which I add in manually. This would be hugely helpful for challenging situations.

sidk99 commented 1 year ago

Any updates on this feature? It would be greatly appreciated.

newillusion commented 1 month ago

Today I tried reconstructing a huge, complicated building, and Meshroom can't connect several parts of the building's photo set to each other. I tried changing some parameters, but the result stays divided. As I understand it, there is no automatic way to combine my photo sets into one mesh. There should be a tool for manual improvement, for example manually linking features, manually placing cameras, or something else.

natowi commented 1 month ago

I think https://github.com/alicevision/AliceVision/pull/1598 can be the basis for manual matching. At the moment, you can use a workaround like this one: https://github.com/alicevision/Meshroom/discussions/2331