3DOM-FBK / deep-image-matching

Multiview matching with deep-learning and hand-crafted local features for COLMAP and other SfM software. Supports high-resolution formats and images with rotations. Both CLI and GUI are supported.
https://3dom-fbk.github.io/deep-image-matching/
BSD 3-Clause "New" or "Revised" License

Make detector-free matchers (loftr, se2-loftr, roma) really working for multi-camera #24

Open franioli opened 9 months ago

franioli commented 9 months ago

All the detector-free matchers (loftr, se2-loftr, roma) work only on image pairs. Therefore, these approaches currently return matches for each image pair with a multiplicity (track length) of 2. We should think of a way to track the same matched feature across all the images, which would increase the robustness of the dense reconstruction.

A possible approach is to implement some kind of binning of the features, as Dmytro Mishkin did in https://github.com/ducha-aiki/imc2023-kornia-starter-pack
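
As a rough illustration of the binning idea (a minimal sketch under my own assumptions, not the code from the linked starter pack): keypoints of the same image that fall in the same grid cell are treated as one feature, and pairwise matches are then merged into tracks with a union-find. The function name `build_tracks`, the input layout, and the `cell_size` value are hypothetical.

```python
from collections import defaultdict


def build_tracks(pair_matches, cell_size=4.0):
    """Merge pairwise matches into multi-image tracks.

    pair_matches: dict mapping (img_a, img_b) -> list of ((xa, ya), (xb, yb)).
    Keypoints of one image that land in the same grid cell are treated as
    the same feature (the binning step). This layout is an assumption made
    for the example, not the format used by deep-image-matching.
    """
    parent = {}

    def find(node):
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path halving
            node = parent[node]
        return node

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    def to_node(image, point):
        # Quantize pixel coordinates to a grid cell.
        x, y = point
        return (image, int(x // cell_size), int(y // cell_size))

    for (img_a, img_b), matches in pair_matches.items():
        for pt_a, pt_b in matches:
            union(to_node(img_a, pt_a), to_node(img_b, pt_b))

    groups = defaultdict(set)
    for node in list(parent):
        groups[find(node)].add(node)

    # Discard ambiguous tracks that contain two cells of the same image.
    tracks = []
    for nodes in groups.values():
        images = [n[0] for n in nodes]
        if len(images) == len(set(images)):
            tracks.append(sorted(nodes))
    return tracks


# Two pairwise matches that share (almost) the same point in b.jpg.
pair_matches = {
    ("a.jpg", "b.jpg"): [((10.2, 20.1), (55.0, 60.3))],
    ("b.jpg", "c.jpg"): [((54.1, 61.9), (90.7, 15.5))],
}
tracks = build_tracks(pair_matches, cell_size=4.0)
print(len(tracks), len(tracks[0]))  # 1 3
```

With a 4 px cell, the two slightly different detections in b.jpg collapse into one node, so the two pairwise matches become a single track of length 3. A larger cell gives longer tracks at the cost of localization accuracy.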

SamueleBumbaca commented 1 hour ago

In this thread, https://github.com/colmap/colmap/issues/2545, they propose hloc/match_dense to build longer tracks. I tried running it after loftr matching (hloc/match_dense supports loftr), but after matches.h5 is written, export_to_db fails with TypeError: Accessing a group is done with bytes or str, not <class 'tuple'>. Reproducible example:

`from importlib import import_module
from pprint import pprint

from deep_image_matching import logger, timer
from deep_image_matching.config import Config
from deep_image_matching.image_matching import ImageMatching
from deep_image_matching.io.h5_to_db import export_to_colmap
from deep_image_matching.hloc import match_dense 

import yaml

cli_params = {
    "dir": "/path/",
    "pipeline": "loftr",
    "strategy": "matching_lowres",
    "quality": "lowest",
    "tiling": "None",
    "skip_reconstruction": False,
    "force": True,
    "camera_options": "/path/cameras.yaml",
    "openmvg": None,
}
config = Config(cli_params)

imgs_dir = config.general["image_dir"]
output_dir = config.general["output_dir"]
matching_strategy = config.general["matching_strategy"]
extractor = config.extractor["name"]
matcher = config.matcher["name"]

img_matching = ImageMatching(
    imgs_dir=imgs_dir,
    output_dir=output_dir,
    matching_strategy=matching_strategy,
    local_features=extractor,
    matching_method=matcher,
    pair_file=config.general["pair_file"],
    retrieval_option=config.general["retrieval"],
    overlap=config.general["overlap"],
    existing_colmap_model=config.general["db_path"],
    custom_config=config.as_dict(),
)

pair_path = img_matching.generate_pairs()
timer.update("generate_pairs")

if config.general["upright"]:
    img_matching.rotate_upright_images()
    timer.update("rotate_upright_images")

feature_path = img_matching.extract_features()
timer.update("extract_features")

match_path = img_matching.match_pairs(feature_path)
timer.update("matching")

if config.general["upright"]:
    img_matching.rotate_back_features(feature_path)
    timer.update("rotate_back_features")

with open(config.general["camera_options"], "r") as file:
    camera_options = yaml.safe_load(file)

confs = {
    "loftr": {
        "output": "matches-loftr",
        "model": {"name": "loftr", "weights": "outdoor"},
        "preprocessing": {"grayscale": True, "resize_max": 1024, "dfactor": 8},
        "max_error": 1,  # max error for assigned keypoints (in px)
        "cell_size": 1,  # size of quantization patch (max 1 kp/patch)
    },
}

match_dense.match_and_assign(
    conf=confs["loftr"],
    pairs_path=config.general["pair_file"],
    image_dir=config.general["image_dir"],
    match_path=match_path,  # out
    feature_path_q=feature_path,  # out
    feature_paths_refs=[feature_path],  # Optional[List[Path]], default []
    max_kps=8192,  # Optional[int], default 8192
    overwrite=True,
)

camera_options = {
    "general": {
        "camera_model": "simple-radial",  # ["simple-pinhole", "pinhole", "simple-radial", "opencv"]
        "openmvg_camera_model": "pinhole_radial_k3",
        "single_camera": True,
    }
}

database_path = output_dir / "database.db"

export_to_colmap(
    img_dir=imgs_dir,
    feature_path=feature_path,
    match_path=match_path,
    database_path=database_path,
    camera_options=camera_options,
)
timer.update("export_to_colmap")

---------------------------------------------------------------------------

TypeError: Accessing a group is done with bytes or str, not <class 'tuple'>
`

This happens because match_dense writes matches.h5 with a different layout than match_pairs. As a result, add_matches fails at matches = group[key_2][()]: group[key_2] is an HDF5 group with keys <KeysViewHDF5 ['keypoints0', 'keypoints1', 'matches0', 'matching_scores0', 'scores']> rather than an array, which is what match_pairs writes. 'keypoints0' and 'keypoints1' do point to arrays, but if we read one of them with group[key_2]['keypoints0'] instead of group[key_2][()] (by modifying export_to_colmap or h5_to_db.py), the resulting matching is incorrect.
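
A possible reason the swap gives wrong matches, sketched under assumptions: as far as I understand the hloc convention, 'matches0' is a per-keypoint vector where entry i is the index of the matched keypoint in the second image, or -1 if keypoint i is unmatched, while 'keypoints0' holds coordinates rather than indices. If the exporter expects explicit index pairs (an N x 2 layout, which is my assumption about what match_pairs writes), the vector would have to be converted first. The helper name `matches0_to_pairs` is hypothetical.

```python
def matches0_to_pairs(matches0):
    """Turn a per-keypoint match vector into explicit index pairs.

    matches0[i] is the index of the keypoint in image 1 matched to
    keypoint i of image 0, or -1 when keypoint i has no match
    (assumed hloc-style convention).
    """
    return [[i, int(j)] for i, j in enumerate(matches0) if j >= 0]


matches0 = [2, -1, 0, -1, 1]
pairs = matches0_to_pairs(matches0)
print(pairs)  # [[0, 2], [2, 0], [4, 1]]
```

This only addresses the index layout; whether the keypoint indices produced by match_dense line up with the keypoints stored in the features file used for the export is a separate question.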

lcmrl commented 55 minutes ago

Hi, I forgot to close this issue. The dev branch now implements an approach similar to the one you mentioned for clustering keypoints. Please see https://github.com/3DOM-FBK/deep-image-matching/blob/dev/main.py

# Run image matching
feature_path, match_path = matcher.run()

# Export in colmap format
database_path = output_dir / "database.db"
dim.io.export_to_colmap(
    img_dir=imgs_dir,
    feature_path=feature_path,
    match_path=match_path,
    database_path=database_path,
    camera_config_path=config.general["camera_options"],
)

if matcher.matching in ["loftr", "se2loftr", "roma"]:
    images = os.listdir(imgs_dir)
    image_format = Path(images[0]).suffix
    LoftrRomaToMultiview(
        input_dir=feature_path.parent,
        output_dir=feature_path.parent,
        image_dir=imgs_dir,
        img_ext=image_format,
    )

Hope this solves the problem.