How to avoid resulting in multiple models?

linhanwang commented 1 year ago

I'm using pixsfm on a large dataset of 1900 images. I have coarse image poses, so I use pairs_from_poses. I tried pixsfm but got multiple small models. Do you have any tips for getting a single complete model? Should I use a larger num_matched (currently 100)?

This is my code. It's from https://github.com/cvg/pixel-perfect-sfm/issues/23.

from pathlib import Path

from hloc import extract_features, match_features, pairs_from_poses
from pixsfm.refine_hloc import PixSfM

# hparams and output_path are defined elsewhere in the original script.
feature_conf = extract_features.confs['superpoint_max']
matcher_conf = match_features.confs['superglue']

images_path = Path(hparams.images_path)
model_path = Path(hparams.model_path)

# Keep every hparams.use_every-th image as a mapping image.
images = sorted(images_path.iterdir())
references = [str(images[i].relative_to(images_path)) for i in range(0, len(images), hparams.use_every)]
print(len(references), 'mapping images')

features_path = output_path / 'features.h5'
sfm_pairs_path = output_path / 'pairs-sfm.txt'
matches_path = output_path / 'matches.h5'

extract_features.main(feature_conf, images_path, image_list=references, feature_path=features_path)
# pairs_from_gps.main(sfm_pairs_path, images_path, references, 100, 4)
# Pair each image with its 100 nearest neighbours according to the coarse poses.
pairs_from_poses.main(model_path, sfm_pairs_path, 100)

print('Begin matching features')
match_features.main(matcher_conf, sfm_pairs_path, features=features_path, matches=matches_path)

# Low-memory configuration: cached dense features, costmap-based bundle adjustment.
sfm = PixSfM(conf={"dense_features": {"use_cache": True},
                   'KA': {'dense_features': {'use_cache': True}, 'max_kps_per_problem': 1000, "strategy": "topological_reference"},
                   'BA': {'strategy': 'costmaps'}})

print('Begin sfm')

ref_dir = output_path / 'ref'
refined, sfm_outputs = sfm.reconstruction(ref_dir, images_path, sfm_pairs_path, features_path, matches_path,
                                          image_list=references, verbose=True)
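
One thing that may be worth checking before raising the number of matched pairs: if the pair graph written to pairs-sfm.txt is already split into several disconnected groups, the reconstruction has no way to join them into one model. The sketch below is a standalone diagnostic, not part of hloc or pixsfm. `read_pairs` and `pair_graph_components` are hypothetical helper names, and the code assumes the pairs file holds one `name_a name_b` entry per line with no spaces inside image names.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple


def read_pairs(path) -> List[Tuple[str, str]]:
    """Read a pairs file with one 'name_a name_b' entry per line."""
    with open(path) as f:
        return [tuple(line.split()[:2]) for line in f if line.strip()]


def pair_graph_components(pairs: Iterable[Tuple[str, str]]) -> List[Set[str]]:
    """Group image names into connected components of the (undirected) pair graph."""
    neighbors: Dict[str, Set[str]] = defaultdict(set)
    for a, b in pairs:
        neighbors[a].add(b)
        neighbors[b].add(a)

    seen: Set[str] = set()
    components: List[Set[str]] = []
    for start in neighbors:
        if start in seen:
            continue
        # Iterative depth-first search from an image that has not been visited yet.
        component: Set[str] = set()
        stack = [start]
        while stack:
            name = stack.pop()
            if name in seen:
                continue
            seen.add(name)
            component.add(name)
            stack.extend(neighbors[name] - seen)
        components.append(component)
    # Largest component first.
    return sorted(components, key=len, reverse=True)
```

Calling `pair_graph_components(read_pairs(sfm_pairs_path))` and printing the component sizes shows whether the pairs alone already force a split. A single component is necessary but not sufficient: a pair can be listed and still fail to yield enough verified matches.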

paidiakileswar commented 4 months ago

Can you tell me how you used the geo positions in pixsfm? I don't see any pairs_from_gps in the hloc or pixsfm repos. If it's OK, could you share the code needed for this call: pairs_from_gps.main(sfm_pairs_path, images_path, references, 100, 4)?

It would be helpful for my research. Best regards,

@linhanwang

linhanwang commented 4 months ago

This is the code I used. It comes from @hturki's https://github.com/cmusatyalab/mega-nerf; I don't remember exactly where I found it.

import argparse
import math
import xml.etree.ElementTree as ET
from datetime import datetime
from pathlib import Path
from typing import List, Tuple

import numpy as np
import pymap3d as pm
import scipy.spatial
import torch
from PIL import Image
from transformations import transformations

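# These relative imports assume the file sits inside the hloc package (for example as hloc/pairs_from_gps.py).
from . import logger
from .pairs_from_retrieval import pairs_from_score_matrix
from dateutil import parser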

def get_gps_pos(image_path: Path) -> Tuple[float, float, float, np.ndarray, datetime]:
    img = Image.open(image_path)
    found = False
    for segment, content in img.applist:
        marker, body = content.split(b'\x00', 1)
        if segment == 'APP1' and marker == b'http://ns.adobe.com/xap/1.0/':
            root = ET.fromstring(body)

            anafi_lat = root[0][0].find('{http://ns.adobe.com/exif/1.0/}GPSLatitude')

            if anafi_lat is not None:
                lat = anafi_lat.text
                split = lat.split(',')
                lat = float(split[0]) + float(split[1][:-1]) / 60
                if split[1][-1] == 'S':
                    lat *= -1

                long = root[0][0].find('{http://ns.adobe.com/exif/1.0/}GPSLongitude').text
                split = long.split(',')
                long = float(split[0]) + float(split[1][:-1]) / 60
                if split[1][-1] == 'W':
                    long *= -1

                alt = root[0][0].find('{http://ns.adobe.com/exif/1.0/}GPSAltitude').text
                if '/' in alt:
                    split = alt.split('/')
                    alt = float(split[0]) / float(split[1])
                else:
                    alt = float(alt)

                roll = float(root[0][0].find('{http://www.parrot.com/drone-parrot/1.0/}CameraRollDegree').text)
                pitch = float(root[0][0].find('{http://www.parrot.com/drone-parrot/1.0/}CameraPitchDegree').text)
                yaw = float(root[0][0].find('{http://www.parrot.com/drone-parrot/1.0/}CameraYawDegree').text)

                orientation = transformations.euler_matrix(roll * math.pi / 180, pitch * math.pi / 180,
                                                           yaw * math.pi / 180)[:3, :3]

                date = parser.parse(root[0][0].find('{http://ns.adobe.com/exif/1.0/}DateTimeOriginal').text)
            else:
                lat = float(root[0][0].get('{http://www.dji.com/drone-dji/1.0/}GpsLatitude'))
                long = float(root[0][0].get('{http://www.dji.com/drone-dji/1.0/}GpsLongtitude'))
                alt = float(root[0][0].get('{http://www.dji.com/drone-dji/1.0/}AbsoluteAltitude'))

                flight_roll = float(root[0][0].get('{http://www.dji.com/drone-dji/1.0/}FlightRollDegree'))
                flight_pitch = float(root[0][0].get('{http://www.dji.com/drone-dji/1.0/}FlightPitchDegree'))
                flight_yaw = float(root[0][0].get('{http://www.dji.com/drone-dji/1.0/}FlightYawDegree'))

                gimbal_roll = float(root[0][0].get('{http://www.dji.com/drone-dji/1.0/}GimbalRollDegree'))
                gimbal_pitch = float(root[0][0].get('{http://www.dji.com/drone-dji/1.0/}GimbalPitchDegree'))
                gimbal_yaw = float(root[0][0].get('{http://www.dji.com/drone-dji/1.0/}GimbalYawDegree'))

                orientation = (transformations.euler_matrix(flight_roll * math.pi / 180,
                                                            flight_pitch * math.pi / 180,
                                                            flight_yaw * math.pi / 180)
                               @ transformations.euler_matrix(gimbal_roll * math.pi / 180,
                                                              gimbal_pitch * math.pi / 180,
                                                              gimbal_yaw * math.pi / 180))[:3, :3]

                date = parser.parse(img._getexif()[36867])

            found = True
            break
    if not found:
        raise Exception('Did not find metadata for {}'.format(image_path))

    return lat, long, alt, orientation, date

def main(output,
         image_dir: Path,
         image_list: List[str],
         closest_geo: int,
         closest_time: int):
    Rs = []
    ts = []
    dates = []
    ref_lat = None
    ref_long = None
    ref_alt = None
    for image_id in image_list:
        lat, long, alt, R, date = get_gps_pos(image_dir / image_id)
        if ref_lat is None:
            ref_lat = lat
            ref_long = long
            ref_alt = alt

        Rs.append(torch.FloatTensor(R).unsqueeze(0))
        ts.append(torch.FloatTensor(pm.geodetic2ned(lat, long, alt, ref_lat, ref_long, ref_alt)).unsqueeze(0))
        dates.append(date.timestamp())

    logger.info(f'Obtaining pairwise distances between {len(image_list)} images...')

    Rs = torch.cat(Rs)
    ts = torch.cat(ts)
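    # Unix timestamps are around 1e9; float32 cannot resolve seconds at that magnitude, so use float64.
    dates = torch.tensor(dates, dtype=torch.float64)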

    pos_dist = torch.cdist(ts, ts)
    date_dist = torch.cdist(dates.unsqueeze(-1), dates.unsqueeze(-1))

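    pairs = []
    # torch.topk raises if k exceeds the number of images, so clamp both neighbour counts.
    # The "+ 1" accounts for each image being its own nearest neighbour.
    k_geo = min(closest_geo + 1, len(image_list))
    k_time = min(closest_time + 1, len(image_list))
    for i in range(len(image_list)):
        _, closest_pos = torch.topk(pos_dist[i], k_geo, largest=False)
        _, closest_date = torch.topk(date_dist[i], k_time, largest=False)
        # Union of the spatial and temporal neighbours, as plain Python ints.
        for j in torch.cat((closest_pos, closest_date)).unique().tolist():
            if i == j:
                continue
            pairs.append((image_list[i], image_list[j]))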

    logger.info(f'Found {len(pairs)} pairs.')
    with open(output, 'w') as f:
        f.write('\n'.join(' '.join(p) for p in pairs))

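if __name__ == "__main__":
    # Named arg_parser so it does not shadow the dateutil `parser` module used in get_gps_pos.
    arg_parser = argparse.ArgumentParser()
    arg_parser.add_argument('--output', required=True, type=Path)
    arg_parser.add_argument('--image_dir', required=True, type=Path)
    arg_parser.add_argument('--closest_geo', required=True, type=int)
    arg_parser.add_argument('--closest_time', required=True, type=int)

    args = arg_parser.parse_args()
    # Assumption: every file directly inside image_dir is a mapping image.
    image_list = sorted(p.name for p in args.image_dir.iterdir())
    main(args.output, args.image_dir, image_list, args.closest_geo, args.closest_time)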
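
For readers who only want the selection rule without the EXIF/XMP parsing, here is a minimal pure-Python sketch of what the pairing loop does: for each image, take the union of its `closest_geo` nearest neighbours by position and its `closest_time` nearest neighbours by capture time. The names `nearest_indices` and `select_pairs` are hypothetical, positions are assumed to be already converted to a local metric frame, and the O(n²) loops are meant for illustration rather than for 1900 images.

```python
import math
from typing import List, Sequence, Tuple


def nearest_indices(distances: Sequence[float], k: int) -> List[int]:
    """Indices of the k smallest distances (ties keep their original order)."""
    return sorted(range(len(distances)), key=lambda j: distances[j])[:k]


def select_pairs(positions: Sequence[Sequence[float]],
                 timestamps: Sequence[float],
                 closest_geo: int,
                 closest_time: int) -> List[Tuple[int, int]]:
    """Pair each image with its nearest neighbours in space and in time."""
    pairs: List[Tuple[int, int]] = []
    for i in range(len(positions)):
        pos_dist = [math.dist(positions[i], p) for p in positions]
        time_dist = [abs(timestamps[i] - t) for t in timestamps]
        # Ask for k + 1 neighbours because image i is at distance 0 from itself.
        candidates = set(nearest_indices(pos_dist, closest_geo + 1))
        candidates |= set(nearest_indices(time_dist, closest_time + 1))
        candidates.discard(i)
        pairs.extend((i, j) for j in sorted(candidates))
    return pairs
```

As in the script above, the result is directional: (i, j) and (j, i) can both appear, and an image may end up with fewer than `closest_geo + closest_time` partners when the two neighbour sets overlap.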

paidiakileswar commented 4 months ago

pairs_from_poses.main(model_path, sfm_pairs_path, 100)

@linhanwang

Thanks for the help.

I tried pairs_from_gps and pairs_from_exhaustive with the SuperPoint (superpoint_max) + SuperGlue models, using the same low-memory configuration that you use:

    sfm = PixSfM(conf={"dense_features": {"use_cache": True},
                       'KA': {'dense_features': {'use_cache': True}, 'max_kps_per_problem': 1000, "strategy": "topological_reference"},
                       'BA': {'strategy': 'costmaps'}})

I am testing different datasets of 1200 to 1300 images each (video sampled at 4 fps).

But the results are not what I expected: multiple sparse models, missing regions, missing rooms, and so on.

Could you tell me how I can overcome this? Any suggestions or methods would be very helpful for my research.
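
Since these datasets are video frames, one option that might be worth trying is pairing each frame with the next few frames in the sequence, so that consecutive frames are always linked. This is only a sketch and not a confirmed fix for the split models: `sequential_pairs` and `write_pairs` are hypothetical helpers (not hloc or pixsfm functions), the frame list is assumed to be sorted in capture order, and the window size is a guess to tune.

```python
from typing import List, Tuple


def sequential_pairs(image_names: List[str], window: int) -> List[Tuple[str, str]]:
    """Pair every frame with the next `window` frames in the sorted sequence."""
    pairs: List[Tuple[str, str]] = []
    for i, name in enumerate(image_names):
        for j in range(i + 1, min(i + 1 + window, len(image_names))):
            pairs.append((name, image_names[j]))
    return pairs


def write_pairs(pairs: List[Tuple[str, str]], path) -> None:
    """Write one 'name_a name_b' entry per line, the layout used for pairs-sfm.txt above."""
    with open(path, 'w') as f:
        f.write('\n'.join(f'{a} {b}' for a, b in pairs))
```

Sequential pairs alone do not connect places that the camera revisits later, so they would probably need to be combined with position-based or retrieval-based pairs to close loops between rooms.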

paidiakileswar commented 4 months ago

@linhanwang, can you please help? I'm stuck here.

linhanwang commented 4 months ago

I am sorry, but I don't know how to solve your problem. Actually, I raised this issue because I ran into the same problem.

paidiakileswar commented 4 months ago

Okay @linhanwang. Since you closed the issue, I thought you had found a solution to this.

cvg / pixel-perfect-sfm

How to avoid resulting in multiple models? #125