cvg / Hierarchical-Localization

Visual localization made easy with hloc
Apache License 2.0

Some questions about the retrieval results you provided #63

Closed: xxlxsyhl closed this issue 3 years ago

xxlxsyhl commented 3 years ago

I was unable to reproduce the retrieval results in https://github.com/cvg/Hierarchical-Localization/blob/master/pairs/aachen/pairs-query-netvlad50.txt using the model provided at https://github.com/uzh-rpg/netvlad_tf_open. Did you use a NetVLAD model that you trained yourself?

sarlinpe commented 3 years ago

I used exactly this repository, with the model trained on pitts30k. I ran the following script, after a few additional changes to the netvlad_tf lib so that it runs through the TensorFlow v1 compatibility API (tensorflow.compat.v1).

import tensorflow.compat.v1 as tf
tf.disable_v2_behavior()

import cv2
from pathlib import Path
import h5py
from tqdm import tqdm
import numpy as np

import netvlad_tf.nets as nets

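def inference(root, output_file, resize_max=1024):
    """Compute a NetVLAD global descriptor for every image under root and store it in an HDF5 file."""
    # Collect images recursively; a set avoids duplicates when the filesystem is case-insensitive.
    globs = ['*.jpg', '*.png', '*.jpeg', '*.JPG', '*.PNG']
    paths = set()
    for glob in globs:
        paths.update(Path(root).glob('**/'+glob))
    paths = sorted(paths)
    print(f'Found {len(paths)} images')
    assert len(paths) > 0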

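    tf.reset_default_graph()

    # Build the graph: a batch of RGB images of arbitrary size goes in, NetVLAD descriptors come out.
    image_batch = tf.placeholder(
        dtype=tf.float32, shape=[None, None, None, 3])
    net_out = nets.vgg16NetvladPca(image_batch)
    saver = tf.train.Saver()
    # Cap the GPU memory that TensorFlow reserves for this process at 50%.
    gpu_options = tf.GPUOptions(per_process_gpu_memory_fraction=0.5)
    sess = tf.Session(config=tf.ConfigProto(gpu_options=gpu_options))
    # Restore the weights of the model trained on pitts30k.
    ckpt = nets.defaultCheckpoint('vd16_pitts30k_conv5_3_vlad_preL2_intra_white')
    saver.restore(sess, ckpt)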

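    # Append mode: descriptors are added to the file, one group per image.
    with h5py.File(str(output_file), 'a') as hfile:
        for path in tqdm(paths):
            img = cv2.imread(str(path))
            if img is None:
                # cv2.imread returns None instead of raising when a file cannot be decoded.
                print(f'Could not read {path}, skipping')
                continue

            # Downscale so that the longer side is at most resize_max pixels.
            h, w = img.shape[:2]
            if max(h, w) > resize_max:
                scale = resize_max / max(h, w)
                h_new, w_new = int(round(h*scale)), int(round(w*scale))
                img = cv2.resize(img, (w_new, h_new), interpolation=cv2.INTER_LINEAR)

            # OpenCV loads images as BGR; convert to RGB before running the network.
            img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
            desc = sess.run(net_out, feed_dict={image_batch: img[None]})[0]
            desc = desc.astype(np.float32)

            # Key each descriptor by the image path relative to the dataset root.
            name = path.relative_to(root)
            grp = hfile.create_group(str(name))
            grp.create_dataset('global_descriptor', data=desc)

    sess.close()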

if __name__ == '__main__':
    root = '/path/to/aachen/images/images_upright'
    output_file = 'aachen-v1.1_tf-netvlad.h5'
    inference(root, output_file)

I plan to port this NetVLAD model to PyTorch and integrate it into hloc, but this will not happen soon. Any contribution is welcome :)

yopi1838 commented 3 years ago

Hi. I've been following this issue. Just to confirm my understanding: once we have obtained the h5 file from this NetVLAD model, we just have to generate the image pairs with the pairs_from_retrieval.py script, right?

sarlinpe commented 3 years ago

That is correct.
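
For readers who want to see what that retrieval step amounts to, here is a minimal sketch, not the actual pairs_from_retrieval.py code. It assumes the global descriptors have already been loaded into NumPy arrays (for example from the 'global_descriptor' datasets written by the script above) and that a pairs file holds one "query_name db_name" pair per line. The function names pairs_from_descriptors and write_pairs are hypothetical and only used for illustration.

```python
import numpy as np


def pairs_from_descriptors(query_names, query_desc, db_names, db_desc, num_matched):
    """Return (query, db) name pairs for the num_matched most similar db images per query.

    Hypothetical helper: a sketch of descriptor-based retrieval, not the hloc implementation.
    """
    q = np.asarray(query_desc, dtype=np.float32)
    d = np.asarray(db_desc, dtype=np.float32)
    # L2-normalize so that the dot product equals the cosine similarity.
    q = q / np.linalg.norm(q, axis=1, keepdims=True)
    d = d / np.linalg.norm(d, axis=1, keepdims=True)
    sim = q @ d.T  # shape: (num_queries, num_db)
    k = min(num_matched, len(db_names))
    # Sort each row by descending similarity and keep the first k indices.
    topk = np.argsort(-sim, axis=1)[:, :k]
    return [(query_names[i], db_names[j])
            for i, row in enumerate(topk) for j in row]


def write_pairs(pairs, path):
    """Write one 'query_name db_name' pair per line (assumed pairs-file layout)."""
    with open(path, 'w') as f:
        f.write('\n'.join(f'{q} {d}' for q, d in pairs))


# Toy example with 2-D descriptors instead of real NetVLAD outputs.
example_pairs = pairs_from_descriptors(
    ['query/q1.jpg'], [[1.0, 0.0]],
    ['db/a.jpg', 'db/b.jpg', 'db/c.jpg'], [[0.9, 0.1], [0.0, 1.0], [0.5, 0.5]],
    num_matched=2)
```

In the toy example the query is closest to db/a.jpg and then db/c.jpg, so those two pairs are returned in that order. With real data, num_matched=50 would correspond to a file such as pairs-query-netvlad50.txt.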

xxlxsyhl commented 3 years ago

The problem is solved. Thank you!