mihaidusmanu / d2-net

D2-Net: A Trainable CNN for Joint Description and Detection of Local Features

Could you advise on the settings for localization under challenging conditions? #62

Closed GabbySuwichaya closed 4 years ago

GabbySuwichaya commented 4 years ago

Hi @mihaidusmanu, I am sorry that I have to ask for your help again. As you might remember, I tried to reproduce the results and asked many questions in #61.

Here are the results I obtained for the tuned multiscale D2-Net on Aachen Day-Night, attached in the following image.

Screenshot from 2020-05-14 16-31-57

However, my localization results are pretty low, so I am not sure where I went wrong, and I would like to ask for your help with the settings.

Your advice would help me understand how to use D2-Net appropriately for localization.

My settings for tuning D2-Net: I use the default settings from the example training file, except for the initial model.

  1. Initial model: models/d2_tf.pth
  2. Number of epochs = 10
  3. lr = 1e-3
  4. MegaDepth dataset
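
For reference, these settings roughly correspond to the following invocation of the repository's train.py - a sketch only: the flag names are taken from that script, the MegaDepth paths are placeholders, and the values above appear to be the script's defaults, so please double-check against `python train.py --help`.

```shell
# Hedged sketch of the fine-tuning run described above; paths are placeholders.
python train.py \
    --dataset_path /path/to/MegaDepth \
    --scene_info_path /path/to/MegaDepth/scene_info \
    --model_file models/d2_tf.pth \
    --num_epochs 10 \
    --lr 1e-3
```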

My settings on feature extraction:

  1. Multiscale
  2. Use the tuned d2_tf.pth (tuned up to the 10th epoch)
  3. max_edge = 1600
  4. max_sum_edges = 2800
  5. no-relu = False (in other words, I keep the ReLU activation layer at the end)
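
A sketch of the corresponding extraction command, assuming the flag names of the repository's extract_features.py (the image list file name is a placeholder; with no-relu = False, the `--no-relu` flag is simply not passed):

```shell
# Hedged sketch: multiscale extraction with the settings above.
python extract_features.py \
    --image_list_file image_list_aachen.txt \
    --model_file models/d2_tf.pth \
    --multiscale \
    --max_edge 1600 \
    --max_sum_edges 2800
```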

Pre-settings on localization:

  1. Use database.db downloaded from visuallocalizationbenchmark/dataset-preparation and copy it to db.db.

  2. Generate empty reconstruction. Here, I have used the function generate_empty_reconstruction() from reconstruction_pipeline.py where the camera parameters are obtained from model-3D/database_intrinsics.txt and model-3D/aachen_cvpr2018_db.nvm (by calling preprocess_reference_model()). Meanwhile, the image file name, image_id and camera_id mapping are obtained from db.db (by calling recover_database_images_and_ids()).
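
The mapping step above can be sketched in a few lines of Python - a simplified stand-in for recover_database_images_and_ids(), assuming COLMAP's standard `images` table schema (name, image_id, camera_id):

```python
# Simplified sketch of recover_database_images_and_ids(): read the
# image-name -> image_id and image-name -> camera_id mappings from a
# COLMAP database (assumes the standard COLMAP `images` table schema).
import sqlite3

def recover_database_images_and_ids(database_path):
    connection = sqlite3.connect(database_path)
    cursor = connection.cursor()
    images = {}
    cameras = {}
    cursor.execute('SELECT name, image_id, camera_id FROM images;')
    for name, image_id, camera_id in cursor:
        images[name] = image_id
        cameras[name] = camera_id
    cursor.close()
    connection.close()
    return images, cameras
```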

Then, perform the localization.

The following steps are similar to those in visuallocalizationbenchmark/localization.

The difference is that I tried to adapt the settings to the recommended settings for D2-Net. Nevertheless, I may have gotten something wrong, so it would be great if you could point out the right way.

  1. Extract custom features for both query and database images with the above settings.

  2. Import all features and match database images. Execute python3 modify_database_with_custom_features_and_matches.py using database_pairs_to_match.txt from https://github.com/mihaidusmanu/d2-net/issues/61#issuecomment-623332528 as match_list. Here, I also use ratio_matcher() with ratio=0.95 in place of mutual_nn_matcher().

  3. Build the database 3D model. Here, I use the aforementioned empty reconstruction and db.db to build the model.

  4. Perform feature matching between the query images and the retrieved database images. Here, I use query_to_database_pairs_to_match_20.txt that you provided in https://github.com/mihaidusmanu/d2-net/issues/61#issuecomment-623332528 . Again, I switch from mutual_nn_matcher() to ratio_matcher() with ratio=0.95.

  5. Estimate the camera poses of the query images using colmap image_registrator

colmap image_registrator --database_path data_directory/your_feature_name.db --input_path data_directory/model_your_feature_name/ --output_path data_directory/model_your_feature_name_with_queries/

  6. Generate the results to a txt file. I use only the function recover_query_poses() from reconstruction_pipeline.py. Here, I modified it such that the variable raw_queries (line 79 of reconstruction_pipeline.py) is the combined list of the day and night query images --- day_time_queries_with_intrinsics.txt and night_time_queries_with_intrinsics.txt downloaded from Aachen-day-night/queries.
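
The raw_queries modification in the last step could be sketched like this (a hypothetical helper, not the actual reconstruction_pipeline.py code; each line of the query files holds an image name followed by the camera model and intrinsics):

```python
# Hypothetical helper: build the combined day + night query list by
# concatenating the two *_queries_with_intrinsics.txt files.
def combine_query_lists(paths):
    queries = []
    for path in paths:
        with open(path) as f:
            for line in f:
                tokens = line.strip().split()
                if tokens:  # skip empty lines
                    queries.append(tokens)
    return queries
```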
mihaidusmanu commented 4 years ago

For the Visual Localization results, we do not use the ratio test - you should use a simple mutual_nn_matcher instead. Secondly, you should start by evaluating the released model (https://dsmn.ml/files/d2-net/d2_tf.pth) to double-check your pipeline - this is the one we have used for our submission so the results you get should be comparable to the ones on the public leader-board.
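
For illustration only, the mutual nearest-neighbour criterion behind mutual_nn_matcher can be sketched in plain Python (the benchmark's actual implementation is a vectorized torch version; this toy sketch just shows the selection rule on raw descriptor lists):

```python
# Toy sketch of mutual nearest-neighbour matching: keep a pair (i, j)
# only if descriptor j is the nearest neighbour of i in B *and*
# descriptor i is the nearest neighbour of j in A.
def mutual_nn_matches(desc_a, desc_b):
    def sq_dist(u, v):
        return sum((x - y) ** 2 for x, y in zip(u, v))

    # Nearest neighbour in B for each descriptor in A, and vice versa.
    nn_ab = [min(range(len(desc_b)), key=lambda j: sq_dist(d, desc_b[j]))
             for d in desc_a]
    nn_ba = [min(range(len(desc_a)), key=lambda i: sq_dist(d, desc_a[i]))
             for d in desc_b]

    # Keep only the pairs that are nearest neighbours of each other.
    return [(i, j) for i, j in enumerate(nn_ab) if nn_ba[j] == i]
```

Unlike the ratio test, this criterion is symmetric and has no threshold to tune.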

GabbySuwichaya commented 4 years ago

Hi @mihaidusmanu. Thank you very much for the advice and clarification.

I am sorry for taking so long. With your suggestion, the performance has improved a lot. Here is the result, which is about 3-5% lower than the results shown on the benchmark for the night-time queries.
Screenshot from 2020-05-18 00-05-26

I compared it with the off-the-shelf single-scale model.

Screenshot from 2020-05-18 00-39-09

However, the provided benchmark result does not show the performance on the day-time queries...


mihaidusmanu commented 4 years ago

The results that you should compare to are the following: https://www.visuallocalization.net/details/964/. Do not compare to the night results only, since those use a different evaluation protocol (the one for the CVPR Local Features Challenge, where night-time-to-database pairs are retrieved manually).

I suspect there is an issue when you are triangulating the 3D model, but I can't know for sure without looking at the code. Moreover, I am unsure if the database at https://github.com/tsattler/visuallocalizationbenchmark/tree/master/local_feature_evaluation#dataset-preparation contains the query images with correct intrinsics - that database was designed for the local features challenge.

To a first approximation, the default image_registrator parameters are good enough (you should be within 1-2% of the day results). For the final submission, we slightly changed them to allow poses with fewer inliers - I'll try to look for the exact parameters later today.

GabbySuwichaya commented 4 years ago

Thank you very much for your reply and for following up on the parameter settings.

I would like to clarify my procedure as follows...

I suspect there is an issue when you are triangulating the 3D model, but I can't know for sure without looking at the code.

Do you mean Step 3 in the visuallocalizationbenchmark pipeline?

Specifically, after I have extracted features, I called modify_database_with_custom_features_and_matches.py with the following command:

python modify_database_with_custom_features_and_matches.py --dataset_path data_directory/ --colmap_path /local/colmap/build/src/exe --method_name d2net --database_name db.db --image_path images/ --match_list database_pairs_to_match.txt

In modify_database_with_custom_features_and_matches.py, the geometric verification is performed on L145:

subprocess.call([os.path.join(args.colmap_path, 'colmap'), 'matches_importer', '--database_path', paths.database_path, '--match_list_path', paths.match_list_path, '--match_type', 'pairs'])

So, after calling modify_database_with_custom_features_and_matches.py, I directly execute colmap point_triangulator to build the database 3D model:

colmap point_triangulator --database_path $dataset_path/$method_name.db --image_path $image_dir --input_path $dataset_path/sparse-d2_net-empty/ --output_path $dataset_path/model_$method_name/ --clear_points 1

mihaidusmanu commented 4 years ago

Regarding the database, I am not sure if the day-time query images have the right intrinsics. The database images should be ok.

GabbySuwichaya commented 4 years ago

I see. I will check it.

GabbySuwichaya commented 4 years ago

Hi @mihaidusmanu, I have checked database.db. It turns out that the intrinsic parameters of the database images in database.db are actually different from database_intrinsics.txt. Meanwhile, the query images share the same intrinsic parameters between database.db and the txt files in /queries.

To confirm my investigation, I have written a script to check this: https://github.com/GabbySuwichaya/Modified-localization-with-custom-features/blob/master/CheckAachenDatabase.py To use this file, you can type:

python3 CheckAachenDatabase.py --dataset_path $dataset_path --colmap_path $colmap_path --method_name $method_name

where $dataset_path, $colmap_path, and $method_name are the same paths as the inputs to reconstruction_pipeline.py.
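
The core of such a check can be sketched with the standard library alone, assuming COLMAP's `cameras` table schema, where params is a blob of packed little-endian float64 values (read_camera_params is a hypothetical helper, not the actual script):

```python
# Hypothetical sketch: read each camera's intrinsic parameters from a
# COLMAP database. COLMAP stores the params column as a blob of packed
# little-endian float64 values, so struct.unpack recovers them.
import sqlite3
import struct

def read_camera_params(database_path):
    connection = sqlite3.connect(database_path)
    cursor = connection.cursor()
    params = {}
    cursor.execute('SELECT camera_id, params FROM cameras;')
    for camera_id, blob in cursor:
        params[camera_id] = struct.unpack('<' + 'd' * (len(blob) // 8), blob)
    cursor.close()
    connection.close()
    return params
```

The values returned this way can then be compared field by field against the reference intrinsics in database_intrinsics.txt.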

And here are the results from my testing.
Screenshot from 2020-05-19 18-28-25

mihaidusmanu commented 4 years ago

Sorry for the delay in answering; I have been quite busy lately. I finally had some time to look through your code, and I think I found a few things that negatively impact performance.

In the Aachen pipeline, all intrinsics are fixed to the reference values for both database and query images. See the parameters on L240-242 https://github.com/tsattler/visuallocalizationbenchmark/blob/f6a3fc3a3190bc2f15c0768004e589e25b9f9b0c/local_feature_evaluation/reconstruction_pipeline.py#L240 and L254-256 https://github.com/tsattler/visuallocalizationbenchmark/blob/f6a3fc3a3190bc2f15c0768004e589e25b9f9b0c/local_feature_evaluation/reconstruction_pipeline.py#L254. However, you do not use these parameters in your script https://github.com/GabbySuwichaya/Modified-localization-with-custom-features/blob/master/process_CreateDB_Aachen_D2Net_python_preparing.sh.
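
Roughly, those parameters keep the intrinsics fixed by disabling their refinement in bundle adjustment - a sketch only, so please verify the option names against the linked lines:

```shell
# Sketch: options that keep the reference intrinsics fixed during
# triangulation (the same options are also passed to image_registrator).
colmap point_triangulator \
    --Mapper.ba_refine_focal_length 0 \
    --Mapper.ba_refine_principal_point 0 \
    --Mapper.ba_refine_extra_params 0 \
    ...
```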

Regarding the database intrinsics problem, it shouldn't cause too many issues, but I plan to fix it and release a new database once I have some spare time.

mihaidusmanu commented 4 years ago

Closed! Feel free to reopen / create a new issue if you run into any issues!