Hi Lenc,
It's really nice to hear from you. The new VL Benchmarks sound great. I'm also thinking of having some pretrained models. The problem is how to define the training set.
I'm trying to catch up with what you have in mind. Do you plan to have a unified framework for detector and descriptor training, using your HPatches dataset for evaluation? If so, I think it would be great if we could define the training data, or at least we should state that a pretrained model must not use the test data for training. If you can tell me which dataset you want to use as the test set, I will make sure not to include it in my training data. I'm also thinking that we could define an interface for calling a custom detector or descriptor during evaluation.
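For concreteness, here is a minimal sketch of what I imagine such an interface could look like (the names below are hypothetical, not an actual VLBenchmarks API):

```python
# Hypothetical plug-in interface for custom detectors/descriptors;
# the real VLBenchmarks API may look quite different.
from abc import ABC, abstractmethod

import numpy as np


class LocalFeatureExtractor(ABC):
    @abstractmethod
    def detect(self, image: np.ndarray) -> np.ndarray:
        """Return an (N, 5) array of keypoints: x, y, scale, angle, score."""

    @abstractmethod
    def describe(self, image: np.ndarray, keypoints: np.ndarray) -> np.ndarray:
        """Return an (N, D) array of descriptors for the given keypoints."""
```

The benchmark would then only interact with these two methods, so any detector or descriptor, learned or handcrafted, could be dropped in.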
Another idea of mine is to compute not only repeatability and matching score, but also to include some realistic vision applications in the evaluation, such as image retrieval and baseline matching. I believe you have also noticed that a high repeatability and matching score don't necessarily mean good performance in realistic applications. Having a diverse set of applications for evaluation means we can clearly see which detector or descriptor is suited to which application.
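To be concrete about the metrics I mean, here is a simplified sketch of repeatability (a point-distance version; the full Mikolajczyk protocol uses a region-overlap test instead):

```python
import numpy as np


def repeatability(kpts_a, kpts_b_in_a, thresh=3.0):
    """Fraction of detections with a counterpart within `thresh` pixels.

    kpts_a:      (N, 2) keypoint locations detected in image A.
    kpts_b_in_a: (M, 2) keypoints detected in image B, projected into A
                 via the ground-truth homography.
    """
    if len(kpts_a) == 0 or len(kpts_b_in_a) == 0:
        return 0.0
    # All pairwise distances between the two keypoint sets.
    d = np.linalg.norm(kpts_a[:, None, :] - kpts_b_in_a[None, :, :], axis=2)
    # Simplification: nearest-neighbour hits without a one-to-one
    # assignment, capped so the ratio cannot exceed 1.
    n_corr = min(int((d.min(axis=1) < thresh).sum()),
                 int((d.min(axis=0) < thresh).sum()))
    return n_corr / min(len(kpts_a), len(kpts_b_in_a))
```

A detector can score highly here while its keypoints are still useless for, say, retrieval, which is exactly why application-level tests matter.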
The new VLBenchmarks sounds really interesting to me. Please let me know if I can contribute to this benchmark.
Best, Xu
On Wed, Sep 6, 2017 at 9:54 AM Karel Lenc notifications@github.com wrote:
Hi, do you plan to release the trained models?
I'm preparing a revised version of the VL Benchmarks and it would be great to include your algorithm as well. However, it seems that training the model from scratch would be quite complicated (e.g. the need to extract the training points etc.). So having access to some pre-trained models would be great :)
Hi Xu, yeah, you're right that it's tricky to define a training set... Currently I'm just trying to finish something that has been on my to-do list for ages, which is to update the VLBenchmarks. It has basically been completely rewritten, with a much nicer interface, I hope. The main advantage should be that it can be used without MATLAB, as the features are stored in a simple CSV format and it can be called from a command-line interface (similarly to the HPatches-benchmark for MATLAB).
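For illustration, dumping features into such a CSV could look roughly like this (a hypothetical column layout; the final format may differ):

```python
# Hypothetical CSV dump of local features for a MATLAB-free pipeline;
# the actual column layout used by the new VLBenchmarks may differ.
import csv


def save_features_csv(path, keypoints, descriptors):
    """keypoints: (N, 4) rows of x, y, scale, angle; descriptors: (N, D)."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["x", "y", "scale", "angle"]
                        + ["d%d" % i for i in range(len(descriptors[0]))])
        for kp, desc in zip(keypoints, descriptors):
            writer.writerow(list(kp) + list(desc))
```

The point is that any language can produce such a file, and the benchmark CLI only has to parse it.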
Currently I have Krystian Mikolajczyk's old benchmarks implemented for both detectors and descriptors. And yeah, I was thinking it would be great to have some tests for real-world applications. For retrieval there is already quite some work done (Oxford Buildings, Paris Buildings, Holidays...), but I thought it would be great to have something for monocular tracking and for 3D reconstruction... So if you would like to help, I'm definitely up for that :) Even if it were just feedback...
Regarding the training and test set: for the HPatches dataset we do not have a strictly defined training and test set, only several predefined splits to support cross-validation. So there is not much help in that :/ But maybe, for a start, a model trained on one dataset? Just to see if it works... :P In general I don't have any plans to use it anywhere right now, I just wanted to finish it up :)
Best, Karel
Hi Lenc,
I'm working on training a detector and a descriptor with images from the MSCOCO dataset. I will share the trained model and code once I finish.
As for applications, local feature tracking and 3D reconstruction are two great ones. The Oxford, Paris, or Holidays datasets may help. However, the problem is that none of these datasets have local feature correspondence ground truth. Even if we retrieve the correct image, we don't know whether that is due to true local feature matching or just some random matches that coincidentally have a high matching score. I'm working on a DARPA project called MediFor which may be of interest to you. The project information and challenge websites are https://www.darpa.mil/program/media-forensics and https://www.nist.gov/itl/iad/mig/media-forensics-challenge
Basically, the problem is: given a manipulated image, which may contain parts from several different images plus several post-processing operations, can we retrieve the correct donor images (the images that contribute to the manipulated image) from a large dataset (over 1 million images)? They have a nice dataset which records, step by step, how the manipulated image is generated from the original images, with all the operations and transform matrices. As a result, we know exactly the local correspondences between two images. I think this would be very helpful for VLBenchmarks, and I have already developed some tools for evaluating local detectors and descriptors on that dataset. If you are interested, we can certainly plan to do something with it.
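As a sketch of how the recorded transforms yield ground-truth correspondences (assuming the stored transforms are 3x3 projective matrices; the dataset's actual conventions may differ):

```python
import numpy as np


def project_points(pts, H):
    """Apply a 3x3 projective transform H to (N, 2) points."""
    pts_h = np.hstack([pts, np.ones((len(pts), 1))])
    proj = pts_h @ H.T
    return proj[:, :2] / proj[:, 2:3]


def ground_truth_matches(kpts_donor, kpts_manip, H, thresh=3.0):
    """Index pairs (i, j) whose reprojection distance is below `thresh`."""
    proj = project_points(kpts_donor, H)
    d = np.linalg.norm(proj[:, None, :] - kpts_manip[None, :, :], axis=2)
    i, j = np.where(d < thresh)
    return list(zip(i.tolist(), j.tolist()))
```

With these pairs as ground truth, we can tell true local feature matches from coincidental high-scoring ones.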
I think we can collaborate to improve the benchmark. Maybe we could schedule a teleconference to talk about it in detail.
Best, Xu
Would you have some of the models from the article available? I'm asking based on the sentence "Both TILDE and our method share the same training images from the Mexico subset of the Webcam dataset." [p. 6824], which defines the training split. And if I understand correctly, this model was used for the rest of the experiments in the article? Another thing I want to do in the future is to test several detectors on the HPatches-sequences, which is basically a large-scale VGG-Affine, so pre-trained models would be really useful for that as well.
The dataset you refer to definitely looks really interesting. Do you know if the data are going to be available without registration?
And yeah, I agree that discussing it in person would probably be most effective... Best, Karel
The training patches are already in the code. I have also uploaded the model trained on the TILDE dataset. I will include a model trained on a more diverse dataset in the future.
Thank you!
Regarding 3D reconstruction, there is the recent local-feature-evaluation benchmark: https://www.cvg.ethz.ch/research/local-feature-evaluation/