mchancan / flynet

Official PyTorch implementation of paper "A Hybrid Compact Neural Architecture for Visual Place Recognition" by M. Chancán (RA-L & ICRA 2020) https://doi.org/10.1109/LRA.2020.2967324
https://mchancan.github.io/projects/FlyNet
MIT License

About the CANN code #4

Open uditsharma29 opened 3 years ago

uditsharma29 commented 3 years ago

Hello,

I really liked the architecture you built and was trying to reproduce the experiments. Can you please clarify what exactly the code for the CANN is doing? Is it for training or inference?

Thanks

mchancan commented 3 years ago

Hello - thanks for your comment, I am glad to hear that.

Sure, the CANN basically reduces the false positive rate by preserving only the best matches along the diagonal of the difference matrix obtained from FlyNet, and filtering out the rest. And yes, it works during the inference stage only, because its parameters are preassigned rather than learned.
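If it helps make that concrete, here is a minimal NumPy sketch of the filtering idea (my own illustration, not the repo's code): for each query frame, only the reference candidates near the previous best match are kept, so isolated off-diagonal minima (false positives) get suppressed:

```python
import numpy as np

def diagonal_filter(diff_matrix, band=2):
    """Illustrative sketch: for each query frame, keep only reference
    candidates within `band` of the previous best match; everything else
    is treated as a false positive and suppressed.

    diff_matrix: (num_queries, num_refs) difference matrix (lower = better).
    Returns the per-query match indices after filtering.
    """
    n_q, n_r = diff_matrix.shape
    matches = np.empty(n_q, dtype=int)
    prev = int(np.argmin(diff_matrix[0]))       # best match for the first frame
    matches[0] = prev
    for t in range(1, n_q):
        row = diff_matrix[t].copy()
        outside = np.ones(n_r, dtype=bool)
        lo, hi = max(0, prev - band), min(n_r, prev + band + 1)
        outside[lo:hi] = False                  # allowed window near last match
        row[outside] = np.inf                   # suppress off-diagonal candidates
        prev = int(np.argmin(row))
        matches[t] = prev
    return matches
```

For example, a strong spurious minimum far off the diagonal is ignored because it falls outside the window around the previous match.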

uditsharma29 commented 3 years ago

Hello,

Thank you so much for adding the code for the CANN. It brought a lot of clarity about how it works. I had a small question when going deeper into the code: can you please quickly clarify what the variables Iapp_nn, Iapp, r, and rinit are? In the absence of comments, it is a bit difficult to understand what those variables are doing.

Sorry for bothering you.

Thanks in advance, Udit Sharma

mchancan commented 3 years ago

Hi Udit,

Those are good questions. As far as I remember, Iapp_nn is the FlyNet output: the full difference matrix between the query and reference image sequences. Iapp is a single row extracted from Iapp_nn, which is then propagated through the CANN over time. r is an encoded signal representing movement through the environment, and rinit is its initial state.
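For anyone else reading along, here is a toy sketch of how those variables relate. A plain leaky integrator stands in for the full CANN dynamics (which also include recurrent excitation/inhibition); the variable names match the repo, everything else is illustrative:

```python
import numpy as np

num_refs, num_queries = 100, 5
tau, dt = 10.0, 1.0                      # illustrative time constant / step

Iapp_nn = np.ones((num_queries, num_refs))  # stand-in for the FlyNet output
                                            # (full query-vs-reference difference matrix)
rinit = np.zeros(num_refs)               # initial activity state of the network
r = rinit.copy()                         # r: activity evolving over time

for t in range(num_queries):
    Iapp = Iapp_nn[t]                    # one row = input for the current frame
    r = r + (dt / tau) * (-r + Iapp)     # leaky integration toward the input
```

So Iapp_nn holds all the frames at once, while Iapp is the per-frame input that drives the state r forward from rinit.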

For further details, you can check out the supplemental material of the paper that inspired our CANN implementation:

Supplementary material:

There are several examples there that can help you better understand the dynamics of a CANN.

Regards

Rick0514 commented 3 years ago

Hello. Your result is really impressive. I've read your paper and code and have one or two questions about the CANN. I wonder if the CANN has the following function: it keeps a memory of the indices of previous frames and gives a reasonable index for the current frame that avoids deviating too much from the previous indices. In other words, is it because the indices of most frames lie along the diagonal that the CANN can filter out matches that deviate too much from it? If so, how does this method detect a loop closure if one happens?

Sorry to bother you. Thanks, Rick.

mchancan commented 3 years ago

Hi @Rick0514 - thanks for your interest in this work!

That's a great question about the behavior of the model. Indeed, one of the advantages of the CANN is that it keeps track of the current frame based on motion information. In the experiments we used datasets synchronized in space and time, which results in similarity matrices whose best matches lie along the diagonal. But the CANN also works on asynchronous datasets, e.g., reference and query image sequences recorded at different vehicle velocities or variable frame rates, much as demonstrated in the RatSLAM work. Re. loop closures, we didn't implement that feature for this paper, but it should be fairly easy to add given the visual module of the whole system.
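A quick self-contained sketch (toy data I made up, not the paper's code) of why such temporal filtering still works when the sequences are asynchronous: the allowed window follows the previous match rather than a fixed unit-slope diagonal, so a match path with varying velocity is still tracked:

```python
import numpy as np

n_q, n_r = 6, 20
path = [0, 2, 4, 7, 9, 12]              # reference index advances at varying speed
D = np.ones((n_q, n_r))
for t, j in enumerate(path):
    D[t, j] = 0.0                       # true matches lie off the unit diagonal

prev = int(np.argmin(D[0]))
matches = [prev]
for t in range(1, n_q):
    row = D[t].copy()
    outside = np.ones(n_r, dtype=bool)
    lo, hi = max(0, prev - 3), min(n_r, prev + 4)   # window follows prev match
    outside[lo:hi] = False
    row[outside] = np.inf               # suppress candidates outside the window
    prev = int(np.argmin(row))
    matches.append(prev)
```

Here the recovered `matches` follow `path` even though the car "speeds up" between frames, which is the asynchronous case described above.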

Note that we have recently made available new research that might be of interest. It highlights some key limitations of this work and of CANNs in general, and proposes a fully trainable neural network for this task.

Regards

lanjinraol commented 2 years ago

Which parameters should I modify if I want to apply the CANN to a case where the number of samples is 100 and the number of categories is 10?