ubc-vision / COTR

Code release for "COTR: Correspondence Transformer for Matching Across Images" (ICCV 2021)
Apache License 2.0
460 stars 58 forks

How are the queries generated? #44

Open balabalaboy opened 1 year ago

balabalaboy commented 1 year ago

Hello, thank you very much for such a good job. I am a beginner, and after reading your paper I have a general understanding of the principles of this project, but there are some details I am a little confused about. Are the query pairs on the training data randomly generated during training, or labelled before training? How is the number of queries decided? And if I want to fine-tune on my own training set, how should I do it?

jiangwei221 commented 1 year ago
  1. Query pairs are generated by projecting the dense depth from one camera to another, and then randomly subsampled (see the sketch after this list for the projection step).
  2. The number of queries can be any number that fits in your memory.
  3. The simplest way would be to create a dataloader that returns the same data format as the default dataloader, load the pretrained weights, and then start fine-tuning with a lower learning rate.
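For a concrete picture of point 1, here is a minimal sketch (not COTR's actual code; see the repository's dataset code for that) of how query/target pairs can be produced by unprojecting the pixels of image 1 with its dense depth, transforming them with the relative pose, reprojecting into image 2, and randomly subsampling. The function name and the `K1`, `K2`, `T_1to2` arguments are placeholders, and occlusion checks against the second depth map are omitted.

```python
import numpy as np

def project_queries(depth1, K1, K2, T_1to2, num_queries=100, seed=None):
    """Illustrative sketch only: build query/target pixel pairs by projecting
    the dense depth of view 1 into view 2, then randomly subsampling.

    depth1:  (H, W) dense depth map of image 1
    K1, K2:  (3, 3) intrinsics of image 1 and image 2
    T_1to2:  (4, 4) relative pose mapping camera-1 coordinates to camera-2
    Assumes image 2 has the same resolution; occlusion checks are omitted.
    """
    rng = np.random.default_rng(seed)
    h, w = depth1.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    valid = depth1 > 0
    u, v, d = u[valid], v[valid], depth1[valid]

    # Unproject image-1 pixels to 3D points in camera-1 coordinates.
    pix = np.stack([u, v, np.ones_like(u)], axis=0).astype(np.float64)
    pts_cam1 = np.linalg.inv(K1) @ pix * d

    # Transform to camera-2 coordinates and project with K2.
    pts_cam1_h = np.concatenate([pts_cam1, np.ones((1, pts_cam1.shape[1]))], axis=0)
    pts_cam2 = (T_1to2 @ pts_cam1_h)[:3]
    proj = K2 @ pts_cam2
    u2, v2 = proj[0] / proj[2], proj[1] / proj[2]

    # Keep points that land inside image 2 and in front of the camera.
    in_view = (pts_cam2[2] > 0) & (u2 >= 0) & (u2 < w) & (v2 >= 0) & (v2 < h)
    queries = np.stack([u[in_view], v[in_view]], axis=1)    # pixels in image 1
    targets = np.stack([u2[in_view], v2[in_view]], axis=1)  # matches in image 2

    # Randomly subsample a fixed number of correspondences.
    idx = rng.choice(len(queries), size=min(num_queries, len(queries)), replace=False)
    return queries[idx], targets[idx]
```

Any fixed number of pairs can be drawn this way, which is why the query count is only limited by memory (point 2).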
balabalaboy commented 1 year ago

Thanks for your reply, I'll try.

balabalaboy commented 1 year ago

Hello, I wanted to see how the learning rate decays during training, but I couldn't seem to find it. Does the learning rate stay constant during training? The early stopping and learning rate adjustment mechanisms I am used to are based on the validation loss over several consecutive epochs, but COTR does not seem to validate every epoch; validation is only performed after a certain number of epochs.
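For reference, the pattern described above (validating only every N epochs and driving learning-rate decay and early stopping from that validation loss) can be sketched roughly as below. This is not COTR's actual training loop; the dummy model and the `train_one_epoch`/`evaluate` helpers are placeholders, and the choice of `ReduceLROnPlateau` as the scheduler is an assumption for illustration.

```python
import torch
import torch.nn as nn

# Dummy stand-ins so the sketch is self-contained; replace them with the
# real COTR model, training step, and validation pass.
model = nn.Linear(8, 2)
x, y = torch.randn(64, 8), torch.randn(64, 2)

def train_one_epoch(model, optimizer):
    optimizer.zero_grad()
    loss = nn.functional.mse_loss(model(x), y)
    loss.backward()
    optimizer.step()

def evaluate(model):
    with torch.no_grad():
        return nn.functional.mse_loss(model(x), y).item()

optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
# Halve the LR when the validation loss has not improved for 2 validations.
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(
    optimizer, mode='min', factor=0.5, patience=2)

VAL_EVERY = 5             # validate every N epochs, not every epoch
EARLY_STOP_PATIENCE = 4   # stop after this many validations without improvement
best_val, bad_vals = float('inf'), 0

for epoch in range(100):
    train_one_epoch(model, optimizer)
    if (epoch + 1) % VAL_EVERY == 0:
        val_loss = evaluate(model)
        scheduler.step(val_loss)   # LR decay driven by validation loss
        if val_loss < best_val:
            best_val, bad_vals = val_loss, 0
        else:
            bad_vals += 1
            if bad_vals >= EARLY_STOP_PATIENCE:
                break              # early stopping
```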
