Open Bin-ze opened 3 weeks ago
Hi, sorry for the late reply. Q1: Yes, we use the pre-trained LoFTR outdoor model for matching. Q2: No, the multiview transformer model used in the refinement stage is trained separately on the MegaDepth dataset, in a fully supervised manner, by minimizing the L2 loss between the refined tracks and the GT tracks.
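For anyone reading along, the loss mentioned in Q2 can be sketched roughly like this. It is only an illustration and not the actual training code: the function name `track_l2_loss`, the list-of-(x, y) track layout, and the choice to average unsquared distances over all keypoints are assumptions made for this example.

```python
import math

def track_l2_loss(refined, gt):
    """Mean Euclidean distance between refined and ground-truth 2D keypoints.

    Assumed layout (hypothetical): each argument is a list of tracks, and
    each track is a list of (x, y) pixel positions, one per view.
    """
    total, count = 0.0, 0
    for ref_track, gt_track in zip(refined, gt):
        for (rx, ry), (gx, gy) in zip(ref_track, gt_track):
            # L2 distance between the refined and GT position in this view
            total += math.hypot(rx - gx, ry - gy)
            count += 1
    if count == 0:
        raise ValueError("no keypoints to compare")
    return total / count

# One track seen in two views; the second view is off by (3, 4) pixels.
refined = [[(10.0, 20.0), (13.0, 24.0)]]
gt = [[(10.0, 20.0), (10.0, 20.0)]]
print(track_l2_loss(refined, gt))  # 2.5
```

A real training setup would compute this on batched tensors and might use squared distances or mask out views where a keypoint is not visible; those details are not stated in the reply above.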
I want to understand how memory usage scales with the number of images being processed. In my scenario, there are usually more than 1,000 images. Can this method be adapted to that scale?
I read the paper, and it's great work!
I have some small questions: