zju3dv / DetectorFreeSfM

Code for "Detector-Free Structure from Motion", CVPR 2024
Apache License 2.0
525 stars, 23 forks

Is additional input required besides the image? #40

Open Bin-ze opened 3 weeks ago

Bin-ze commented 3 weeks ago

I read the paper. It's great work!

I have some small questions:

  1. In the coarse reconstruction stage, are the weights of the LoFTR matcher kept fixed?
  2. Is the custom transformer model used in the fine-grained reconstruction stage trained only with gradients from the subsequent bundle adjustment (BA)?

hxy-123 commented 5 days ago

Hi, sorry for the late reply.

Q1: Yes, we use the pre-trained LoFTR outdoor model for matching.

Q2: No, we train the multiview transformer model used in the refinement stage separately, on the MegaDepth dataset, in a fully supervised manner, by minimizing the L2 loss between the refined tracks and the ground-truth (GT) tracks.
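For readers following along, here is a minimal sketch of what a supervised L2 loss between refined and GT track positions could look like. It is not the repository's training code: the function name `track_l2_loss`, the representation of a track as a list of 2D pixel coordinates, and the mean-of-squared-distances reduction are all assumptions made for illustration.

```python
def track_l2_loss(refined, gt):
    """Mean squared Euclidean distance between refined and GT keypoints.

    Hypothetical helper: `refined` and `gt` are equal-length lists of
    (x, y) pixel coordinates, one entry per observation in a feature track.
    """
    if len(refined) != len(gt):
        raise ValueError("refined and gt must have the same length")
    if not refined:
        return 0.0
    total = 0.0
    for (rx, ry), (gx, gy) in zip(refined, gt):
        # Squared L2 distance for one observation of the track.
        total += (rx - gx) ** 2 + (ry - gy) ** 2
    # Assumed reduction: average over all observations.
    return total / len(refined)


# Example: one observation is exact, the other is off by (3, 4) pixels.
loss = track_l2_loss([(10.0, 20.0), (33.0, 44.0)], [(10.0, 20.0), (30.0, 40.0)])
```

In an actual training setup this quantity would be computed on tensors so that gradients flow back into the refinement network; the pure-Python version above only illustrates the arithmetic.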

Bin-ze commented 4 days ago

I'd like to understand how memory usage scales with the number of images to be processed. My scenes usually contain more than 1,000 images. Can this method be adapted to that scale?