wilsonkl / SfM_Init

code for solving global structure from motion problem in the ECCV'14 paper "Robust Global Translations with 1DSfM"

Queries regarding the dataset #13

Open ralphlauran99 opened 4 years ago

ralphlauran99 commented 4 years ago

I have a few questions about the Notredame dataset:

  1. The focal length values in coords.txt are just extracted from the images' EXIF data, right?
  2. Are the 2D points distortion-corrected?
  3. Do the tracks for the 3D points in tracks.txt contain outliers? Is the outlier ratio high enough that we'd need to filter each track? Is that why the number of points and the track lengths in the gt_bundle.out file don't match?
  4. The global rotation matrices in the output don't seem to match the ones in gt_bundle.out, and even though the Ceres optimization has converged, the final error remains pretty high for the Notredame dataset. Is this something we should typically expect?
  5. The final global rotation matrix directly maps 3D coordinates from the world frame to the local camera frame, right? And the translation vector points from the local camera frame to the origin of the world frame?

A full bundle adjustment after these steps gives horrendous results. If you could answer these questions, it would really help me.

Ceres output:

Cost:
Initial                          2.013846e+05
Final                            1.837385e+03
Change                           1.995472e+05

A few values from trans_solution_gt_error.txt:

5.470280486156838817e-01
2.027027982072828749e-01
1.149737710160212423e+00
5.778861875420591154e-01
1.161094644737241888e+00
5.101561324533857578e-01
2.327587745143928355e-01
5.997080014921055691e-01
4.577352512139001295e-01
3.739532625965568680e-01
3.279407561186570064e+00
9.887411027077162018e-01
1.983278381445617411e+00
1.194619452803997728e+00
3.573316004630480158e+00
7.043630585111496645e-01
5.933352117224803823e-01
5.896811433688437631e+01
9.057261663188299616e-01
4.424834776896067190e-01
6.755634175657883045e-01
6.744774426966796410e-01
8.508632871885408733e-01
3.956823466300293801e-01
8.373437578441021989e-01
8.643071159848966234e-01
1.100136722493999075e+00
2.741327609569821355e-01
3.949266778971045278e+05
2.460965933784385884e-01
3.734907405509875766e-01
2.264241431543099647e+00
1.509217718166686817e+00

wilsonkl commented 4 years ago

Hi @ralphlauran99, just to clarify, you're talking about the 1DSfM datasets from here, right? Do you mean the Montreal Notre Dame dataset, or the older (Paris) Notre Dame dataset that Noah distributed with Bundler?

There is a difference between the Notre Dame dataset and the others. The Notre Dame dataset is not registered in a metric coordinate system. The others were all georegistered to the globe, and hence are scaled (approximately) in meter units. The Notre Dame dataset solution is not in meter units (or any other well-defined units). It's in an arbitrary coordinate system.
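Because the Notre Dame reference solution lives in an arbitrary frame, camera position errors are only meaningful after fitting a similarity transform (scale, rotation, translation) between the two solutions, and even then they are in the reference's arbitrary units, not meters. A minimal sketch of such an alignment, assuming matched arrays of estimated and reference camera centers (the function names are illustrative and this is not the evaluation code from this repository):

```python
import numpy as np

def umeyama_alignment(src, dst):
    """Least-squares similarity (s, R, t) such that dst ~= s * R @ src + t.

    src, dst: (n, 3) arrays of corresponding points (hypothetical inputs).
    """
    mu_s, mu_d = src.mean(axis=0), dst.mean(axis=0)
    xs, xd = src - mu_s, dst - mu_d
    cov = xd.T @ xs / len(src)            # cross-covariance of dst and src
    U, S, Vt = np.linalg.svd(cov)
    D = np.eye(3)
    if np.linalg.det(U) * np.linalg.det(Vt) < 0:
        D[2, 2] = -1.0                    # avoid returning a reflection
    R = U @ D @ Vt
    var_s = (xs ** 2).sum() / len(src)    # variance of the source points
    s = np.trace(np.diag(S) @ D) / var_s
    t = mu_d - s * R @ mu_s
    return s, R, t

def aligned_position_errors(est, gt):
    """Per-camera distance after mapping est onto gt with a similarity."""
    s, R, t = umeyama_alignment(est, gt)
    return np.linalg.norm(s * est @ R.T + t - gt, axis=1)
```

A plain least-squares fit like this is sensitive to gross outliers, so a few badly placed cameras can distort the alignment; reporting the median of the aligned errors, or fitting inside a robust loop, is usually more informative than the mean.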

Here are some point-by-point answers:

> The focal length values in coords.txt are just extracted from the images' EXIF data, right?

The focal lengths are estimated by a script from Bundler.

> Are the 2D points distortion-corrected?

No. The 2D point locations in the coords files come directly from David Lowe's SIFT binary.

> Do the tracks for the 3D points in tracks.txt contain outliers? Is the outlier ratio high enough that we'd need to filter each track? Is that why the number of points and the track lengths in the gt_bundle.out file don't match?

Oh yes! There are outliers aplenty. The tracks.txt files are generated by the code behind Building Rome in a Day. You can read about the processing that pipeline performs. The tracks files represent a real, in-the-wild problem instance, and it is not very clean.

> The global rotation matrices in the output don't seem to match the ones in gt_bundle.out, and even though the Ceres optimization has converged, the final error remains pretty high for the Notredame dataset. Is this something we should typically expect?

Are you looking for matching rotations under some sort of gauge-alignment? The solvers here are gauge-free, so the results will be in an arbitrary rotational frame. But let me assume you are adjusting for gauge and still seeing large rotation residuals. Are they large relative to what we report in our paper? That performance should be reproducible, up to some relatively stable randomness. If only translation residuals are large, but not rotation residuals, see the note above about Notre Dame's coordinate system. In this case there may be no real problem.
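To make the gauge point concrete: if both solutions store world-to-camera rotations, a change of world frame multiplies every camera rotation on the right by the same unknown rotation Q. A minimal sketch of estimating Q and measuring the remaining per-camera angular error, assuming matched `(n, 3, 3)` arrays (the function names are illustrative, and the chordal mean used here is one simple choice, not necessarily what the paper's evaluation does):

```python
import numpy as np

def project_to_so3(M):
    """Closest rotation matrix to M in the Frobenius sense."""
    U, _, Vt = np.linalg.svd(M)
    D = np.diag([1.0, 1.0, np.linalg.det(U @ Vt)])
    return U @ D @ Vt

def rotation_angle_deg(R):
    """Rotation angle of R in degrees."""
    c = np.clip((np.trace(R) - 1.0) / 2.0, -1.0, 1.0)
    return np.degrees(np.arccos(c))

def gauge_aligned_rotation_errors(R_est, R_gt):
    """Fit one global Q with R_est[i] ~= R_gt[i] @ Q, then return Q and
    the per-camera residual angles in degrees."""
    # Each camera votes G^T R for Q; the chordal mean is the SO(3)
    # projection of the sum of the votes.
    Q = project_to_so3(sum(G.T @ R for G, R in zip(R_gt, R_est)))
    errs = np.array([rotation_angle_deg((G @ Q).T @ R)
                     for G, R in zip(R_gt, R_est)])
    return Q, errs
```

The chordal mean is not robust, so a handful of badly wrong cameras can bias Q. Taking the median of the resulting errors, or refitting Q on the cameras with small residuals, gives a fairer picture.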

> The final global rotation matrix directly maps 3D coordinates from the world frame to the local camera frame, right? And the translation vector points from the local camera frame to the origin of the world frame?

Check the Bundler documentation for the definitive description of coordinate systems.
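For orientation, here is a minimal sketch of the camera model as the Bundler manual describes it, to the best of my reading: a world point X maps to camera coordinates as P = R X + t, the camera looks down its negative z-axis, and radial distortion with coefficients k1 and k2 is applied when projecting. Under that convention t is the world origin expressed in camera coordinates, and the camera center in world coordinates is -Rᵀt. The function names are illustrative; please verify the details against the Bundler documentation before relying on them.

```python
import numpy as np

def bundler_project(X, R, t, f, k1=0.0, k2=0.0):
    """Project world point X (3,) to image coordinates, Bundler-style.

    Assumed convention: origin at the image center, x to the right,
    y up, camera looking down the negative z-axis.
    """
    P = R @ X + t                 # world -> camera coordinates
    p = -P[:2] / P[2]             # perspective division (note the sign)
    n2 = p @ p                    # squared radius in normalized coords
    r = 1.0 + k1 * n2 + k2 * n2 ** 2
    return f * r * p              # distorted position in pixels

def bundler_camera_center(R, t):
    """Camera position in world coordinates: the C with R @ C + t = 0."""
    return -R.T @ t
```

Since the coords files hold raw SIFT keypoints, the distortion term has to stay in the model when computing reprojection errors against them; dropping k1 and k2 would compare undistorted predictions with distorted observations.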