Open kmatzen opened 3 weeks ago
Hi, I do not get the same results as you, so I am not sure what's going on.
python3 visloc.py --model_name DUSt3R_ViTLarge_BaseDecoder_512_dpt --dataset "VislocSevenScenes('/path/to/7-scenes/', subscene='chess', pairsfile='APGeM-LM18_top20', topk=1)" --pnp_mode poselib --reprojection_error_diag_ratio 0.008 --output_dir /path/to/dust3r_7scenes/20_09_24/chess/loc
gives me
VislocSevenScenes('/path/to/7-scenes/', subscene='chess', pairsfile='APGeM-LM18_top20', topk=1): 2000 images - median_pos_error=0.027831130071196423, median_angular_error=0.9597363623122772 - acc@0.1m,1deg=53.050 - acc@0.25m,2deg=89.400 - acc@0.5m,5deg=97.700 - acc@5m,10deg=97.750
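For anyone comparing numbers against the paper, here is a minimal sketch of how a summary line like the one above can be derived from per-image errors. It is not taken from `visloc.py`; the helper name `summarize_errors` is hypothetical. It assumes that position errors are in meters, that angular errors are in degrees, and that an image counts toward `acc@X m,Y deg` only when both errors are within the thresholds (inclusive). The actual script may differ in any of these details.

```python
from statistics import median

# Hypothetical helper (not part of dust3r_visloc): summarizes per-image
# localization errors in the same shape as the log line above.
def summarize_errors(pos_errors_m, ang_errors_deg,
                     thresholds=((0.1, 1.0), (0.25, 2.0), (0.5, 5.0), (5.0, 10.0))):
    if not pos_errors_m or len(pos_errors_m) != len(ang_errors_deg):
        raise ValueError("need two non-empty error lists of equal length")
    n = len(pos_errors_m)
    summary = {
        "median_pos_error": median(pos_errors_m),
        "median_angular_error": median(ang_errors_deg),
    }
    for max_pos, max_ang in thresholds:
        # Assumption: an image is a hit only if BOTH errors are within bounds.
        hits = sum(1 for p, a in zip(pos_errors_m, ang_errors_deg)
                   if p <= max_pos and a <= max_ang)
        summary[f"acc@{max_pos:g}m,{max_ang:g}deg"] = 100.0 * hits / n
    return summary

# Toy data for illustration only (not real 7-Scenes results).
toy = summarize_errors([0.02, 0.05, 0.3, 0.6], [0.5, 1.5, 1.0, 6.0])
print(toy)
```

Under these assumptions, the accuracy figures are percentages of query images, so a difference in the `acc@0.1m,1deg` column between two runs can be traced back to the individual images whose errors fall on either side of the threshold.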
Would it be possible for you to share some preprocessed data as per https://github.com/naver/dust3r/tree/main/dust3r_visloc#7scenes? Then I could see if there's a problem with how I followed the preprocessing instructions or if there's a problem with how the model is used.
Could I get some help understanding how Table 1 in the paper was computed? I tried to reproduce the results using both the provided DUSt3R_ViTLarge_BaseDecoder_512_dpt model and a model that I newly trained with the provided code. I started by comparing the two models with visloc.py on the 7-Scenes dataset, but the numbers for the provided model don't seem to match those reported in the paper.
The results for the provided ("given") model were computed with this command, as an example.