yoavnaveh5 opened 5 years ago
@zhangzhensong this is exactly what I did, but I can't reproduce the paper's results! I modified the code in inference.py to save all inference results in a single .npy file (as expected by the evaluation code in SfMLearner), but when I run the script below in the SfMLearner repo, I get bad evaluation results: abs_rel = 0.2500, sq_rel = 3.0770, rms = 7.7265, log_rms = 0.3257, d1_all = 0.0000, a1 = 0.6616, a2 = 0.8548, a3 = 0.9298.
The script I ran: `python kitti_eval/eval_depth.py --kitti_dir=/server/data/egomotion/kitti/kitti_raw/ --pred_file=/server/data/egomotion/kitti/inference/s2d_eigen_their_model/pred_depth.npy --test_file_list /server/data/egomotion/kitti/kitti_raw/test_files_eigen.txt`
I found the source of the problem: the inference.py code in this repo sorts the content of the file given through the --input_list_file parameter, so in this case it was sorting the 697 Eigen test images. This is unlike the code in SfMLearner! When I removed the sorting, I got the following results with their pretrained model, which are very close to the baseline paper's results: abs_rel = 0.1452, sq_rel = 1.1166, rms = 5.377, log_rms = 0.218, d1_all = 0.000, a1 = 0.812, a2 = 0.943, a3 = 0.978.
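For anyone hitting the same issue, here is a minimal sketch of the idea; the variable names are illustrative, not the repo's actual code (only the --input_list_file flag comes from the repo):

```python
# Read the Eigen test list in the order given in test_files_eigen.txt.
# Sorting it breaks the one-to-one correspondence between the predictions
# in pred_depth.npy and the entries that SfMLearner's
# kitti_eval/eval_depth.py reads from test_files_eigen.txt.
with open(input_list_file) as f:
    im_files = [line.rstrip() for line in f if line.strip()]
# im_files = sorted(im_files)  # <- this is the sorting to remove/skip
```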
I'm still struggling with the pose evaluation on KITTI. The outputs of this network and of SfMLearner (where the evaluation code is supposed to be) are quite different. Here you output 2 transformations (each represented by 6 scalars: 3 for the translation and 3 for the Euler angles of the rotation), while in the SfMLearner repo the transformations are represented using quaternions, and there are timestamps as well. Can you elaborate on how to evaluate the pose results in order to reproduce the paper's results on the KITTI odometry sequences? Thank you very much!!
I encountered the same problem. How did you adjust evaluate.py? Thank you! @yoavnaveh5
Hi @xuheyang @yoavnaveh5, you should find that each line contains data for a single inference call on a triplet of frames (1, 2, 3): the frame ID, followed by (tx, ty, tz, rx, ry, rz) as the translation and rotation from 1 -> 2, followed by the same structure for 2 -> 3. Depending on how you generated these triplets (their offsets), it is therefore also possible that you have multiple estimates for the same pair of frames, so it would be possible, and probably helpful, to reconcile them (e.g. by taking their mean), or to just select one of them.
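A minimal parsing sketch of the above, with the caveat that the file name, the exact meaning of the frame ID, and the column order are assumptions to verify against your own output:

```python
from collections import defaultdict

import numpy as np

# Collect all 6-DoF estimates per frame pair; overlapping triplets can
# produce several estimates for the same pair of frames.
estimates = defaultdict(list)
with open('egomotion_output.txt') as f:  # hypothetical file name
    for line in f:
        vals = line.split()
        fid = int(vals[0])  # assumed to index the first frame of the triplet
        pose_12 = np.array(vals[1:7], dtype=np.float64)   # frame 1 -> 2
        pose_23 = np.array(vals[7:13], dtype=np.float64)  # frame 2 -> 3
        estimates[(fid, fid + 1)].append(pose_12)
        estimates[(fid + 1, fid + 2)].append(pose_23)

# Reconcile multiple estimates per pair, e.g. by taking their mean.
relative_poses = {pair: np.mean(p, axis=0) for pair, p in estimates.items()}
```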
In case this is helpful: if I remember correctly, tx and tz should correspond to the translation as seen from a bird's-eye view (tz pointing forward and tx pointing sideways), ty should be up/down, and ry would therefore describe the rotation of the observer as seen from a bird's-eye view as well.
As @yoavnaveh5 pointed out, these transforms are represented as quaternions in SfMLearner. You can always convert between representations, or just convert the 6-dimensional representation into a relative transformation matrix that can be applied in sequence to an initial starting position. To adopt the evaluation procedure, you can refer e.g. to pose_evaluation_utils.py.
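For illustration, a sketch of the matrix route: converting each 6-DoF vector into a 4x4 relative transform and chaining from the identity. The 'xyz' Euler convention, radian units, and the quaternion ordering are assumptions to check against the repo's pose code before relying on it:

```python
import numpy as np
from scipy.spatial.transform import Rotation

def to_matrix(pose6):
    # (tx, ty, tz, rx, ry, rz) -> 4x4 homogeneous transform.
    tx, ty, tz, rx, ry, rz = pose6
    T = np.eye(4)
    T[:3, :3] = Rotation.from_euler('xyz', [rx, ry, rz]).as_matrix()
    T[:3, 3] = [tx, ty, tz]
    return T

# Chain the relative transforms (e.g. the relative_poses dict from the
# parsing sketch above, in frame order) into absolute poses.
pose = np.eye(4)
trajectory = [pose]
for pair in sorted(relative_poses):
    pose = pose @ to_matrix(relative_poses[pair])
    trajectory.append(pose)

# Quaternions for an SfMLearner-style evaluation; scipy returns (x, y, z, w).
quats = [Rotation.from_matrix(T[:3, :3]).as_quat() for T in trajectory]
```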
Hope this helps! Vincent
System information
Describe the problem
On the project site, the above script is used for evaluation; however, there is no evaluate.py in the repository. Can you share details on how to evaluate the inference results (and specifically, how to reproduce the paper's results on the "Eigen split")?
Thank you very much for this great project. I'm looking forward to diving into it after I finish the KITTI validation steps.
Yoav