AssafSinger94 opened 8 months ago
For each TAP-Vid DAVIS video I apply the following (after completing all necessary preprocessing instructions):
python main_processing.py --data_dir ../tapvid_davis/processed_256/$i/ --chain
python train.py --config configs/default.txt --data_dir ./tapvid_davis/processed_256/$i/ --save_dir ./tapvid_davis/processed_256/$i/ --num_iters 200000
With this in mind, I wanted to ask: would it be possible for you to provide the pre-trained weights for TAP-Vid DAVIS? I think that would be the simplest solution, much easier than retraining the model and verifying that everything works perfectly. It would be deeply appreciated.
I would like to ask how to evaluate the results: there is no evaluation code in the repo, nor any guidance on how to evaluate. Could you provide the eval code? I am trying to reproduce the results now.
I mean evaluating the metrics reported in the paper, such as OA (occlusion accuracy), AJ (average Jaccard), etc.
Thank you for your questions.
This folder contains a script for evaluation (eval_tapvid_davis.py) and the pre-trained weights, which you can use to reproduce the exact results in the paper.
To run the evaluation:
python eval_tapvid_davis.py
If the evaluation runs successfully, you should get this output, which matches the numbers in the paper:
30 | average_jaccard: 0.51746 | average_pts_within_thresh: 0.67490 | occlusion_acc: 0.85346 | temporal_coherence: 0.74060
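For reference, the metrics named in this output line (average Jaccard, points-within-threshold, occlusion accuracy) can be sketched roughly as below. This is a hypothetical helper under assumed array shapes, not the repo's eval_tapvid_davis.py; `tapvid_metrics` and its argument names are invented for illustration.

```python
import numpy as np

def tapvid_metrics(pred_occ, pred_pts, gt_occ, gt_pts,
                   thresholds=(1, 2, 4, 8, 16)):
    """Sketch of TAP-Vid-style metrics (assumed shapes, not the repo's code).

    pred_occ, gt_occ: (N, T) bool occlusion flags (True = occluded).
    pred_pts, gt_pts: (N, T, 2) point locations in pixels on the 256x256 grid.
    Only frames where the ground-truth point is visible count toward
    position accuracy.
    """
    visible = ~gt_occ
    dist = np.linalg.norm(pred_pts - gt_pts, axis=-1)  # (N, T) pixel errors

    # Occlusion accuracy: fraction of frames with a correct visibility flag.
    occ_acc = np.mean(pred_occ == gt_occ)

    # Average position accuracy over the standard pixel thresholds.
    within = [np.sum((dist < t) & visible) / max(np.sum(visible), 1)
              for t in thresholds]
    pts_within = float(np.mean(within))

    # Average Jaccard: a true positive is a visible point predicted visible
    # and within the threshold; mispredicted visibility or location counts
    # as a false positive / false negative.
    jaccards = []
    for t in thresholds:
        tp = np.sum(visible & ~pred_occ & (dist < t))
        fp = np.sum(~visible & ~pred_occ) + np.sum(visible & ~pred_occ & (dist >= t))
        fn = np.sum(visible & pred_occ)
        jaccards.append(tp / max(tp + fp + fn, 1))

    return {"occlusion_acc": float(occ_acc),
            "pts_within_thresh": pts_within,
            "average_jaccard": float(np.mean(jaccards))}
```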
Regarding the hyperparameters: yes, we used a different set of hyperparameters for the TAP-Vid evaluation (but they were the same across all TAP-Vid videos). The reason is that TAP-Vid videos have much lower resolution (256x256), where we found RAFT's performance degrades; relying more on the photometric information by upweighting its loss helps improve performance. I hope this is helpful for you, at least for now. Please allow me some time to integrate and organize things into the codebase and release more details.
Hello! Given this issue and #42 , here are some changes that should be applied to the default config to reproduce training+evaluation on TAP-Vid dataset from the omnimotion paper:
args.min_depth = -0.5
args.max_depth = 0.5
args.use_affine = False
args.num_iters = 200000
nn.Linear(input_dims + input_dims * ll * 2, proj_dims), nn.ReLU(), nn.Linear(proj_dims, proj_dims)
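The architecture line above appears to describe the projection head applied to positionally encoded inputs. A minimal sketch of how it might be wired up, assuming `input_dims`, `ll` (number of positional-encoding frequencies), and `proj_dims` are config values; 3 / 16 / 256 below are placeholder numbers, not the paper's settings:

```python
import torch
import torch.nn as nn

# Placeholder values for illustration only.
input_dims, ll, proj_dims = 3, 16, 256

# Input = input_dims raw coordinates plus sin/cos encodings at ll
# frequencies each, giving input_dims + input_dims * ll * 2 features.
proj = nn.Sequential(
    nn.Linear(input_dims + input_dims * ll * 2, proj_dims),
    nn.ReLU(),
    nn.Linear(proj_dims, proj_dims),
)

x = torch.randn(8, input_dims + input_dims * ll * 2)  # a batch of encoded points
out = proj(x)
print(out.shape)  # torch.Size([8, 256])
```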
Is this correct? Are any other changes required to reproduce the quantitative results?
Regarding "relying more on the photometric information by upweighing its loss helps improve the performance": in your TAP-Vid training, was the photometric loss weight increased from 0 to 10 over the first 50k steps and then kept fixed at 10, or was some other schedule applied?
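For concreteness, the schedule asked about here (a linear ramp from 0 to 10 over the first 50k steps, then held constant) could be written as the sketch below; `photometric_weight` is a hypothetical helper, not the repo's actual implementation, and the defaults just mirror the numbers in the question:

```python
def photometric_weight(step, max_weight=10.0, ramp_steps=50_000):
    """Linear warm-up: ramp the photometric loss weight from 0 to
    max_weight over the first ramp_steps iterations, then hold it."""
    return max_weight * min(step / ramp_steps, 1.0)

print(photometric_weight(0))        # 0.0
print(photometric_weight(25_000))   # 5.0
print(photometric_weight(200_000))  # 10.0
```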
Would you mind sharing the full config file used for the results in the paper?
Hello! In the annotations folder, I can see that each video sequence corresponds to a pkl file. I would like to ask: how was this file obtained? There is no such file in the training results, and I could not find the module in the code that generates it.
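For context, the per-video pkl files in TAP-Vid hold ground-truth annotations shipped with the benchmark itself, not anything produced by training. A minimal loading sketch, assuming the published TAP-Vid field layout (`video`, `points`, `occluded`); the exact keys in this repo's split files may differ:

```python
import pickle

def load_tapvid_annotation(pkl_path):
    """Load one TAP-Vid annotation pickle (assumed layout, see below)."""
    with open(pkl_path, "rb") as f:
        data = pickle.load(f)
    # Typical TAP-Vid per-video entries:
    #   'video':    (T, H, W, 3) uint8 frames
    #   'points':   (N, T, 2) float point trajectories
    #   'occluded': (N, T) bool visibility flags (True = occluded)
    return data
```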
The paper mentions these two methods:
But what changes can be made to the published code to obtain the other method, or could you publish the code for that method?
Hello, have you figured this out? Could you share how the corresponding pkl file is obtained?
Hello, I am trying to reproduce OmniMotion results on TAP-Vid DAVIS. I preprocessed and trained the models using the default configs (except for setting num_iters=200_000). However, when evaluating the trained models I am getting d_avg=63.5%, which is lower than the 67.5% reported in the paper. (My training & evaluation process is described in more detail below.) Therefore I wanted to ask: do the default hyperparameters and configurations in the repo match the reported model? Also, do you have any code for evaluating OmniMotion on TAP-Vid? I had to write some code on my own (which I verified and fairly trust), but using your evaluation pipeline would still be more reliable. :)
Thank you in advance! Assaf