TRI-ML / VOST

Code for the VOST dataset

Seeking help for XMem code #3

Closed: z-jiaming closed this issue 1 year ago

z-jiaming commented 1 year ago

Thanks for your nice work!

Could you please share your code that trains the XMem? It would be a great help if you could!

pvtokmakov commented 1 year ago

Hi Jiaming,

Unfortunately, I don't have the capacity to properly release all the baselines, but here's a dump of my local XMem repo: https://drive.google.com/file/d/1IcdQOieXBa2LNtn1WYiMR5fdGC82B2ya/view?usp=sharing. Hope it helps!

z-jiaming commented 1 year ago

This is a great help, thank you so much!

I looked at the code and found that def load_sub_ours loads the VOST training set from balanced_train.txt, a file that is not provided with the original dataset. Did you modify the training set when training the XMem_baseline, for example by deleting some videos?

Thanks again for your help!

pvtokmakov commented 1 year ago

Sorry for the confusion. This must be an artifact from when I was constructing the splits. The reported numbers were obtained with the final split which is released with the dataset.

z-jiaming commented 1 year ago

Thanks for your reply!

Sorry, I have one more question. In aot_plus you 1) use merge_sample and 2) filter out samples above ignore_thresh (https://github.com/TRI-ML/VOST/blob/398bbc2ee5dcd6eef3a508532f7ac1e2962df601/aot_plus/dataloaders/train_datasets.py#L340), but neither appears in the XMem code you shared. Did you try them in XMem?

pvtokmakov commented 1 year ago

  1. Sample merging is a part of the AOT implementation. XMem doesn't use it and I was evaluating all the algorithms with minimal modifications.
  2. I thought I added filtering out initial frames with large ignore regions to all the algorithms, but I might have never pushed it to the repo. In any case, adding it should be a good idea for any algorithm you are developing.
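
For anyone adding the filtering from point 2 to their own data loader, here is a minimal sketch. It is not the code from aot_plus or from the XMem dump. It assumes masks are 2D integer arrays in which ignore pixels carry the label 255, and the names `IGNORE_LABEL`, `ignore_fraction`, and `valid_start_indices`, as well as the default threshold of 0.2, are hypothetical choices for illustration.

```python
import numpy as np

# Assumption: ignore regions are stored with this label value in the masks.
IGNORE_LABEL = 255


def ignore_fraction(mask, ignore_label=IGNORE_LABEL):
    """Return the fraction of pixels in `mask` that carry the ignore label."""
    mask = np.asarray(mask)
    if mask.size == 0:
        return 0.0
    return float((mask == ignore_label).mean())


def valid_start_indices(masks, ignore_thresh=0.2, ignore_label=IGNORE_LABEL):
    """Return indices of frames that may serve as the first frame of a clip.

    A frame qualifies when its ignore fraction does not exceed `ignore_thresh`.
    """
    return [
        idx
        for idx, mask in enumerate(masks)
        if ignore_fraction(mask, ignore_label) <= ignore_thresh
    ]


if __name__ == "__main__":
    clean = np.zeros((4, 4), dtype=np.uint8)
    mostly_ignored = clean.copy()
    mostly_ignored[:2, :] = IGNORE_LABEL  # half of the pixels are ignored
    print(valid_start_indices([mostly_ignored, clean, mostly_ignored]))
```

A training sampler could then draw the initial frame only from the returned indices and skip the video when the list is empty.
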

z-jiaming commented 1 year ago

Thank you very much for your reply and congratulations again on your work!!!