Download the YouTube-VOS dataset from its website. Note that our code has been trained and tested only on the 2018 version of YouTube-VOS; the newer 2019 release has not been tested.
We recommend symlinking the dataset path into datasets/ as follows:
cd datasets
ln -s path/to/youtubeVOS youtubeVOS
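The same link can be created from Python, which is convenient when this step is part of a setup script. This is a minimal sketch: `link_dataset` is a hypothetical helper that is not part of the DMM repository, and `path/to/youtubeVOS` remains a placeholder for wherever the dataset was extracted.

```python
import os
from pathlib import Path


def link_dataset(source, link="datasets/youtubeVOS"):
    """Create `link` as a symlink to `source`; keep an existing correct link.

    Hypothetical helper for illustration, not part of the DMM repository.
    """
    source = Path(source).resolve()
    link = Path(link)
    if not source.is_dir():
        raise FileNotFoundError(f"dataset folder not found: {source}")
    if link.is_symlink():
        if link.resolve() == source:
            return link  # already points at the dataset, nothing to do
        raise FileExistsError(f"{link} already links to {link.resolve()}")
    if link.exists():
        raise FileExistsError(f"{link} exists and is not a symlink")
    link.parent.mkdir(parents=True, exist_ok=True)
    os.symlink(source, link, target_is_directory=True)
    return link


# Usage, run from the repository root:
# link_dataset("path/to/youtubeVOS")
```

Unlike a bare `ln -s`, the helper can be called repeatedly: it returns without changes when the link already points at the dataset and raises instead of silently nesting a second link.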
The file structure should look like:
DMM/datasets
├── youtubeVOS
│ ├── train
│ │ ├── JPEGImages
│ │ │ ├── ...
│ │ ├── Annotations
│ │ │ ├── ...
│ ├── valid
│ │ ├── JPEGImages
│ │ │ ├── ...
│ │ ├── Annotations
│ │ │ ├── ...
│ ├── train_testdev_ot (optional)
│ │ ├── JPEGImages
│ │ │ ├── ...
│ │ ├── Annotations
│ │ │ ├── ...
The train_testdev_ot data can be downloaded from link.
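The layout above can be checked with a short script before starting a long training run. This is a minimal sketch rather than part of the DMM code: the function name `missing_dataset_dirs` and the default root `datasets/youtubeVOS` are assumptions chosen for illustration, and the check only confirms that the folders from the tree exist, not that their contents are complete.

```python
from pathlib import Path

# Splits and sub-folders taken from the tree above; train_testdev_ot is optional.
REQUIRED_SPLITS = ("train", "valid")
OPTIONAL_SPLITS = ("train_testdev_ot",)
SUBDIRS = ("JPEGImages", "Annotations")


def missing_dataset_dirs(root="datasets/youtubeVOS"):
    """Return the expected directories that are absent under `root`.

    Hypothetical helper, not part of the DMM repository. An optional split
    is only inspected when its top-level folder exists.
    """
    root = Path(root)
    missing = []
    for split in REQUIRED_SPLITS + OPTIONAL_SPLITS:
        split_dir = root / split
        if split in OPTIONAL_SPLITS and not split_dir.is_dir():
            continue  # optional split not downloaded: nothing to report
        for sub in SUBDIRS:
            path = split_dir / sub
            if not path.is_dir():  # is_dir() follows symlinks
                missing.append(str(path))
    return missing


if __name__ == "__main__":
    problems = missing_dataset_dirs()
    if problems:
        print("Missing directories:")
        for p in problems:
            print("  " + p)
    else:
        print("Dataset layout looks complete.")
```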
mkdir -p experiments/proposals/
cd experiments/proposals/
wget https://www.cs.toronto.edu/~xiaohui/dmm/proposals/proposals_ytb_train.tar.gz
tar xzf proposals_ytb_train.tar.gz
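The download-and-extract pattern used throughout this README (`wget` followed by `tar xzf`) can also be scripted. The sketch below uses only the Python standard library; `fetch_and_extract` is a hypothetical helper that is not part of DMM, and the URL in the usage comment is the one from the commands above.

```python
import tarfile
import urllib.request
from pathlib import Path


def fetch_and_extract(url, dest_dir):
    """Download a .tar.gz archive into `dest_dir` and unpack it there.

    Hypothetical helper, not part of the DMM repository. The download is
    skipped when the archive file is already present in `dest_dir`.
    """
    dest_dir = Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    archive = dest_dir / url.rsplit("/", 1)[-1]
    if not archive.exists():
        urllib.request.urlretrieve(url, archive)
    with tarfile.open(archive, "r:gz") as tar:
        # Newer Python versions provide extraction filters; use the
        # conservative "data" filter when it is available.
        if hasattr(tarfile, "data_filter"):
            tar.extractall(dest_dir, filter="data")
        else:
            tar.extractall(dest_dir)
    return archive


# Usage, run from the repository root:
# fetch_and_extract(
#     "https://www.cs.toronto.edu/~xiaohui/dmm/proposals/proposals_ytb_train.tar.gz",
#     "experiments/proposals",
# )
```

Skipping an archive that already exists avoids repeating a large download, at the cost of not detecting a partially downloaded file; delete the archive to force a fresh download.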
To train DMMnet on the YouTube-VOS train-train split, first prepare proposals for both the train-train and train-val splits, extracted by a COCO-pretrained X101 Mask R-CNN model.
The proposals can be downloaded as follows:
mkdir -p experiments/proposals/
cd experiments/proposals/
wget http://www.cs.toronto.edu/~xiaohui/dmm/proposals/feature_coco81.tar.gz
tar xzf feature_coco81.tar.gz
Preprocess the proposals for training DMM:
python src/tools/reduce_pth_size_by_videos.py experiments/proposals/coco81/inference/youtubevos_train3k_meta/predictions.pth train 50
python src/tools/reduce_pth_size_by_videos.py experiments/proposals/coco81/inference/youtubevos_val200_meta/predictions.pth trainval 50
python src/tools/reduce_pth_size_by_videos.py experiments/proposals/coco81/inference/youtubevos_testdev_online_meta/predictions.pth train_testdev_ot 90
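The three preprocessing calls above differ only in the proposal folder, the split name, and the trailing number. The sketch below, which assumes it is run from the repository root, builds the same commands and skips any split whose `predictions.pth` has not been downloaded. The helper name `reduce_commands` is hypothetical, and the arguments are copied verbatim from the commands above without interpreting them.

```python
import subprocess
import sys
from pathlib import Path

PROPOSAL_ROOT = Path("experiments/proposals/coco81/inference")
TOOL = "src/tools/reduce_pth_size_by_videos.py"

# (proposal folder, split name, trailing numeric argument),
# copied from the three commands above.
JOBS = [
    ("youtubevos_train3k_meta", "train", "50"),
    ("youtubevos_val200_meta", "trainval", "50"),
    ("youtubevos_testdev_online_meta", "train_testdev_ot", "90"),
]


def reduce_commands(proposal_root=PROPOSAL_ROOT, only_existing=True):
    """Build one argv list per split.

    Hypothetical helper, not part of the DMM repository. With
    only_existing=True, splits without a predictions.pth are skipped.
    """
    commands = []
    for folder, split, number in JOBS:
        predictions = Path(proposal_root) / folder / "predictions.pth"
        if only_existing and not predictions.is_file():
            continue
        commands.append(
            [sys.executable, TOOL, predictions.as_posix(), split, number]
        )
    return commands


if __name__ == "__main__":
    for cmd in reduce_commands():
        print("Running:", " ".join(cmd))
        subprocess.run(cmd, check=True)  # stop at the first failure
```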
The file structure should look like:
DMM/experiments
├── propnet
│ ├── join_ytb_bin
│ │ ├── model_0172500.pth
│ ├── online_ytb
│ │ ├── model_0225000.pth
├── dmmnet
│ ├── ytb_255_50_matchloss_epo13
│ │ ├── epo13_iter01640
│ ├── ytb_255_50
│ │ ├── epo08_iter01640
│ ├── online_ytb
│ │ ├── epo101
├── proposals
│ ├── coco81
│ │ ├── inference
│ │ │ ├── youtubevos_train3k_meta (optional)
│ │ │ ├── youtubevos_val200_meta
│ │ │ ├── youtubevos_testdev_online_meta (optional)
│ ├── ytb_train
│ │ ├── inference
│ │ │ ├── youtubevos_val200_meta
│ ├── ytb_ot
│ │ ├── inference
│ │ │ ├── youtubevos_testdev_meta
sh scripts/train/train_101.sh
# or scripts/train/train_50.sh # for the ResNet-50 model
To train DMMnet on the first frame of the validation set:
- Download the preprocessed data used for online training from here, extract it, and put or link the extracted folder as /PATH/TO/datasets/youtubeVOS/train_testdev_ot.
- Prepare the proposals; see the section "Prepare proposals - for training".
- Get the DMMnet trained on the train-train set for 1 epoch from here and put it under experiments/dmmnet/.
- Start online training:
sh scripts/train/train_online.sh # it takes ~0.17h for one epoch
Evaluate DMMnet on trainval split:
cd ./experiments/dmmnet/
wget http://www.cs.toronto.edu/~xiaohui/dmm/models/dmmnet_ytb_255_50_matchloss_epo13.tar.gz
tar xzf dmmnet_ytb_255_50_matchloss_epo13.tar.gz
wget http://www.cs.toronto.edu/~xiaohui/dmm/models/dmmnet_ytb_255_50.tar.gz
tar xzf dmmnet_ytb_255_50.tar.gz
cd ../../
cd ./experiments/proposals/
wget http://www.cs.toronto.edu/~xiaohui/dmm/proposals/proposals_ytb_train.tar.gz
tar xzf proposals_ytb_train.tar.gz
cd ../../
- run `sh scripts/eval/eval_r50.sh`
- compute the J and F score by `sh scripts/metric/full_eval.sh /PATH/TO/OUTPUT/merged/`
Expected results:
| Method | Model | J_mean | J_recall | J_decay | F_mean | F_recall | F_decay |
|---|---|---|---|---|---|---|---|
| ytb_R50_w_match_loss_epo13 | ytb_255_50_matchloss_epo13 | 0.611 | 0.702 | 0.104 | 0.747 | 0.824 | 0.111 |
| ytb_R50_wo_match_loss_epo08 | ytb_255_50 | 0.6 | 0.684 | 0.104 | 0.742 | 0.819 | 0.109 |
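To compare a run against the table above, the reported numbers can be kept in a small lookup. This is a sketch under stated assumptions: the tolerance of 0.005 and the helper name `compare_to_expected` are illustrative choices, not values from the authors, and the dictionary of measured scores has to be filled in by hand from the output of `scripts/metric/full_eval.sh`.

```python
# Scores copied from the expected-results table, keyed by model name.
EXPECTED = {
    "ytb_255_50_matchloss_epo13": {
        "J_mean": 0.611, "J_recall": 0.702, "J_decay": 0.104,
        "F_mean": 0.747, "F_recall": 0.824, "F_decay": 0.111,
    },
    "ytb_255_50": {
        "J_mean": 0.6, "J_recall": 0.684, "J_decay": 0.104,
        "F_mean": 0.742, "F_recall": 0.819, "F_decay": 0.109,
    },
}


def compare_to_expected(model, measured, tol=0.005):
    """Return {metric: measured - expected} for metrics outside `tol`.

    Hypothetical helper, not part of the DMM repository. `tol` is an
    assumed tolerance; metrics missing from `measured` are ignored.
    """
    deviations = {}
    for metric, target in EXPECTED[model].items():
        if metric not in measured:
            continue
        delta = measured[metric] - target
        if abs(delta) > tol:
            deviations[metric] = round(delta, 4)
    return deviations


# Usage: an empty result means every supplied metric is within tolerance.
# compare_to_expected("ytb_255_50", {"J_mean": 0.598, "F_mean": 0.741})
```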
cd ./experiments/proposals/
wget http://www.cs.toronto.edu/~xiaohui/dmm/proposals/proposals_ytb_ot.tar.gz
tar xzf proposals_ytb_ot.tar.gz
cd ../../
cd experiments/dmmnet/
wget http://www.cs.toronto.edu/~xiaohui/dmm/models/dmmnet_online_ytb.tar.gz
tar xzf dmmnet_online_ytb.tar.gz
cd ../../
scripts/eval/eval_testdev.sh
scripts/submit.sh
Then submit the results to the server. Expected result: G mean = 0.579
Part of the code is from https://github.com/imatge-upc/rvos