bo-miao / MAMP

[ICME 2022] Self-Supervised Video Object Segmentation by Motion-Aware Mask Propagation.
BSD 3-Clause "New" or "Revised" License

Unable to reproduce results #2

Closed: zfang399 closed this issue 3 years ago

zfang399 commented 3 years ago

Hi @bo-miao, thanks for open-sourcing this nice work!

When I run your checkpoint with the eval script on the DAVIS 2017 val set, the J&F is indeed 70.4. I then tried to replicate your results by running train.sh without any modifications, but I only get a J&F of 67.65, with J=0.662 and F=0.691. I train on train_all_frames from YouTube-VOS (valid_all_frames and test_all_frames are not used).
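For reference, the DAVIS J&F score is the arithmetic mean of the region score J and the boundary score F, so the reported numbers are internally consistent. A minimal sketch of that arithmetic follows; the helper name `jf_mean` is hypothetical and not part of the MAMP repository.

```python
def jf_mean(j: float, f: float) -> float:
    """Return the J&F score: the arithmetic mean of J and F."""
    return (j + f) / 2.0


# J and F reported for the retrained model (fractions, not percentages).
reproduced = jf_mean(0.662, 0.691) * 100
# J&F reported for the released checkpoint (percentage).
released = 70.4

print(f"reproduced J&F = {reproduced:.2f}, gap to released = {released - reproduced:.2f}")
```

The gap this prints is the difference the question is about.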

Is this behavior (large variance in model performance, maybe depending on the random seed) normal, or am I setting up the training data incorrectly?
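One way to check whether the random seed explains the gap is to fix all seeds before training and compare a few runs. The sketch below is an assumption about how this could be done, not something taken from train.sh; `seed_everything` is a hypothetical helper, and it is not known whether the repository already seeds its runs.

```python
import random


def seed_everything(seed: int = 0) -> None:
    """Seed Python, NumPy and PyTorch (when installed) for repeatable runs."""
    random.seed(seed)
    try:
        import numpy as np
        np.random.seed(seed)
    except ImportError:
        pass  # NumPy not installed; nothing to seed
    try:
        import torch
        torch.manual_seed(seed)
        torch.cuda.manual_seed_all(seed)  # ignored when CUDA is unavailable
        # Trade some speed for deterministic cuDNN kernels.
        torch.backends.cudnn.deterministic = True
        torch.backends.cudnn.benchmark = False
    except ImportError:
        pass  # PyTorch not installed; nothing to seed


seed_everything(0)
```

Even with fixed seeds, some GPU operations can remain non-deterministic, so small run-to-run differences may persist.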

bo-miao commented 3 years ago

Hi,

We only use the training set of YouTube-VOS to train our model; the training data comes from train_all_frames. We first resize all images in the YouTube-VOS training set to 256x256 and store them on the server, and then train our model on the resized images.
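The offline resizing step could look roughly like the sketch below. It is an illustration under stated assumptions, not the script we used: the function names, the directory paths in the usage comment, the `*.jpg` pattern, the bicubic interpolation and the JPEG quality are all assumptions. Pillow is required only when `resize_tree` is actually called.

```python
from pathlib import Path


def mirrored_path(src_root: Path, dst_root: Path, src_file: Path) -> Path:
    """Map a file under src_root to the same relative location under dst_root."""
    return dst_root / src_file.relative_to(src_root)


def resize_tree(src_root, dst_root, size=(256, 256)) -> int:
    """Resize every .jpg under src_root to `size`, mirroring the folder layout."""
    from PIL import Image  # Pillow, imported lazily

    src_root, dst_root = Path(src_root), Path(dst_root)
    count = 0
    for src_file in sorted(src_root.rglob("*.jpg")):
        out_file = mirrored_path(src_root, dst_root, src_file)
        out_file.parent.mkdir(parents=True, exist_ok=True)
        with Image.open(src_file) as img:
            # Interpolation and quality are assumptions, not confirmed settings.
            img.convert("RGB").resize(size, Image.BICUBIC).save(out_file, quality=95)
        count += 1
    return count


# Hypothetical usage; adjust both paths to the local dataset layout:
# resize_tree("train_all_frames/JPEGImages", "train_all_frames_256/JPEGImages")
```

Training on pre-resized frames rather than resizing on the fly may change the results slightly, so this step is worth matching when reproducing the numbers.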

A snapshot of our environment configuration is shown below. Moreover, we found that the model after 32 or 33 epochs performs better than the one after 35 epochs; the released checkpoint is the model trained for 33 epochs.

I hope this helps.

Collecting environment information...
PyTorch version: 1.9.0a0+gitc91c4a0
Is debug build: False
CUDA used to build PyTorch: 11.2
ROCM used to build PyTorch: N/A

OS: Ubuntu 20.04.2 LTS (x86_64)
GCC version: (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0
Clang version: Could not collect
CMake version: version 3.19.6

Python version: 3.9 (64-bit runtime)
Is CUDA available: True
CUDA runtime version: 11.2.142
GPU models and configuration: GPU 0: GeForce RTX 3090
Nvidia driver version: 460.67
cuDNN version: Probably one of the following:
/usr/local/cuda-11.2/targets/x86_64-linux/lib/libcudnn.so.8
/usr/local/cuda-11.2/targets/x86_64-linux/lib/libcudnn_adv_infer.so.8
/usr/local/cuda-11.2/targets/x86_64-linux/lib/libcudnn_adv_train.so.8
/usr/local/cuda-11.2/targets/x86_64-linux/lib/libcudnn_cnn_infer.so.8
/usr/local/cuda-11.2/targets/x86_64-linux/lib/libcudnn_cnn_train.so.8
/usr/local/cuda-11.2/targets/x86_64-linux/lib/libcudnn_ops_infer.so.8
/usr/local/cuda-11.2/targets/x86_64-linux/lib/libcudnn_ops_train.so.8
HIP runtime version: N/A
MIOpen runtime version: N/A

Versions of relevant libraries:
[pip3] numpy==1.19.5
[pip3] torch==1.9.0a0+gitc91c4a0
[pip3] torchgeometry==0.1.2
[pip3] torchstat==0.0.7
[pip3] torchvision==0.10.0a0+a64b54a
[conda] mkl                       2020.2                      256    https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
[conda] mkl-include               2020.2                      256    https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
[conda] numpy                     1.19.5                   pypi_0    pypi
[conda] torch                     1.9.0a0+gitc91c4a0          pypi_0    pypi
[conda] torchgeometry             0.1.2                    pypi_0    pypi
[conda] torchstat                 0.0.7                    pypi_0    pypi
[conda] torchvision               0.10.0a0+a64b54a          pypi_0    pypi
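To compare a local setup against the snapshot above, the `[pip3]` lines can be parsed and checked against the installed packages. This is a minimal sketch using only the standard library; `parse_pins` and `compare_with_local` are hypothetical helper names, and a version mismatch is only a possible cause of different results, not a confirmed one.

```python
from importlib import metadata

# The [pip3] lines copied from the environment snapshot above.
REFERENCE = """\
[pip3] numpy==1.19.5
[pip3] torch==1.9.0a0+gitc91c4a0
[pip3] torchgeometry==0.1.2
[pip3] torchstat==0.0.7
[pip3] torchvision==0.10.0a0+a64b54a
"""


def parse_pins(text: str) -> dict:
    """Extract {package: version} from '[pip3] name==version' lines."""
    pins = {}
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("[pip3]") and "==" in line:
            name, version = line[len("[pip3]"):].strip().split("==", 1)
            pins[name] = version
    return pins


def compare_with_local(pins: dict) -> dict:
    """Return {package: (reference_version, local_version_or_None)}."""
    report = {}
    for name, wanted in pins.items():
        try:
            found = metadata.version(name)
        except metadata.PackageNotFoundError:
            found = None  # package is not installed locally
        report[name] = (wanted, found)
    return report


for name, (wanted, found) in compare_with_local(parse_pins(REFERENCE)).items():
    flag = "ok" if wanted == found else "DIFFERS"
    print(f"{name:15s} reference={wanted:22s} local={found!s:22s} {flag}")
```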

zfang399 commented 3 years ago

Thanks for your detailed explanation! I'll try this.