hkchengrex / XMem

[ECCV 2022] XMem: Long-Term Video Object Segmentation with an Atkinson-Shiffrin Memory Model
https://hkchengrex.com/XMem/
MIT License

xmem demo fails with single object mode set to True #100

Closed monajalal closed 1 year ago

monajalal commented 1 year ago
(xmem) mona@ard-gpu-01:/hdd/code/segmentation/XMem$ rg "Single object mode"
model/network.py
27:        print(f'Single object mode: {self.single_object}')
(xmem) mona@ard-gpu-01:/hdd/code/segmentation/XMem$ vi model/network.py 
(xmem) mona@ard-gpu-01:/hdd/code/segmentation/XMem$ python  interactive_demo.py 
Hyperparameters read from the model weights: C^k=64, C^v=512, C^h=64
Single object mode: True
Traceback (most recent call last):
  File "/hdd/code/segmentation/XMem/interactive_demo.py", line 74, in <module>
    network = XMem(config, args.model).cuda().eval()
  File "/hdd/code/segmentation/XMem/model/network.py", line 38, in __init__
    self.load_weights(model_weights, init_as_zero_if_needed=True)
  File "/hdd/code/segmentation/XMem/model/network.py", line 198, in load_weights
    self.load_state_dict(src_dict)
  File "/home/mona/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1667, in load_state_dict
    raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for XMem:
    size mismatch for value_encoder.conv1.weight: copying a param with shape torch.Size([64, 5, 7, 7]) from checkpoint, the shape in current model is torch.Size([64, 4, 7, 7]).
(xmem) mona@ard-gpu-01:/hdd/code/segmentation/XMem$ python
Python 3.10.11 (main, May 16 2023, 00:28:57) [GCC 11.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch 
>>> torch.__version__
'1.13.0+cu117'
>>> import torchvision
>>> torchvision.__version__
'0.14.0+cu117'
$ uname -a
Linux ard-gpu-01 5.19.0-45-generic #46~22.04.1-Ubuntu SMP PREEMPT_DYNAMIC Wed Jun 7 15:06:04 UTC 20 x86_64 x86_64 x86_64 GNU/Linux
$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2022 NVIDIA Corporation
Built on Wed_Jun__8_16:49:14_PDT_2022
Cuda compilation tools, release 11.7, V11.7.99
Build cuda_11.7.r11.7/compiler.31442593_0
$ nvidia-smi
Thu Jun 29 21:05:27 2023       
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 530.30.02              Driver Version: 530.30.02    CUDA Version: 12.1     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                  Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf            Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA GeForce RTX 3080 L...    On | 00000000:01:00.0 Off |                  N/A |
| N/A   57C    P8               20W /  90W|    430MiB / 16384MiB |     35%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|    0   N/A  N/A      2584      G   /usr/lib/xorg/Xorg                           85MiB |
|    0   N/A  N/A      3019      G   ...libexec/gnome-remote-desktop-daemon        3MiB |
|    0   N/A  N/A    501019    C+G   ...5545270,12857302865851419874,262144      339MiB |
+---------------------------------------------------------------------------------------+
hkchengrex commented 1 year ago

Single object mode is only relevant for training. During inference, single_object should always be False, regardless of the number of input objects.
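
The size mismatch in the traceback (5 input channels in the checkpoint, 4 in the current model) fits this explanation: with single object mode on, the first convolution of the value encoder appears to expect one fewer input channel than the released weights provide. Below is a minimal sketch of that check, assuming the input is 3 RGB channels plus 1 mask channel in single object mode and 2 mask channels otherwise. That channel layout is an assumption inferred from the shapes in the error message, and the helper names are hypothetical, not part of XMem.

```python
# Hypothetical helpers for illustration only; not part of the XMem code base.
# Assumption: value_encoder.conv1 takes 3 RGB channels plus 1 mask channel in
# single object mode, or 2 mask channels otherwise. This matches the 4 vs. 5
# in the traceback but is not confirmed anywhere in this thread.

def expected_in_channels(single_object: bool) -> int:
    """Input channels the model's first value-encoder conv would expect."""
    return 3 + (1 if single_object else 2)


def check_conv1_shape(checkpoint_shape, single_object):
    """Return None if the checkpoint weight fits the model, else a message."""
    expected = expected_in_channels(single_object)
    actual = checkpoint_shape[1]  # weight layout: (out, in, kernel_h, kernel_w)
    if actual == expected:
        return None
    return (
        f"size mismatch: checkpoint has {actual} input channels, "
        f"model with single_object={single_object} expects {expected}"
    )


# Shape reported for value_encoder.conv1.weight in the error message above.
CHECKPOINT_SHAPE = (64, 5, 7, 7)

for mode in (True, False):
    print(f"single_object={mode}:", check_conv1_shape(CHECKPOINT_SHAPE, mode) or "ok")
```

Under that assumption, only single_object=False is compatible with the released checkpoint, which agrees with the advice to leave it off for inference.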