hkchengrex / XMem

[ECCV 2022] XMem: Long-Term Video Object Segmentation with an Atkinson-Shiffrin Memory Model
https://hkchengrex.com/XMem/
MIT License

Some questions about input image resolution #121

Closed by 1334233852 1 year ago

1334233852 commented 1 year ago

Thank you for your hard work! Since XMem has a working memory (high-resolution feature memory), does the model support dataset images of any resolution, such as DAVIS's 480x854 or other resolutions? Thanks.

hkchengrex commented 1 year ago

You can change the size (number of pixels of the shorter side) with an input argument: https://github.com/hkchengrex/XMem/blob/4589acce67dfd952b28f779f9e55a39ce8ebb9d6/eval.py#L63

1334233852 commented 1 year ago

For example, my dataset has a resolution of 1920x1080 pixels, so should I set default=1080? I think I got it, thank you!

hkchengrex commented 1 year ago

You can also set it to "-1" to leave the original resolution untouched. It might use a lot of memory, be slow, or not perform well at such a high resolution though, because the model is not trained for it.
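
To make the effect of the size argument concrete, here is a minimal sketch of a shorter-side resize, assuming plain rounding of the scaled dimensions. The function names `resized_shape` and `pixel_ratio` are hypothetical and are not taken from the XMem code, whose actual resize may round or interpolate differently.

```python
def resized_shape(height, width, size):
    """Return (height, width) after scaling the shorter side to `size`.

    Assumption: a non-positive size (such as -1) keeps the original
    resolution, mirroring the behaviour described in the reply above.
    """
    if size <= 0:
        return height, width
    scale = size / min(height, width)
    return round(height * scale), round(width * scale)


def pixel_ratio(shape_a, shape_b):
    """How many times more pixels shape_a has than shape_b."""
    return (shape_a[0] * shape_a[1]) / (shape_b[0] * shape_b[1])


if __name__ == "__main__":
    full_hd = (1080, 1920)  # (height, width) of a 1920x1080 frame
    print(resized_shape(*full_hd, 480))    # shorter side scaled to 480
    print(resized_shape(*full_hd, 1080))   # shorter side already 1080
    print(resized_shape(*full_hd, -1))     # original resolution kept
    # Pixel count of a 1080p frame relative to a 480x854 DAVIS frame
    print(pixel_ratio(full_hd, (480, 854)))
```

A 1920x1080 frame has about five times as many pixels as a 480x854 frame, which is one plausible reason for the higher memory use and slower speed mentioned above.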

1334233852 commented 1 year ago

Do I understand this correctly? If I train on a 1080p dataset, I think I only need to adjust the path concatenation in the training code, and then use the model weights saved after training for prediction with default=1080. In that case, should there be no problems with slow speed, high memory usage, or poor segmentation accuracy? I'm not sure whether I understand this correctly; I'm still studying your paper carefully. Maybe the question is a bit basic. Thanks again, you helped a lot.

hkchengrex commented 1 year ago

  1. You would also need to change the crop size. Search the project globally for "384".
  2. Training at that resolution might help with the segmentation accuracy, but not with the slow speed or high memory usage.

1334233852 commented 1 year ago

Regarding the 384x384 patches, I have seen them in the STM paper before. Do you mean I should make the patches somewhat bigger? If so, what size is appropriate? Patches of size 384 are too small, so one image gets divided into many patches, which leads to high memory usage, right? Thank you for your continued replies! For example, do all of the occurrences of 384 that I found with a global search need to be adjusted?

hkchengrex commented 1 year ago

I don't know what patch sizes are good. This is just a general recommendation. Surely you can just try the current model first.
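
As a rough illustration of why the crop size matters at higher resolutions, the sketch below assumes that "384" is the side length of a square training crop taken directly from the frame. The names `CROP_SIZE`, `random_crop_box`, and `covered_fraction` are hypothetical, and the real training pipeline may resize or augment frames before cropping, so treat the numbers as indicative only.

```python
import random

# Hypothetical constant standing in for the "384" found by a global search.
CROP_SIZE = 384


def random_crop_box(height, width, crop_size=CROP_SIZE, rng=random):
    """Pick a random crop_size x crop_size window inside a frame.

    Returns (top, left, bottom, right) in pixel coordinates.
    """
    if crop_size > height or crop_size > width:
        raise ValueError("crop size is larger than the frame")
    top = rng.randint(0, height - crop_size)    # randint is inclusive
    left = rng.randint(0, width - crop_size)
    return top, left, top + crop_size, left + crop_size


def covered_fraction(height, width, crop_size=CROP_SIZE):
    """Fraction of the frame area that a single square crop covers."""
    return (crop_size * crop_size) / (height * width)


if __name__ == "__main__":
    print(random_crop_box(1080, 1920))
    print(covered_fraction(480, 854))     # crop vs. a 480x854 frame
    print(covered_fraction(1080, 1920))   # crop vs. a 1920x1080 frame
```

Under these assumptions, a 384x384 crop covers roughly a third of a 480x854 frame but only about 7% of a 1920x1080 frame, so objects would appear much larger relative to the crop unless the crop size is increased.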

1334233852 commented 1 year ago

Thank you very, very much! Your replies were extremely helpful! I will experiment with it on my own!