Eliminating Warping Shakes for Unsupervised Online Video Stitching
We have released the complete code of StabStitch++ (an extension of StabStitch) with better alignment, fewer distortions, and higher stability. It contains the code for training, inference, and multi-video stitching.
This is the official implementation for StabStitch (ECCV2024).
Lang Nie1, Chunyu Lin1, Kang Liao2, Yun Zhang3, Shuaicheng Liu4, Rui Ai5, Yao Zhao1
1 Beijing Jiaotong University {nielang, cylin, yzhao}@bjtu.edu.cn
2 Nanyang Technological University
3 Communication University of Zhejiang
4 University of Electronic Science and Technology of China
5 HAMO.AI
Feature
Nowadays, videos captured by hand-held cameras are typically stable thanks to the advancement and widespread adoption of video stabilization in both hardware and software. Under these circumstances, we retarget video stitching to an emerging issue, warping shake, which describes the undesired content instability in non-overlapping regions, especially when image stitching technology is applied directly to videos. To address it, we propose the first unsupervised online video stitching framework, named StabStitch, which generates stitching trajectories and smooths them. The figure above shows the occurrence and elimination of warping shakes.
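As a rough intuition for the smoothing step, the sketch below low-pass filters warp trajectories over time with an off-the-shelf Gaussian filter. This is a minimal illustration, not the paper's method (StabStitch learns the smoothed warps with an unsupervised network), and the array layout here is an assumption:

```python
import numpy as np
from scipy.ndimage import gaussian_filter1d

def smooth_trajectories(traj, sigma=3.0):
    """Low-pass filter warp trajectories along the temporal axis.

    traj: (T, H, W, 2) array of per-frame mesh-vertex motions
          (a hypothetical layout, not the repository's actual format).
    """
    # Filter only along axis 0 (time), leaving the spatial layout untouched.
    return gaussian_filter1d(traj, sigma=sigma, axis=0)
```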
Video
Here, we provide a video (released on YouTube) showing the stitched results from StabStitch and other solutions.
The details of the dataset can be found in our paper. (arXiv)
The dataset is available at Google Drive or Baidu Cloud (extraction code: 1234).
We implement StabStitch on a single RTX4090Ti GPU. Refer to environment.yml for more details.
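If you use conda, the environment can typically be created from that file with standard conda usage (the environment name is defined inside environment.yml itself):

conda env create -f environment.yml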
The pre-trained models (spatial_warp.pth, temporal_warp.pth, and smooth_warp.pth) are available at Google Drive or Baidu Cloud (extraction code: 1234). Please download them and put them in the 'model' folder.
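As a quick sanity check that the downloads are in place, a minimal PyTorch snippet like the one below can load the three checkpoints. The relative paths assume you run it from the repository root, and the state-dict structure depends on the network definitions in the code:

```python
import torch

# Verify that the three pre-trained checkpoints load correctly.
# Paths assume the 'model' folder sits at the repository root.
for name in ['spatial_warp.pth', 'temporal_warp.pth', 'smooth_warp.pth']:
    state = torch.load('model/' + name, map_location='cpu')
    print(name, 'loaded,', len(state), 'top-level entries')
```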
Modify the test_path in Codes/test_online.py and run:
python test_online.py
Then, a folder named 'result' will be created automatically to store the stitched videos.
For the TPS warping function, we provide two modes to warp the frames. You can change the mode here.
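For illustration only, such a mode switch typically looks like the snippet below. The flag names and their semantics here are hypothetical; check the warping code in the repository for the real option:

```python
# Hypothetical flag names -- the real ones live in the repository's warping code.
FAST_MODE = 'fast'      # e.g., trades some quality/memory for speed (assumption)
NORMAL_MODE = 'normal'  # e.g., the default, higher-quality warping (assumption)

mode = NORMAL_MODE  # change the mode here
```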
Modify the test_path in Codes/test_metric.py and run:
python test_metric.py
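For reference, alignment quality in stitching is commonly scored with PSNR/SSIM restricted to the overlapping region. The sketch below shows a masked PSNR in that spirit; it is an illustration, not the exact metric code in test_metric.py:

```python
import numpy as np

def masked_psnr(img1, img2, mask):
    """PSNR between two uint8 images, restricted to a binary overlap mask."""
    diff = (img1.astype(np.float64) - img2.astype(np.float64)) ** 2
    mse = diff[mask > 0].mean()  # average only over overlapping pixels
    if mse == 0:
        return float('inf')
    return 10.0 * np.log10(255.0 ** 2 / mse)
```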
To test model generalization, we adopt the pre-trained model (trained on the StabStitch-D dataset) to conduct tests on traditional video stitching datasets. Surprisingly, performance severely degrades, with obvious distortions and artifacts, as illustrated in Figure (a) below. To further validate this, we collect additional video pairs from traditional video stitching datasets (over 30 video pairs) and retrain our model on the new dataset. As shown in Figure (b) below, the retrained model works well on the new dataset but fails to produce natural stitched videos on the StabStitch-D dataset.
We find that the performance degradation mainly occurs in the spatial warp model. Without accurate spatial warps, the subsequent smoothing process amplifies the distortion.
This raises the question of how to ensure generalization in learning-based stitching models. A simple and intuitive idea is to establish a large-scale real-world stitching benchmark dataset covering various complex scenes, which should benefit the generalization of various stitching networks. Another idea is to apply continual learning to stitching, enabling the network to work robustly across datasets with different distributions.
These are just a few simple proposals. We hope the intelligent minds in this field can help solve this problem and contribute to the advancement of the field. If you have ideas and want to discuss them with me, please feel free to drop me an email. I am open to any kind of collaboration.
If you have any questions about this project, please feel free to drop me an email.
NIE Lang -- nielang@bjtu.edu.cn
@article{nie2024eliminating,
title={Eliminating Warping Shakes for Unsupervised Online Video Stitching},
author={Nie, Lang and Lin, Chunyu and Liao, Kang and Zhang, Yun and Liu, Shuaicheng and Zhao, Yao},
journal={arXiv preprint arXiv:2403.06378},
year={2024}
}