```
conda create -n vidstyleode python=3.10
conda activate vidstyleode
pip install -r requirements.txt
```
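To verify the setup, a quick sanity check (this assumes PyTorch is among the pinned requirements, which the training code needs):

```
python -c "import torch; print(torch.__version__, torch.cuda.is_available())"
```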
Please refer to the RAVDESS and Fashion Dataset official websites for instructions on downloading the datasets used in the paper. You may also experiment with your own dataset. The datasets should be arranged in the following structure:
```
Folder1
    Video_1.mp4
    Video_2.mp4
    ...
Folder2
    Video_1.mp4
    Video_2.mp4
    ...
```
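If you prepare your own dataset, here is a minimal sketch for checking this layout before training (the root path is a placeholder, and only `.mp4` files are assumed, as in the structure above):

```python
from pathlib import Path

root = Path("<path-to-video-directory>")  # dataset root, one sub-folder per video group
for folder in sorted(p for p in root.iterdir() if p.is_dir()):
    videos = sorted(folder.glob("*.mp4"))
    assert videos, f"{folder} contains no .mp4 files"
    print(f"{folder.name}: {len(videos)} videos")
```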
It is recommended to extract the frames of the videos for easier training. To extract the frames, please run the following command:
```
python scripts/extract_video_frames.py \
    --source_directory <path-to-video-directory> \
    --target_directory <path-to-output-target-directory>
```
The output folder will have the following structure:

```
Folder1_1
    000.png
    001.png
    ...
Folder1_2
    000.png
    001.png
    ...
```
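If you would rather extract frames with your own tooling, the sketch below reproduces the same layout with OpenCV. It is an illustrative stand-in, not the actual logic of `scripts/extract_video_frames.py`; the zero-padded `.png` naming is assumed from the structure above:

```python
import cv2
from pathlib import Path

def extract_frames(video_path: Path, target_dir: Path) -> None:
    """Dump every frame of video_path into target_dir as 000.png, 001.png, ..."""
    target_dir.mkdir(parents=True, exist_ok=True)
    cap = cv2.VideoCapture(str(video_path))
    idx = 0
    while True:
        ok, frame = cap.read()
        if not ok:  # end of stream
            break
        cv2.imwrite(str(target_dir / f"{idx:03d}.png"), frame)
        idx += 1
    cap.release()

# Example: frames of Folder1/Video_1.mp4 land in Folder1_1
extract_frames(Path("Folder1/Video_1.mp4"), Path("frames/Folder1_1"))
```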
Training additionally requires the StyleGAN W+ inversion of every video frame. Each inversion should be a latent code of shape `1 x 18 x hidden_dims`, saved as a `.pt` file, and the files should be arranged in a structure similar to the video frames:
```
Folder1_1
    000.pt
    001.pt
    ...
Folder1_2
    000.pt
    001.pt
    ...
```
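The latents can come from any StyleGAN W+ inversion method (e.g., e4e). Below is a minimal sketch of writing them in the expected layout; `invert` is a hypothetical stand-in for your inversion model, and `hidden_dims = 512` is only an assumption:

```python
import torch
from pathlib import Path

def invert(frame_path: Path) -> torch.Tensor:
    # Placeholder: substitute your StyleGAN inversion (e.g., e4e) here.
    hidden_dims = 512  # assumption; use your generator's latent width
    return torch.randn(1, 18, hidden_dims)

frames_dir = Path("frames/Folder1_1")
inversion_dir = Path("inversions/Folder1_1")
inversion_dir.mkdir(parents=True, exist_ok=True)

for frame in sorted(frames_dir.glob("*.png")):
    latent = invert(frame)  # expected shape: 1 x 18 x hidden_dims
    torch.save(latent, inversion_dir / f"{frame.stem}.pt")
```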
To enable style editing, you need to provide a textual description for each training video. Please store these descriptions in a file named `text_descriptions.txt` within the corresponding video frames folder. For example:
```
Folder1_1
    000.pt
    001.pt
    ...
    text_descriptions.txt
```
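Before training, it may help to confirm that every frames folder carries a description. A minimal sketch, reusing the file name above (the root path is a placeholder):

```python
from pathlib import Path

frames_root = Path("<path-to-frames-root>")
for folder in sorted(p for p in frames_root.iterdir() if p.is_dir()):
    desc_file = folder / "text_descriptions.txt"
    assert desc_file.exists(), f"missing description for {folder.name}"
    print(folder.name, "->", desc_file.read_text().strip())
```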
Prepare `.txt` files containing the video folder names for the training and validation splits. Then prepare a `.yaml` configuration file where you specify the video frames directory under `img_root`, the W+ inversion folder under `inversion_root`, and the training and validation `.txt` files under `video_list`.
Our config files for the RAVDESS and Fashion Dataset are provided under the configs folder.
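For orientation, the relevant fields look roughly like the sketch below. The nesting here is a guess; the provided files under `configs` are the authoritative reference:

```yaml
data:
  params:
    img_root: /path/to/extracted/frames        # video frames directory
    inversion_root: /path/to/wplus/inversions  # W+ latent .pt files
    video_list: /path/to/train_videos.txt      # use the validation .txt for the val split
```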
To start the training, run the following command:
```
python main.py --name <tag-for-your-experiment> \
    --base <path-to-config-file>
```
To resume the training, run the following command, where `--resume` accepts either a log directory or a checkpoint path:

```
python main.py --name <tag-for-your-experiment> \
    --base <path-to-config-file> \
    --resume <path-to-log-directory-or-checkpoint>
```
By default, training checkpoints and figures are logged under the `logs` folder as well as to wandb. Therefore, please log in to wandb by running:

```
wandb login
```
To generate image animation results using the motion from a driving video, run the following script (driving videos are chosen randomly):
```
python scripts/image_animation.py \
    --model_dir <log-dir-to-pretrained-model> \
    --n_samples <number-of-samples-to-generate> \
    --output_dir <path-to-save-dir> \
    --n_frames <num-of-frames-to-generate-per-video> \
    --spv <num-of-driving-videos-per-sample> \
    --video_list <txt-file-of-possible-target-videos> \
    --img_root <path-to-videos-root-dir> \
    --inversion_root <path-to-frames-inversion-root-dir>
```
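For example, a run that animates 5 samples, each driven by 2 randomly chosen videos, might look like this (all paths are placeholders for your own setup):

```
python scripts/image_animation.py \
    --model_dir logs/<tag-for-your-experiment> \
    --n_samples 5 \
    --output_dir results/animation \
    --n_frames 16 \
    --spv 2 \
    --video_list val_videos.txt \
    --img_root data/frames \
    --inversion_root data/inversions
```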
Instructions will be added later.
If you find this paper useful in your research, please consider citing:
```bibtex
@misc{ali2023vidstyleodedisentangledvideoediting,
  title={VidStyleODE: Disentangled Video Editing via StyleGAN and NeuralODEs},
  author={Moayed Haji Ali and Andrew Bond and Tolga Birdal and Duygu Ceylan and Levent Karacan and Erkut Erdem and Aykut Erdem},
  year={2023},
  eprint={2304.06020},
  archivePrefix={arXiv},
  primaryClass={cs.CV},
  url={https://arxiv.org/abs/2304.06020},
}
```