Open moldach opened 3 years ago
load_videos.py already downloads videos that you can use for training. You don't need to run crop_vox.py in addition; it is only needed if you want to change the cropping strategy.
Hi @AliaksandrSiarohin thank you very much for your quick response.
Sorry, I was mistakenly thinking that you needed to change the crop-size of the videos in order to improve resolution.
However, looking at issue #20 in first-order-model:
- The only reliable method is to retrain on high-resolution videos.
So the videos downloaded with load_videos.py to train first-order-model are not high-resolution, correct?
This may tie into my later question (at the bottom) about generating the checkpoints.
Unfortunately, I ran into an issue with your initial suggestion # 2, to try Face-Image-Motion-Model.
So, next, I tried this workaround suggestion; however, I noticed that the mouth was not moving, as reported here:
- Since all the networks are fully convolutional, you can actually try to use the pretrained checkpoints, trained on 256x256 images. In order to do this, change the size in:
first-order-model/demo.py
source_image = resize(source_image, (1024, 1024))[..., :3]
driving_video = [resize(frame, (1024, 1024))[..., :3] for frame in driving_video]
config/vox-256.yaml
# Line 26 in 2ed57e0
scale_factor: 0.0625
# Line 38 in 2ed57e0
scale_factor: 0.0625
models/util.py
sigma = 1.5
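For reference, the numbers in the workaround above appear to follow from keeping the networks' internal working resolution unchanged. Here is a minimal sketch of that arithmetic. It assumes the pretrained vox-256 config uses scale_factor 0.25 at 256x256, and that models/util.py normally derives sigma as (1 / scale - 1) / 2. Both are assumptions about the repository, not something stated in this thread, and the helper names are hypothetical.

```python
# Assumed defaults of the pretrained 256x256 model (check your own config).
TRAINED_SIZE = 256
TRAINED_SCALE = 0.25


def scale_factor_for(target_size, trained_size=TRAINED_SIZE, trained_scale=TRAINED_SCALE):
    """Scale factor that keeps the downsampled resolution the same as in training."""
    internal_size = trained_size * trained_scale  # 64 pixels under the assumptions above
    return internal_size / target_size


def sigma_from_scale(scale):
    """Assumed formula for the anti-aliasing blur width derived from the scale."""
    return (1 / scale - 1) / 2


if __name__ == "__main__":
    for size in (256, 512, 1024):
        scale = scale_factor_for(size)
        print(size, scale, sigma_from_scale(scale))
```

Under these assumptions, 1024x1024 input gives 0.0625, which matches the value in the workaround. The derived sigma would grow from 1.5 to 7.5, which may be why the workaround pins sigma = 1.5, the value it would have at the training scale.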
So, given these results, I suppose I'm echoing the recent question from dreammonkey, who asked how to retrain 512x512 (or higher) checkpoints.
Has anyone retrained larger checkpoints? Is it worth it? And if so, how would I go about doing this?
Or do you think it's more beneficial to tackle # 2, my issue with Face-Image-Motion-Model, or to try some sort of upscaling (e.g. video2x, Topaz Video Enhance AI, etc.) on the 256x256 output?
load_videos.py crops the bboxes provided in vox-metadata.csv and then resizes the crops to 256x256. You can change this and resize to 512x512 during downloading.
I'm trying to run the preprocessing section of the script, but I'm running into issues with no discernible error to go on.
I was able to successfully run
python load_videos.py --metadata vox-metadata.csv --format .mp4 --out_folder vox --workers 8
which I see downloaded 18,334 .mp4 files to the vox/train subdirectory:
$ pwd
/scratch/moldach/my-thesis-project/vox/train
$ ls | head -n 1
id10001#7w0IBEWc9Qw#000993#001143.mp4
$ ls | wc -l
18334
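The file names look like they encode four '#'-separated fields. Here is a small sketch for splitting them, which can help when comparing the downloaded chunks against the annotation folders. It assumes the fields are person ID, YouTube video ID, start frame and end frame. That is a guess from the listing above and is not confirmed in this thread. parse_chunk_name is a hypothetical helper.

```python
from pathlib import Path
from typing import NamedTuple


class Chunk(NamedTuple):
    person_id: str  # assumed meaning, e.g. "id10001"
    video_id: str   # assumed to be the YouTube video ID
    start: int      # assumed start frame of the chunk
    end: int        # assumed end frame of the chunk


def parse_chunk_name(filename):
    """Split 'person#video#start#end.mp4' into its four fields."""
    parts = Path(filename).stem.split("#")
    if len(parts) != 4:
        raise ValueError(f"unexpected file name layout: {filename!r}")
    person_id, video_id, start, end = parts
    return Chunk(person_id, video_id, int(start), int(end))


if __name__ == "__main__":
    print(parse_chunk_name("id10001#7w0IBEWc9Qw#000993#001143.mp4"))
```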
I've also unzipped the vox1-annotations to txt/:
$ cd txt/
$ ls | head
id10001
id10002
However, running crop_vox.py is not generating any output files (should I be expecting cropped videos in the videos/ directory?):
Submission script
#!/bin/bash
#SBATCH --job-name=preprocess_bcri # Job name
#SBATCH --mail-type=END,FAIL # Mail events (NONE, BEGIN, END, FAIL, ALL)
#SBATCH --mail-user=moldach@ucalgary.ca # Where to send mail
#SBATCH --nodes=1
# Request a P100-16G GPU node on Cedar
## This has Four Tesla P100 16GB cards
#SBATCH --gres=gpu:p100l:4
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=24 # There are 24 CPU cores on P100 Cedar GPU nodes
#SBATCH --mem=0 # Request the full memory of the node
#SBATCH --time=01:00:00 # Time limit hrs:min:sec
#SBATCH --output=%x_%j.log # Standard output
#SBATCH --error=%x_%j.log # Standard error

pwd; hostname; date
source venv/bin/activate
echo "Running preprocessing (assuming 4 GPU, and 6 workers per GPU)"
python crop_vox.py --workers 24 --device_ids 0,1,2,3 --format .mp4 --dataset_version 1
date
preprocess_bcri_chantal_64208433.err
^M0it [00:00, ?it/s]
preprocess_bcri_chantal_64208433.log
/scratch/moldach/my-thesis-project
cdr903.int.cedar.computecanada.ca
Fri Mar 19 18:44:11 PDT 2021
Running preprocessing (assuming 4 GPU, and 6 workers per GPU)
Fri Mar 19 18:51:01 PDT 2021
Do you have any suggestions for debugging this?
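One way to start narrowing this down: the only line in the .err file, 0it [00:00, ?it/s], is what a tqdm progress bar shows before any item has finished, so it may be that the script found nothing to iterate over. Here is a minimal sketch that counts what is in each folder involved. The folder names are taken from this post. Which of them crop_vox.py actually reads or writes is an assumption to check against the script's argument defaults. count_entries is a hypothetical helper.

```python
import os


def count_entries(path):
    """Return the number of entries in a directory, or None if it does not exist."""
    if not os.path.isdir(path):
        return None
    return len(os.listdir(path))


if __name__ == "__main__":
    # Folder names as mentioned in the post; adjust to the paths the script is given.
    for folder in ("txt", "vox/train", "videos"):
        n = count_entries(folder)
        status = "missing" if n is None else f"{n} entries"
        print(f"{folder}: {status}")
```

Running this from the same working directory as the job (here /scratch/moldach/my-thesis-project) shows whether the input folders are visible from there and whether any output was written.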
May I ask how you managed to download the videos? I ran the same command,
python load_videos.py --metadata vox-metadata.csv --format .mp4 --out_folder vox --workers 8
but it only prints
257it [75:29:08, 966.54s/it] Can not load video 75sBThtNTdo, broken link
This keeps happening without any video being downloaded. What should I do to download the videos successfully? Could you tell me your method?