NVIDIA / unsupervised-video-interpolation

Unsupervised Video Interpolation using Cycle Consistency

How to train the model on Slowflow? #7

Open pongkun opened 3 years ago

pongkun commented 3 years ago

The file format of the Slowflow dataset is 'tif', but the provided code cannot train on it directly.

fitsumreda commented 3 years ago

You need to post-process the Slowflow images. If I remember correctly, the TIFs are RAW images and need to be post-processed into an RGB format.
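For reference, a minimal sketch of what such a post-processing step could look like, assuming the TIFs are single-channel RGGB Bayer mosaics; the Bayer pattern, the 12-bit depth, and the function name are my assumptions, not something stated in the Slowflow documentation:

```python
import numpy as np

def demosaic_rggb(raw):
    """Very rough nearest-neighbor demosaic of an RGGB Bayer mosaic.
    raw: 2-D uint16 array of shape (H, W), with H and W even.
    Returns an (H//2, W//2, 3) uint8 RGB image at half resolution."""
    r  = raw[0::2, 0::2].astype(np.float64)   # red sites
    g1 = raw[0::2, 1::2].astype(np.float64)   # green sites (row 0)
    g2 = raw[1::2, 0::2].astype(np.float64)   # green sites (row 1)
    b  = raw[1::2, 1::2].astype(np.float64)   # blue sites
    rgb = np.stack([r, (g1 + g2) / 2.0, b], axis=-1)
    # map assumed 12-bit raw values into 8-bit range
    return np.clip(rgb / 4095.0 * 255.0, 0, 255).astype(np.uint8)

# toy example: a flat mid-gray mosaic maps to a flat gray RGB image
raw = np.full((4, 4), 2048, dtype=np.uint16)
rgb = demosaic_rggb(raw)
print(rgb.shape)  # (2, 2, 3)
```

In practice you would load each TIF (e.g. with an image library of your choice), run a proper demosaicing step plus white balance, and save the result as PNG/JPG frames that the training code can read.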

pongkun commented 3 years ago

Thanks. Can you tell me how to get the battlefield dataset?

fitsumreda commented 3 years ago

That dataset isn't publicly available due to IP.

pongkun commented 3 years ago

Which command should I use if I want to train my model on adobe30fps (not adobe240fps)? Thanks.

fitsumreda commented 3 years ago
# CC alone: Fully unsupervised training on Adobe30fps and evaluation on Val240fps dataset 
# There are no corresponding target scores for this run, as we did not run this experiment in the paper.

python3 -m torch.distributed.launch --nproc_per_node=16 train.py --model CycleHJSuperSloMo \
    --flow_scale 2.0 --batch_size 2 --crop_size 384 384 --print_freq 1 --dataset CycleVideoInterp \
    --step_size 1 --sample_rate 0 --num_interp 7 --val_num_interp 7 --skip_aug --save_freq 20 --start_epoch 0 \
    --train_file /path/to/Adobe30fps/train --val_file Val240fps/val --name unsupervised_adobe30fps --save /path/to/output 

# --nproc_per_node=16, we use a total of 16 V100 GPUs over two nodes.

pongkun commented 3 years ago

Thanks. Meanwhile, I would like to know the corresponding command for the first experiment (on UCF101) in the paper. [table screenshot]

fitsumreda commented 3 years ago

I am not sure which exact experiment you are referring to. The table above includes a supervised (baseline) training at high FPS and an unsupervised (proposed) training on adobe 30fps or battlefield 30fps. I think I shared the adobe-30fps command above. The same command can be used for battlefield.

pongkun commented 3 years ago

I'm sorry, I didn't make myself clear. Now, I want to know how to train on adobe-30 and eval on ucf101. Do I need to adjust the num_interp to 1 or keep it at 7 during training? How should I set the sample_rate and step_size?

fitsumreda commented 3 years ago

Right. You should adjust num_interp to 1, and check whether you need to set --val_crop_size, since UCF images are smaller than Adobe's.

pongkun commented 3 years ago

So I don't need to adjust the sample_rate and step_size? Just adjust the num_interp? Now that I have adobe240fps, should I set sample_rate to 8 to use it as adobe30fps?

fitsumreda commented 3 years ago
parser.add_argument('--sample_rate', type=int, default=1,
                    help='number of frames to skip when sampling input1, {intermediate}, and input2 \
                    (default=1, ie. we treat consecutive frames for input1 and intermediate, and input2 frames.)')
parser.add_argument('--step_size', type=int, default=-1, metavar="STEP_INTERP",
                    help='number of frames to skip from one mini-batch to the next mini-batch \
                    (default -1, means step_size = num_interp + 1)')
parser.add_argument('--num_interp', default=7, type=int, metavar="NUM_INTERP",
                    help='number intermediate frames to interpolate (default : 7)')

If your adobe30fps root folder contains exactly 30 frames-per-second, then you can set --sample_rate=0, --step_size=1, --num_interp=1, --val_num_interp=1.

The flags --step_size and --sample_rate are relevant if you want to subsample a high-FPS sequence (a root with a 240fps sequence) but only care to train at the low FPS (30fps); in that case, set --sample_rate=8 and keep the rest the same. You can also optionally set --step_size=8 to get faster training.
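To make the index arithmetic concrete, here is a hypothetical sketch of how --sample_rate, --step_size, and --num_interp could carve a 240fps frame list into 30fps training tuples; the function and variable names are my own and do not come from the repo's data loader:

```python
def make_tuples(num_frames, sample_rate=8, num_interp=1, step_size=None):
    """Yield (input1, intermediates, input2) frame-index tuples.

    sample_rate: frames skipped between sampled frames (0 acts like 1).
    num_interp:  intermediate frames per tuple.
    step_size:   frames advanced between tuples (None/-1 -> num_interp + 1).
    """
    stride = max(sample_rate, 1)          # --sample_rate=0 means consecutive
    span = stride * (num_interp + 1)      # total frames one tuple spans
    if step_size is None or step_size < 0:
        step_size = num_interp + 1        # repo default per the help text
    for start in range(0, num_frames - span, step_size):
        frames = list(range(start, start + span + 1, stride))
        yield frames[0], frames[1:-1], frames[-1]

# 33-frame 240fps clip, subsampled to 30fps (sample_rate=8, num_interp=1):
for tup in make_tuples(33, sample_rate=8, num_interp=1, step_size=8):
    print(tup)   # (0, [8], 16), (8, [16], 24), (16, [24], 32)
```

With --step_size=8 each tuple advances a full 30fps interval, so tuples overlap less and an epoch has fewer mini-batches, which is why that setting trains faster.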

pongkun commented 3 years ago

Thank you for the response, but I also have two questions.

  1. Now I have the sintel_1008fps dataset, and I want to know how to get the sintel_24fps dataset. If I extract 1 frame from every 42 frames, the frame count in the training set is not 551. Can you tell me how to get 551 frames?
  2. I used this command to eval the pre-trained model but got a bad PSNR:

python3 eval.py --model CycleHJSuperSloMo --num_interp 41 --flow_scale 2 \
    --val_file /home/data/VFI/sintel_1008fps_for_paper/clean_1008fps/test \
    --resume /home/wpk/VFI/code/pretrained_models/unsupervised_random2sintel.pth

[screenshot of eval output]

fitsumreda commented 3 years ago

Sintel Eval Dataset: For each sequence/folder, select the first 43 frames to form an eval dataset. You can then do an eval for 41-intermediate-frame synthesis.

Sintel Training Dataset: For each sequence/folder, take the remaining frames (those not used in the eval) to form the train dataset. You then have two options for how to consume this data. (A) Manually subsample the train sequence to 24fps by deleting the intermediate frames, then set --sample_rate=0. (B) Directly use the high-FPS train sequence and set --sample_rate=42, so it is subsampled at the desired FPS (24fps) when grabbing mini-batches.
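The eval/train split described above could be scripted roughly as follows; this is a sketch under the stated assumptions (first 43 frames per sequence go to eval, the rest to train), and the paths, constant, and function name are mine, not from the repo:

```python
import os
import shutil

EVAL_FRAMES = 43  # input1 + 41 intermediates + input2

def split_sequence(seq_dir, eval_root, train_root):
    """Copy the first EVAL_FRAMES frames of one sequence folder into
    eval_root/<seq>, and the remaining frames into train_root/<seq>."""
    frames = sorted(os.listdir(seq_dir))
    name = os.path.basename(seq_dir)
    for dst_root, subset in ((eval_root, frames[:EVAL_FRAMES]),
                             (train_root, frames[EVAL_FRAMES:])):
        dst = os.path.join(dst_root, name)
        os.makedirs(dst, exist_ok=True)
        for f in subset:
            shutil.copy(os.path.join(seq_dir, f), os.path.join(dst, f))
```

For option (A) you would additionally thin the train copy down to 24fps before training; for option (B) you would leave it at the full frame rate and rely on --sample_rate=42.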

pongkun commented 3 years ago

I built the Sintel eval dataset and tested the pre-trained model 'unsupervised_random2sintel.pth'. Is this result correct? PSNR=25.07, SSIM=0.786, IE=13.85

fitsumreda commented 3 years ago

You should get a PSNR close to 30. Look at the table (Adobe->Sintel with CC) in the paper.

pongkun commented 3 years ago

However, I tested your pre-trained models 'unsupervised_adobe2sintel.pth' and 'unsupervised_adobe+youtube2sintel.pth', and the results do not match the values in the paper. I am very puzzled.

unsupervised_adobe2sintel.pth: [screenshot of results]
unsupervised_adobe+youtube2sintel.pth: [screenshot of results]

Results in the paper: [table screenshots]

fitsumreda commented 3 years ago

OK, I will look into this. Where did you get the Sintel dataset?

pongkun commented 3 years ago

I got the Sintel dataset from http://www.cvlibs.net/projects/slow_flow/. [screenshots]

This is my test dataset. [screenshots]