De-synchronized frames after quantization and decoding

haochen-rye / HNeRV

Official Pytorch implementation for HNeRV: a hybrid video neural representation (CVPR 2023)


I tried encoding and decoding a video with the reference software, and in the generated comparison images the original frames and the quantized decoded frames are not synchronized. The same happens when decoding the 'bunny' video with the provided weights. This is the comparison image for the first frame, named "pred_0000_13.83.png":

[image: pred_0000_13.83.png]

I have run the following command, which is the one reported in the README:

python train_nerv_all.py --outf 1120 --data_path data/bunny --vid bunny --conv_type convnext pshuffel --act gelu --norm none --crop_list 640_1280 --resize_list -1 --loss L2 --enc_strds 5 4 4 2 2 --enc_dim 64_16 --dec_strds 5 4 4 2 2 --ks 0_1_5 --reduce 1.2 --modelsize 1.5 -e 300 --eval_freq 30 --lower_width 12 -b 2 --lr 0.001 --eval_only --weight checkpoints/hnerv-1.5m-e300.pth --quant_model_bit 8 --quant_embed_bit 6 --dump_images --dump_videos

The GIF file is not synchronized either. The problem does not seem to affect the unquantized predictions. What could be causing this? I installed the required dependencies using the provided file.

Hardware specifications:

GPU: Tesla K80, Driver Version: 470.141.03, CUDA Version: 11.4

For video decoding, we run two models (the un-quantized one and the quantized one) on full_dataloader. For the quantized model, we use the de-quantized frame embed obtained from the quantized one: https://github.com/haochen-rye/HNeRV/blob/4872129c8d004a25477e0c1ffbbff4ba71943ad5/train_nerv_all.py#L388

Since we shuffled the frames in full_dataloader, the resulting de-quantized frame embed (computed via the un-quantized model) was shuffled as well: https://github.com/haochen-rye/HNeRV/blob/c13d72b7d4e0bad4a11418c15b9a024a013d5109/train_nerv_all.py#L156

The frames decoded by the quantized model (whose input is the de-quantized embed from the un-quantized model) were therefore shuffled too: https://github.com/haochen-rye/HNeRV/blob/4872129c8d004a25477e0c1ffbbff4ba71943ad5/train_nerv_all.py#L404

We have now fixed the frame order for full_dataloader, so it should work correctly: https://github.com/haochen-rye/HNeRV/blob/4872129c8d004a25477e0c1ffbbff4ba71943ad5/train_nerv_all.py#L156
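The mechanism described in this reply can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the repository's code: `frame_order` and `count_desynced_frames` are hypothetical names, the "embedding" is just the frame index, and the sketch assumes that embeddings stored during the first pass are reused by loader position during the second pass.

```python
import random

def frame_order(num_frames, shuffle, rng):
    """Return the order in which a loader would visit the frame indices."""
    order = list(range(num_frames))
    if shuffle:
        # A shuffling loader draws a new permutation on every pass.
        rng.shuffle(order)
    return order

def count_desynced_frames(num_frames, shuffle, seed=0):
    """Count positions where the reused embedding belongs to a different frame."""
    rng = random.Random(seed)

    # Pass 1 (stand-in for the un-quantized model): store one embedding per
    # loader position. The toy embedding is simply the frame index.
    stored_embeds = list(frame_order(num_frames, shuffle, rng))

    # Pass 2 (stand-in for the quantized model): reuse the embedding stored at
    # the same position and compare it with the frame yielded on this pass.
    second_pass = frame_order(num_frames, shuffle, rng)
    return sum(
        1 for pos, frame_idx in enumerate(second_pass)
        if stored_embeds[pos] != frame_idx
    )

if __name__ == "__main__":
    print("shuffle=False:", count_desynced_frames(100, shuffle=False))
    print("shuffle=True: ", count_desynced_frames(100, shuffle=True))
```

With a fixed frame order the two passes line up and the count is zero. With shuffling, the two passes visit frames in different orders, so most positions pair an embedding with the wrong ground-truth frame, which matches the symptom reported above.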


De-synchronized frames after quantization and decoding #5