m-tassano / fastdvdnet

FastDVDnet: A Very Fast Deep Video Denoising algorithm
MIT License

Can you help me to reproduce your result? #13

Open · dustlrdk opened 4 years ago

dustlrdk commented 4 years ago

I'm trying to reproduce your results, and two questions came up in the process.

  1. How do I build "Set8"? I took the Derf video files from the repository below; they have the same resolution (960×540) as used in your paper.

https://github.com/cmla/derf-hd-dataset

And I used ffmpeg 4 to extract frames from the videos:

```
ffmpeg -i input.mp4 out%d.png
```

But your weight file ("model.pth") produces better results on my Set8 than those reported in your paper.

Is there anything wrong with my setup?

  2. My training results are worse than those of your weight file. I used your training files uploaded on Dropbox, and set batch_size 96 / max_number_patches 384000 / noise_ival [5, 50]; the rest were the default settings. I trained on three RTX 2080 GPUs.

The result on our Set8 with sigma 50: yours 30.36 dB / our best 29.96 dB.

So... can you help me figure out how to reproduce your results?

dustlrdk commented 4 years ago

I fixed some problems in my "Set8". Now I get 29.46 dB on "Set8" with your weights (the paper reports 29.53 dB, a 0.07 dB difference). I think this difference is reasonable because the noise seed is not fixed.

But my own trained weights still show 29.09 dB on the fixed Set8 (yours: 29.46 dB, a 0.37 dB gap). That is a non-trivial difference.

Do you have any tips for reproducing your result?

m-tassano commented 4 years ago

Hi dustlrdk, sorry for the late response, I've been a bit busy. I'm uploading the whole Set8 testset to the folder linked in the README so you can try running the algorithm on it.

By the way, the PSNR displayed in the paper is the average of the PSNRs of each sequence in the testset, where the PSNR of each sequence is itself the average of the PSNRs of all its frames (i.e., a macro average).
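Schematically, that macro average would look something like this (a sketch, not the actual metrics script; the result paths and log format are assumptions based on what test_fastdvdnet.py writes):

```sh
# Hypothetical sketch: the paper's number is the mean of the per-sequence
# PSNRs, where each per-sequence PSNR is already a mean over that
# sequence's frames.
for f in results/set8/sigma50/*/log.txt; do
    grep PSNR "$f" | cut -f9-9 -d" " | cut -f1 -d"d"
done | awk '{ sum += $1; n++ } END { printf "macro-average PSNR: %.4f dB\n", sum / n }'
```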

As for your second question: as far as I understand, you trained the model yourself and got different results. As you know, training requires mp4 files of the DAVIS sequences. When converting the sequences, one has to pay particular attention to the 'crf' and 'keyint' ffmpeg parameters to avoid strong compression. For the code to convert the image sequences, see this gist: https://gist.github.com/m-tassano/0536391eb79d63864e5005ea4da88243
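For illustration only, a conversion along those lines might look like this (the exact parameter values used for the paper are the ones in the gist; the values below are placeholders):

```sh
# Hedged example: encode a DAVIS image sequence to mp4 with weak compression.
# A low crf means high quality; a large keyint avoids frequent keyframes.
ffmpeg -f image2 -framerate 30 -i %05d.jpg \
    -c:v libx264 -crf 15 -x264-params keyint=250 \
    sequence.mp4
```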

hope this helps

dustlrdk commented 4 years ago

Thank you for your sincere reply.

I re-ran the test code with the uploaded data and got 29.4586 dB (your paper: 29.53 dB). (You said you used "macro averages", didn't you?)

```sh
dir=./data/$1

# run the denoiser on every sequence folder in the test set
for f in "$dir"/*; do
    i=${f##*/}
    CUDA_VISIBLE_DEVICES=$3 python test_fastdvdnet.py --test_path "$f" \
        --noise_sigma $2 --save_path results/$1/sigma$2/$i/ --max_num_fr_per_seq 85
done

# extract each sequence's PSNR from its log
for f in "$dir"/*; do
    i=${f##*/}
    cat results/$1/sigma$2/$i/log.txt | grep PSNR | cut -f9-9 -d" " | cut -f1 -d"d"
done
```
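(Here $1 is the test-set folder name under ./data, $2 is the noise sigma, and $3 is the GPU id, so the invocation would look something like `bash test_set8.sh set8 50 0`, with whatever name the script is saved under.)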

Can you try this shell script with your code and weights?

I got this.

30.2415 26.7901 26.9022 28.791 28.3301 33.0974 31.9209 29.5933

Average: 29.4583125

Also, I used the mp4 files in the training folder you uploaded on Dropbox. Do I have to make new mp4 training files instead?

djmth commented 4 years ago

I used the training mp4 files from your Dropbox and set all the parameters as mentioned in your paper, using your training code. I have trained at least ten models, but none of them achieves the performance of your provided model or the results in your paper. The Set8 results with sigma 50 are as follows (the format is: test sequence name: PSNR of the provided model.pth vs PSNR of my trained model):

djmth commented 4 years ago

Also, using my own training mp4 files gives better results than the mp4 files downloaded from your Dropbox. I produce the mp4 files with "ffmpeg -f image2 -i %05d.jpg xxx.mp4".

Syzygianinfern0 commented 3 years ago

@djmth Has anyone reproduced the results?

Our results seem quite similar. I too have abnormally low performance on sigma=10 (even worse than yours).

| Model name | Noise level | DAVIS test | Set8 |
| --- | --- | --- | --- |
| Author's paper | 10 | 38.71 | 36.44 |
|  | 20 | 35.77 | 33.43 |
|  | 30 | 34.04 | 31.68 |
|  | 40 | 32.82 | 30.46 |
|  | 50 | 31.86 | 29.53 |
| Author's weights | 10 | 39.2439 | 36.4054 |
|  | 20 | 36.1022 | 33.3794 |
|  | 30 | 34.2952 | 31.6226 |
|  | 40 | 33.0215 | 30.3977 |
|  | 50 | 32.0329 | 29.4577 |
| My training | 10 | 30.2144 👈 | 26.9875 👈 |
|  | 20 | 35.6938 | 33.0682 |
|  | 30 | 33.8302 | 31.2819 |
|  | 40 | 32.5281 | 30.0381 |
|  | 50 | 31.5278 | 29.0777 |

@m-tassano

  1. The training dataset provided in your Dropbox is actually incomplete (only 86 of the 90 sequences are present). I recreated the training dataset by downloading from DAVIS's official site and then using your gist with the default parameters.
  2. My Set8 is the same as your Dropbox link (4 folders from GoPro and 4 from Derf).

I am unable to reproduce your results, and this seems to be a prevalent issue. If possible, could you, as the author, try to confirm this by recreating your experiments from this repo? Maybe there is indeed a bug somewhere in this repo.

m-tassano commented 3 years ago

Hi,

I agree that those PSNRs look abnormal. This performance might come from a training run that did not converge well. Are those the results of only one training run, or do you see those figures repeating across different experiments on your side?

Those 4 sequences were removed because their resolution differs from the rest of the sequences (shooting: 1152x480, disc-jockey: 1138x480, cat-girl: 911x480, bike-packing: 910x480). The reason is that DALI does not support inputs with different resolutions, or at least the DALI version used when I first uploaded the project did not. That is, the provided training dataset is correct and is the one used to train the provided weights.

Another factor which might introduce variations is the way you compute the PSNR of each frame and how you accumulate the scores of all frames of all sequences. You can find the script I used to compute the metrics and the tables in the paper [here](https://www.dropbox.com/s/wei8uhym1i4viot/make_tables_allalgos.ipynb?dl=0). You are welcome to use the code with your results.

Lastly, I was recently forced to update the supported DALI version because the original version which was employed when the method was published is no longer supported by NVIDIA. I have not tested this new DALI version thoroughly and it might have an impact on the results. I will try to run a training in the following days if I can find the time.

Syzygianinfern0 commented 3 years ago

Hello. Thanks for your reply!

> Are those the results of only one training

My results are from a single experiment. However, since most results reported by others on this repo seem similar, I feel we can rule out the possibility of a single bad experiment.

> the provided training dataset is correct and the one used to train the provided weights

Thanks for this clarification! Oh by the way, the dataloader from the new DALI version was able to load these anomalous sequences as well!

> You can find the script I used to compute the metrics and the tables in the paper

Great. I shall try computing metrics using this script.

> I will try to run a training in the following days if I can find the time

Thank you very much. I really appreciate it.

Syzygianinfern0 commented 3 years ago

@m-tassano How were the run-times calculated? Was it similar to the macro averages in the case of the PSNRs (the average of the averages over each video sequence)?

https://github.com/m-tassano/fastdvdnet/blob/3afbd0d30f9171b368c80291522033182923f434/test_fastdvdnet.py#L117

If it was done some other way, would it be possible to share the script/notebook/method you used to calculate the run-times?

Edit: Also, which test set did you use to calculate run-times, Set8 or DAVIS-test?

gulzainali98 commented 3 years ago

I also have a similar problem, and my results seem quite similar to some of the results mentioned here, i.e., I got 31.257 dB PSNR on Set8 with noise = 30.

gulzainali98 commented 3 years ago

Even when I run the provided trained model, the results are not exactly the same: I get around 31.60 dB vs. the 31.68 dB mentioned in the paper.

kavita19 commented 3 years ago

Hello @m-tassano, I am trying to train the model with the same training mp4 files, but after 1 epoch it gives me a ZeroDivisionError. I put validation images in the validation folder, but it still gives this error. Can you help me solve this problem? I am using your updated release code.

File "train_fastdvdnet.py", line 212, in main(**vars(argspar)) File "train_fastdvdnet.py", line 147, in main trainimg=img_train File "/home/knuvi/Desktop/Kavita/fastdvdnet-0.1/train_common.py", line 130, in validate_and_log psnr_val /= len(dataset_val) ZeroDivisionError: division by zero

m-tassano commented 3 years ago

Hi @kavita19. From the error you sent, it seems the problem is that the length of the validation dataset is zero. That can happen, for example, if you pass an incorrect path to --valset_dir in train_fastdvdnet.py, so that when dataset_val is created in line 31 (dataset_val = ValDataset(valsetdir=args['valset_dir'], gray_mode=False)) no images are found. The --valset_dir argument needs to be set to the path of the folder containing the validation images.
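As a quick sanity check (a sketch, assuming the expected layout of one subfolder per validation sequence; the path below is hypothetical), you can list what the loader will see:

```sh
# If this prints nothing, ValDataset will be empty, so the validation PSNR
# average divides by zero, exactly as in the traceback above.
ls -d /path/to/valset_dir/*/
```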

kavita19 commented 3 years ago

@m-tassano @Syzygianinfern0 For validation images, I added 15 of my own images (size 960x540) to my validation dir, but when I run train_fastdvdnet.py I now get the problem below. Can you help me solve this? Also, I am a bit confused about the testing images (how many images per sequence does this model test?).

```
/home/knuvi/anaconda3/envs/fastdvdnet2/lib/python3.6/site-packages/nvidia/dali/plugin/base_iterator.py:162: Warning: Please set reader_name and don't set last_batch_padded and size manually whenever possible. This may lead, in some situations, to missing some samples or returning duplicated ones. Check the Sharding section of the documentation for more details.
  _iterator_deprecation_warning()
Traceback (most recent call last):
  File "train_fastdvdnet.py", line 212, in <module>
    main(**vars(argspar))
  File "train_fastdvdnet.py", line 38, in main
    temp_stride=3)
  File "/home/knuvi/Desktop/Kavita/fastdvdnet-0.1/dataloaders.py", line 110, in __init__
    auto_reset=True)
  File "/home/knuvi/anaconda3/envs/fastdvdnet2/lib/python3.6/site-packages/nvidia/dali/plugin/pytorch.py", line 183, in __init__
    self._first_batch = DALIGenericIterator.__next__(self)
  File "/home/knuvi/anaconda3/envs/fastdvdnet2/lib/python3.6/site-packages/nvidia/dali/plugin/pytorch.py", line 194, in __next__
    outputs = self._get_outputs()
  File "/home/knuvi/anaconda3/envs/fastdvdnet2/lib/python3.6/site-packages/nvidia/dali/plugin/base_iterator.py", line 255, in _get_outputs
    outputs.append(p.share_outputs())
  File "/home/knuvi/anaconda3/envs/fastdvdnet2/lib/python3.6/site-packages/nvidia/dali/pipeline.py", line 863, in share_outputs
    return self._pipe.ShareOutputs()
RuntimeError: Critical error in pipeline: Unknown error
Current pipeline object is no longer valid.
```

Syzygianinfern0 commented 3 years ago

> Current pipeline object is no longer valid.

I have had this error when my CUDA environment was not properly set up, but I am not sure that is the only cause. Make sure your installed CUDA version matches the DALI CUDA version.

Syzygianinfern0 commented 3 years ago

https://docs.nvidia.com/deeplearning/dali/user-guide/docs/installation.html#nvidia-dali

You can choose the DALI CUDA version you want to install here 👆
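For example (a sketch following that guide; check the guide for the package name matching your CUDA version):

```sh
# Hedged example: install a DALI build for CUDA 11.x from NVIDIA's index
pip install --extra-index-url https://developer.download.nvidia.com/compute/redist \
    nvidia-dali-cuda110
```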

kavita19 commented 3 years ago

@m-tassano @Syzygianinfern0 OK, my current CUDA version is 11.4, and I installed DALI for CUDA 11. Now I am getting a CUDA out-of-memory error. Do you have any solution for this? I tried reducing the batch size, but it doesn't work.

```
/home/knuvi/anaconda3/envs/fastdvdnet2/lib/python3.6/site-packages/nvidia/dali/plugin/base_iterator.py:161: Warning: Please set reader_name and don't set last_batch_padded and size manually whenever possible. This may lead, in some situations, to missing some samples or returning duplicated ones. Check the Sharding section of the documentation for more details.
  _iterator_deprecation_warning()
Traceback (most recent call last):
  File "train_fastdvdnet.py", line 212, in <module>
    main(**vars(argspar))
  File "train_fastdvdnet.py", line 38, in main
    temp_stride=3)
  File "/home/knuvi/Desktop/Kavita/fastdvdnet-0.1/dataloaders.py", line 110, in __init__
    auto_reset=True)
  File "/home/knuvi/anaconda3/envs/fastdvdnet2/lib/python3.6/site-packages/nvidia/dali/plugin/pytorch.py", line 190, in __init__
    self._first_batch = DALIGenericIterator.__next__(self)
  File "/home/knuvi/anaconda3/envs/fastdvdnet2/lib/python3.6/site-packages/nvidia/dali/plugin/pytorch.py", line 238, in __next__
    device=category_device[category])
RuntimeError: CUDA error: out of memory
```

Syzygianinfern0 commented 3 years ago

What is your GPU memory capacity? Is it out of memory even with a batch size of 1?

I remember being able to run the code even on my laptop GPU, which has just 4 GB.

kavita19 commented 3 years ago

@Syzygianinfern0 Thanks. My CUDA memory problem was solved after restarting.

How many validation images can we use? I converted a .yuv video to .png images (100 frames in total for validation). After setting up my environment I started training, but after one epoch I get a ZeroDivisionError.

```
[epoch 1][1981/2000] loss: 15.6695 PSNR_train: 0.0000
[epoch 1][1991/2000] loss: 15.7777 PSNR_train: 0.0000
Traceback (most recent call last):
  File "train_fastdvdnet.py", line 212, in <module>
    main(**vars(argspar))
  File "train_fastdvdnet.py", line 147, in main
    trainimg=img_train
  File "/home/knuvi/Desktop/Kavita/fastdvdnet-0.1/train_common.py", line 130, in validate_and_log
    psnr_val /= len(dataset_val)
ZeroDivisionError: division by zero
```


This is the command I am using for training, but I don't know why my validation images are not being found at the given path, which is why this error occurs.

```
(fastdvdnet2) knuvi@DGX-Station:~/Desktop/Kavita/fastdvdnet-0.1$ python train_fastdvdnet.py --batch_size 128 --epochs 80 --patch_size 96 --trainset_dir "/home/knuvi/Desktop/Kavita/fastdvdnet-0.1/img/train/" --valset_dir "/home/knuvi/Desktop/Kavita/fastdvdnet-0.1/img/basketball100/" --log_dir logs
```

Syzygianinfern0 commented 3 years ago

@kavita19

> .yuv video format to .png images

Is this some custom dataset?

You can first try running the code using the provided training and validation datasets. If that works, you can compare your custom dataset's structure against the provided validation dataset's structure.

By the way, the validation set directory should point to a folder containing multiple subfolders, each of which contains images with file names in sequential order.
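Hypothetically, the layout would look like this (folder and file names are just examples):

```
valset_dir/
├── sequence1/
│   ├── 00000.png
│   ├── 00001.png
│   └── ...
└── sequence2/
    ├── 00000.png
    ├── 00001.png
    └── ...
```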

kavita19 commented 3 years ago

@Syzygianinfern0 OK, which images did you use for the validation set? I tried the Derf 480p test set ("tractor"), but it gives me the same problem. Also, should the images be jpg or png? I saved my images in sequential order, but when I put them in the validation folder they appear shuffled.


Syzygianinfern0 commented 3 years ago

Are you using the command this way: --valset_dir img/?

Please refer to the last sentence here: https://github.com/m-tassano/fastdvdnet/issues/13#issuecomment-945406347

kavita19 commented 3 years ago

I am using --valset_dir "./home/knuvi/Desktop/Kavita/fastdvdnet-0.1/img/tractortest"

Is it wrong? Can you share the command for training that you used?

Syzygianinfern0 commented 3 years ago

Can you please try it as --valset_dir "./home/knuvi/Desktop/Kavita/fastdvdnet-0.1/img/"

kavita19 commented 3 years ago

Hey, sorry, I forgot to inform you: the above command works for me. Thank you so much!

kavita19 commented 3 years ago

@m-tassano In the FastDVDnet model (Model.py), the number of input frames is 5, and 3 frames at a time are passed through the denoising blocks. Similarly, if I want to take only 2 frames from the inputs, is that possible, and what changes are required in Model.py?

m-tassano commented 3 years ago

@kavita19 please create a new issue for a separate topic

kavita19 commented 2 years ago

@m-tassano @Syzygianinfern0 As I read in the paper, the reported runtime is 0.1 s to denoise a frame. But when I tested the snowboard sequence, the runtime was "Denoised 59 frames in 3.932s, loaded seq in 6.258s". Could you clarify whether the paper's runtime result is for one frame or for all frames? And which dataset did you use to calculate the runtime?

m-tassano commented 2 years ago

@kavita19 I'll try to answer your question as best I can, because I find it's not completely clear.

As mentioned in the paper, the reported time is the time it takes to denoise one color frame of resolution 960×540. So it doesn't really matter which dataset you use, as long as its frames have this resolution. I believe I used one of the sequences in the Set8 dataset, in any case.

The runtime estimation was computed as the total denoising time divided by the number of frames in the sequence; the same procedure was used for all the algorithms mentioned. Note that if you denoise a sequence with more frames, the runtime per frame will likely decrease, as there is a fixed overhead when loading the input sequence.
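For illustration, applying that to the numbers you quoted: 3.932 s / 59 frames ≈ 0.067 s per frame (excluding the time spent loading the sequence), which is consistent with the roughly 0.1 s per frame reported in the paper.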

Hope this helps

kavita19 commented 2 years ago

@m-tassano Thank you for your reply. Which dataset did you use for validation, and how many png images did you use?

m-tassano commented 2 years ago

Please note that the performance of the model trained with the old DALI version (the one originally used to train the weights shared on GitHub) has been reported to be superior to that obtained with the new DALI version. In other words, the new DALI version is linked to a drop in the model's performance. See https://github.com/m-tassano/fastdvdnet/issues/51 for more details.