OndrejTexler / Few-Shot-Patch-Based-Training

The official implementation of our SIGGRAPH 2020 paper Interactive Video Stylization Using Few-Shot Patch-Based Training
607 stars 106 forks

Generating randomly distributed Gaussians for training #4

Open Albert-Miao opened 3 years ago

Albert-Miao commented 3 years ago

How are the randomly dotted and colored frames generated?

OndrejTexler commented 3 years ago

Hello. I did not have time to put the code to generate the gaussians together and push it to the repository.

But ... do you really need the gaussians? You can train (and do inference) without them. Gaussians add some more temporal stability at inference time, but they are not necessary, and it works quite well even without them.
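For intuition, a frame of randomly distributed colored Gaussians can be approximated in a few lines of numpy. This is only a rough sketch of the general idea, not the repository's tool; the dot count, sigma, and resolution below are made-up parameters:

```python
import numpy as np

def random_gaussian_frame(h, w, n_dots=300, sigma=4.0, seed=0):
    """Render n_dots randomly placed, randomly colored 2D Gaussian
    blobs onto a black (h, w, 3) canvas, clipped to [0, 1]."""
    rng = np.random.default_rng(seed)
    ys, xs = np.mgrid[0:h, 0:w]
    frame = np.zeros((h, w, 3), dtype=np.float32)
    for _ in range(n_dots):
        cy, cx = rng.uniform(0, h), rng.uniform(0, w)
        color = rng.uniform(0, 1, size=3)
        blob = np.exp(-((ys - cy) ** 2 + (xs - cx) ** 2) / (2 * sigma ** 2))
        frame += blob[..., None] * color  # broadcast blob over RGB
    return np.clip(frame, 0.0, 1.0)

frame = random_gaussian_frame(64, 64)
```

The repository's actual generator additionally advects the dots along the optical flow to keep them attached to the scene, which a single-frame sketch like this does not capture.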

Albert-Miao commented 3 years ago

Sorry sorry, I don't need them, but I was curious if it existed in the repo and I was just missing something. Thanks!

OndrejTexler commented 3 years ago

I will keep this "issue" open ... and maybe one day, I will have time to polish and push the code to generate the Gaussians :-)

alessiapacca commented 3 years ago

hey @OndrejTexler! So I am working on the method without gaussian noise, but I would like it to be more temporally coherent. Do you have any suggestions for generating the gaussian noise? Is there a chance you could upload the code for it?

Thanks!

OndrejTexler commented 3 years ago

Hello @alessiapacca and @Albert-Miao. I just updated the repo with the tools/scripts to generate gaussians, and tools/scripts to do time-aware bilateral filtering of the input sequence.

Also, I added one large chapter to the Readme.md, Temporal Consistency [Optional], which discusses why the stylized results might be temporally unstable and how to use the gaussians and bilateral pre-filtering tools/scripts to suppress flickering.

I hope you will find it useful. Let me know whether you were able to build/run bilateral filtering and generate the gaussians ... and whether it solved your temporal inconsistency problem!

alessiapacca commented 3 years ago

@OndrejTexler thank you so much for taking time to share it! I really appreciate it.

I will now try it and update you if I find any difficulty when using a Mac instead of Windows.

alessiapacca commented 3 years ago

@OndrejTexler I have a question about the gaussians: does it make any sense to use the gaussians if I have only 1 training image? I mean, the gaussian generation uses the optical flow, and I don't have any optical flow if I use only 1 training image. However, I checked the Zuzka1_train folder, and I see that in that case you are using only 1 training image but still using the gaussians.

OndrejTexler commented 3 years ago

Good question, @alessiapacca. You can use gaussians completely regardless of the number of training frames you are using; the only condition is that you have the entire video sequence upfront (meaning it is not straightforward to use the gaussians in a live use-case scenario).

Let's say you have a sequence 300 frames long. First, compute the optical flow: it is computed between every two consecutive frames of the input sequence (not between multiple training samples), and it is saved for both directions, forward and backward. Once computed, you will have 299 .A2V2f files in flow_fwd and 299 .A2V2f files in flow_bwd.

Next, generate the gaussians. As a mask, you can use the same frames you are using as training samples, but you do not have to (note, "mask" is probably not a good name for it; it is essentially just a leading frame that tells the tool where to "reset" the gaussians). The best practice might be the following: use the first image of your sequence as a mask, generate the gaussians, and see how the gaussian sequence looks. If there are large black holes, e.g., larger than 80x80 or 100x100 px, it means you should probably provide more masks (leading frames where the gaussians will be reset). In that case, try to add one more mask image, maybe the last frame of your sequence, delete the previous gaussian sequence, and generate a new one.

In the end, you will end up with one gaussian sequence, and that is the one you will use. Place the gaussian sequence next to the input folder in the _gen folder. If you have just one training sample, for instance, frame number 120.png, take the frame 120.png from the gaussian sequence and place it in the corresponding folder in the _train folder. If you have multiple training samples, take multiple images from the gaussian sequence folder. I hope it makes sense :-)
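The bookkeeping described above can be sketched in a few lines; this is only an illustration of the counts and the frame-matching rule, not code from the repository, and the folder name is a placeholder:

```python
from pathlib import Path

def expected_flow_count(num_frames):
    """Optical flow is computed between every two consecutive frames,
    so an N-frame sequence yields N-1 flow files per direction."""
    return num_frames - 1

def gauss_frames_for_training(train_frame_names, gauss_dir):
    """For each training frame (e.g. '120.png'), pick the frame with the
    same name from the generated gaussian sequence; these are the files
    to copy into the matching _train folder."""
    return [Path(gauss_dir) / name for name in train_frame_names]

# A 300-frame sequence gives 299 .A2V2f files in flow_fwd and 299 in flow_bwd.
n_flows = expected_flow_count(300)

# With a single training sample 120.png, take 120.png from the gaussian sequence.
picks = gauss_frames_for_training(["120.png"], "myseq_gen/gauss")
```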

Anyway, gaussians can help the temporal stability of the resulting video, but they might introduce some new artifacts. On the other hand, the time-aware bilateral filtering is not that powerful in reducing temporal noise, but it will not cause any new artifacts. So the best practice would be to first try time-aware bilateral filtering, and if it does not help (or helps too little), combine bilateral filtering and gaussians.

alessiapacca commented 3 years ago

hey @OndrejTexler, in the end I was able to make it work on Linux, with some small changes here and there :) One question I have though: the paper mentions that temporal coherency is taken into consideration implicitly. What are the reasons behind this? I also read the cited paper [Futschik et al. 2019], but I could not find a detailed explanation (the only thing I found was "our technique handles temporal coherency implicitly (see accompanying video demo)"). Do you have any suggestions?

OndrejTexler commented 3 years ago

Hello @alessiapacca, sorry for the somewhat late reply. I am glad you were able to run it on Linux.

Yes, we should have elaborated on this a bit more in the paper. The way the network is designed and trained encourages it to implicitly output coherent frames (if the input frames are coherent). What I mean is that if you pass two similar input frames through the network, and these two frames differ in "something", the output frames will also be similar to each other and differ in a similar "something". In other words, this is a feature of this kind of neural network: it translates from a continuous input space to a continuous output space. Hmm ... my explanation is rather vague, but I hope you get what I am trying to say :-)

MazeRuiZhang commented 3 years ago

Hi @alessiapacca ,

I was following this thread and got many insights. Thank you for your good questions here and also @OndrejTexler 's detailed response!

I switched from Windows to Linux recently. I managed to keep the python env and code the same during the transition, and the training now runs smoothly on the Linux machine. But the results on Linux are very different from the ones on Windows: both the trained model.pth and the visual results differ. (The generating model works fine, as it still outputs nice results if I feed it a model.pth trained on Windows.)

Have you run into similar problems when moving the code to Linux? And could I ask what your small changes were like? I spent 3 days trying to figure out the reasons, but still no luck. Same code, same conda env, same CUDA version, but a different OS (Linux -> Windows) and a different GPU (Tesla T4 -> GTX 2070).

Thank you so much!

Ray

OndrejTexler commented 3 years ago

Hello Ray!

Again, sorry for the late reply. Were you able to figure this problem out?

That is certainly weird. The python part of the code should be multiplatform; in fact, I ran the training on several GPUs on both Windows and Linux and did not have any problems. There might be problems when you change the torch version, but since you managed to keep the same torch version, this should not be an issue (and you were able to use the Windows model on Linux, so the torch versions should really be OK).

Apart from that, I do not know what good advice to give you ... do some debugging, maybe starting with image loading, etc.

Anyway, let me know!

alessiapacca commented 3 years ago

hey @MazeRuiZhang! Sorry, I did not read your message before. I have never run into any problems with training while working with the net, but I have never tried it on Windows. About the "small changes", I was referring mainly to the paths, which are hardcoded in Windows notation. I just used os.path.join() on every hardcoded line.

MazeRuiZhang commented 3 years ago

Hi @alessiapacca! Thank you for your advice! I finally switched back to Windows, as the problem is like a phantom that is very hard to debug. No worries!

janismdhanbad commented 3 years ago

Hi all, thanks for the helpful discussion. @alessiapacca, would it be possible to give some hints on writing the files to build the executables for disflow, bilateralAdv, and gauss? I don't have much experience with scripting and would really appreciate it. Thanks!

alessiapacca commented 3 years ago

@janismdhanbad Hey! I just followed the instructions in the readme. In case you are running it on Linux, you have to change the paths in the scripts, since they are hardcoded in Windows notation. I just replaced them with os.path.join() and then they worked. Let me know if you have specific doubts, so I can see if I can help you.
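For example, a hardcoded Windows-style path can be replaced like this (the path below is illustrative, not one taken from the repo):

```python
import os

# Hardcoded Windows notation, as found in the original scripts (illustrative):
win_path = "data\\Zuzka1_train\\input"

# Portable equivalent that works on both Windows and Linux:
portable = os.path.join("data", "Zuzka1_train", "input")
```

os.path.join picks the separator for the current OS, so the same script runs unmodified on both platforms.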

janismdhanbad commented 3 years ago

Thanks for the reply @alessiapacca. So, the way I am making the executables on Linux is by changing the .bat files into appropriate .sh files, but I am facing some issues because some commands from .bat do not work on Linux. Is that a valid approach? Thanks!

alessiapacca commented 3 years ago

@janismdhanbad Right now I don't have the code in front of me; I can have a look at it on Monday. Anyway, if I remember correctly, I just compiled the cpp files with g++ and obtained the .out files. Then you can run the .py scripts after properly changing the names and paths inside.

Let me know if you can do it this way. Otherwise I will have a closer look on Monday.
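The compile step described above roughly reduces to one g++ invocation per tool. A minimal sketch of how that could be scripted; the source and output file names are assumptions, and the actual run is left commented out since the sources are not present here:

```python
import subprocess

def compile_tool(src, out):
    """Build the g++ command to compile one of the C++ tools
    (e.g. disflow, bilateralAdv, gauss) into a Linux executable."""
    cmd = ["g++", "-O3", "-o", out, src]
    # subprocess.run(cmd, check=True)  # uncomment once src is available locally
    return cmd

cmd = compile_tool("gauss.cpp", "gauss.out")
```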

janismdhanbad commented 3 years ago

Thanks @alessiapacca, that makes sense. I will try it and get back to you. Thanks again :)

omerbt commented 1 year ago

Hi! Any chance you could please share the Linux-compatible implementation of the Gaussians? @alessiapacca

MichalGeyer commented 1 year ago

Hey! Joining @omerbt's request: it would be wonderful if you could share it :) @alessiapacca