k4yt3x / video2x

A machine learning-based lossless video super resolution framework. Est. Hack the Valley II, 2018.
https://video2x.org
GNU Affero General Public License v3.0
10.83k stars, 1.01k forks

Add support for Real-ESRGAN #1102

Closed: aa-ko closed this 4 months ago

aa-ko commented 6 months ago

Hi everyone,

A few months ago, I wanted to upscale some old DVDs and tested different algorithms in the process. In my opinion, the only algorithm I found that consistently produces better-looking results than even RealSR is Real-ESRGAN.

The changes in this PR were enough to make it work for me and the results are amazing. There are some caveats though:

  1. Choosing Real-ESRGAN completely ignores the noise flag.
  2. Since I wanted to work on real life footage and not anime, the only model I tested was realesrgan-x4plus.
  3. I don't have a lot of experience working with Python, so I tested this in a Podman container only!

I would like to see Real-ESRGAN officially supported in video2x and I believe there already were some people asking about it in the past, so I'd really appreciate some feedback and/or guidance to make this work for everyone! :)

rdwz commented 6 months ago

Hi, I've just come across this newer version. I haven't had the chance to try it out myself yet, but it has Real-ESRGAN support:

https://github.com/AaronFeng753/Waifu2x-Extension-GUI/releases/tag/v3.113.01

Video, Image and GIF upscale/enlarge(Super-Resolution) and Video frame interpolation. Achieved with Waifu2x, Real-ESRGAN, Real-CUGAN, RTX Video Super Resolution VSR, SRMD, RealSR, Anime4K, RIFE, IFRNet, CAIN, DAIN, and ACNet.

aa-ko commented 5 months ago

@rdwz Thanks for the suggestion, but that project seems to be Windows only and does not provide a CLI, which makes it not really suitable for my use case.

arximboldi commented 4 months ago

I just tried this PR and it is working for me.

However, the model seems to run way slower than realcugan.

According to this it should be much faster for anime, when using the right model: https://github.com/xinntao/Real-ESRGAN/blob/master/docs/anime_comparisons.md

Would you mind adding options to use the other models as well?

arximboldi commented 4 months ago

@aa-ko I've made a commit to add support to the other models, not sure whether to start another PR, or maybe you'd like to integrate this in yours and clean it up so it can be merged?

https://github.com/arximboldi/video2x/commit/14799c501de71aff8075b16d122a82109e43ced8

I'm doing some tests with it, and on anime realesr-animevideov3 is giving me excellent results: similar quality to realcugan but way faster. It is also a bit more faithful to the original, with arguably fewer artifacts in most frames. I hope @k4yt3x finds a way to merge this since this algorithm is great!
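
For anyone wiring up more models, here is a minimal sketch of how the model choice could be validated before the upscaler is started. It is not the code from the commit above. The table name `REALESRGAN_MODELS` and the function `resolve_model` are hypothetical, and the scale factors listed per model are an assumption based on the model files shipped with the upstream Real-ESRGAN ncnn build, so check them against your copy.

```python
# Hypothetical lookup table: model name -> scale factors it is assumed to support.
# The two model names come from this thread; the scale sets are an assumption.
REALESRGAN_MODELS = {
    "realesrgan-x4plus": (4,),
    "realesr-animevideov3": (2, 3, 4),
}


def resolve_model(name, scale):
    """Return (name, scale) if the combination looks valid, else raise ValueError."""
    try:
        supported = REALESRGAN_MODELS[name]
    except KeyError:
        known = ", ".join(sorted(REALESRGAN_MODELS))
        raise ValueError(f"unknown model {name!r}; known models: {known}") from None
    if scale not in supported:
        raise ValueError(f"model {name!r} does not support scale {scale}; use one of {supported}")
    return name, scale


if __name__ == "__main__":
    print(resolve_model("realesr-animevideov3", 2))
```

Failing early like this gives a readable message instead of an error from deep inside the upscaler once frames are already being processed.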

arximboldi commented 4 months ago

Btw, I have this issue where I get lots of percentage messages flooding the output. This also happens with realsr but not with other algorithms. @aa-ko or @k4yt3x, do you know a way to solve this? Maybe by somehow redirecting the output of the underlying function?

arximboldi commented 4 months ago

Ok, so I solved the issue with the output flooding: https://github.com/arximboldi/video2x/commit/67e3c1654ccb64a53f8fd1620ef96c208a62fdf5

The downside is that we may now miss legitimate errors. Maybe we could do something more sophisticated, like actually parsing the output and looking for the keyword "error" or something? But for now this makes usage way more pleasant...
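
The "parse the output and look for error" idea could look roughly like the sketch below. It is not what the linked commit does. It assumes the upscaler's output has already been captured as text lines, and that progress messages are lines containing nothing but a percentage such as `12.50%`. Both are assumptions, and the names `PROGRESS_RE`, `ERROR_RE`, and `filter_upscaler_output` are made up for illustration.

```python
import re

# Assumption: a progress message is a line that holds only a percentage, e.g. "12.50%".
PROGRESS_RE = re.compile(r"^\s*\d+(?:[.,]\d+)?\s*%\s*$")
# Keyword match suggested in the comment above; extend as needed.
ERROR_RE = re.compile(r"error", re.IGNORECASE)


def filter_upscaler_output(lines):
    """Drop blank and progress-only lines; yield (level, line) for everything else."""
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line.strip() or PROGRESS_RE.match(line):
            continue
        level = "error" if ERROR_RE.search(line) else "info"
        yield level, line


if __name__ == "__main__":
    # Made-up sample lines, not real upscaler output.
    sample = ["0.00%\n", "12.50%\n", "loading model\n", "Error: something went wrong\n"]
    for level, line in filter_upscaler_output(sample):
        print(f"{level}: {line}")
```

This keeps anything that is not a bare percentage, so unexpected messages still reach the log, and lines mentioning "error" can be routed to a higher log level.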

aa-ko commented 4 months ago

Hi @arximboldi, thank you so much for testing and improving this!

> I've made a commit to add support to the other models, not sure whether to start another PR, or maybe you'd like to integrate this in yours and clean it up so it can be merged?

Since this is directly related to realesr, I think it belongs in this PR. Let me test this real quick and I'll cherry-pick your commits over.

> Btw, I have this issue where I get lots of percentage messages flooding the output.

I noticed this as well, very annoying! I wanted to get the upscaler working at first, so I did not investigate further.

Looks like you already fixed this though, very nice :+1:

arximboldi commented 4 months ago

I'm still working on this (I kinda shouldn't because I have a million other things to do, but the ways of procrastination are mysterious). I am kinda stuck with this now: https://github.com/k4yt3x/video2x/issues/780 Have you experienced that?

To be able to debug this, I first had to make the dev setup work without Docker, as otherwise the turnaround time to try things is disastrous. This was a bit tricky due to NixOS. You can check the other commits on this branch in case they're interesting for you too; I think you mentioned you were trying things inside Docker only as well.

aa-ko commented 4 months ago

No worries, I got sidetracked myself, because I was trying to improve the image build time. I agree that it's super annoying to test code changes in the container. A full build takes ~15m on my machine. Guess I'll just have to set up running this locally now.

I don't remember any bug as described in #780, but I just pulled all commits from your branch and now get stuck completely before any upscaler is even called. There is no output after setting up the ffmpeg pipe:

podman run -it --rm --device=/dev/dri -v $PWD:/host localhost/k4yt3x/video2x:realesrgan -i sample03.mp4 -o sample03_test.mp4 -p3 -l trace upscale -h 2304 -a realesrgan-x4plus

--- SNIP ---

Finished splitting the commandline.
Parsing a group of options: global .
Applying option hide_banner (do not show program banner) with argument 1.
Applying option nostats (print progress report during encoding) with argument 0.
Applying option loglevel (set logging level) with argument trace.
Applying option y (overwrite output files) with argument 1.
Successfully parsed a group of options.
Parsing a group of options: input url pipe:0.
Applying option f (force format) with argument rawvideo.
Applying option pix_fmt (set pixel format) with argument rgb24.
Applying option r (set frame rate (Hz value, fraction or abbreviation)) with argument 25.0.
Applying option s (set frame size (WxH or abbreviation)) with argument 2800x2304.
Applying option thread_queue_size (set the maximum number of queued packets from the demuxer) with argument 64.
Successfully parsed a group of options.
Opening an input file: pipe:0.
[rawvideo @ 0x5aa54c2eb800] Opening 'pipe:0' for reading
[pipe @ 0x5aa54c2ec240] Setting default whitelist 'crypto'
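
One thing that can be read off the log above: ffmpeg is waiting on `pipe:0` for rawvideo frames in `rgb24` at 2800x2304. The sketch below only works out how many bytes must arrive before ffmpeg has one complete frame, which can help when checking whether the upscaler ever writes a full frame into the pipe. It does not diagnose the hang itself, and the helper names `rawvideo_frame_bytes` and `complete_frames` are hypothetical.

```python
# rgb24 stores one byte each for red, green, and blue.
RGB24_BYTES_PER_PIXEL = 3


def rawvideo_frame_bytes(width, height, bytes_per_pixel=RGB24_BYTES_PER_PIXEL):
    """Size in bytes of one uncompressed frame."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    return width * height * bytes_per_pixel


def complete_frames(bytes_written, width, height):
    """How many whole frames a given number of bytes amounts to."""
    return bytes_written // rawvideo_frame_bytes(width, height)


if __name__ == "__main__":
    # Frame size taken from the "-s 2800x2304" option in the log.
    print(rawvideo_frame_bytes(2800, 2304))
```

For the size in the log this comes to 2800 * 2304 * 3 = 19,353,600 bytes per frame, so if fewer bytes than that ever reach the pipe, ffmpeg will keep blocking on the first read.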

aa-ko commented 4 months ago

@arximboldi Thank you for all these improvements! I'll abandon this in favor of #1133 now.

arximboldi commented 4 months ago

Hmmm, is the issue still reproducible with the latest changes from #1133? If so, can you send me the sample file so I can see whether I can reproduce the problem? You can probably find my email in the git commits...