styler00dollar / VSGAN-tensorrt-docker

Using VapourSynth with super resolution and interpolation models and speeding them up with TensorRT.
BSD 3-Clause "New" or "Revised" License
288 stars, 30 forks

clarification: multiple vapoursynth builds in dockerfile + additional vps plugins #68

Closed: abcnorio closed this 7 months ago

abcnorio commented 7 months ago

Hi,

I just want to add some more VapourSynth plugins to the Dockerfile and wondered why VapourSynth seems to be built several times. What's the deeper meaning here? Is it only for the build of ffmpeg, etc.? Is a special VapourSynth version (R66) required for this? But ffmpeg is built twice as well, right? I.e., can this not be done in one context?

Is it correct to add more VapourSynth plugins in this section, i.e. the one below "FROM ubuntu:22.04 as base":

########################
# vs plugins
[...]

?

Also, is it a problem to add a self-fine-tuned Real-ESRGAN model? The goal is to upscale S-VHS material to at least 720p. It is mostly one person talking (half close-up, but this varies). Photos are available for training. Do you have any hint whether there is a way to make hi-res images look like S-VHS material while downscaling them, so that the ML algorithms have a good source for training?

btw - OS is *nix (bullseye, bookworm, ...).

Thanks and best!

styler00dollar commented 7 months ago

I compile the ffmpeg binary in Arch because it is easier to set up up-to-date dependencies for encoders and decoders. I thought about moving away from Arch, but haven't done so yet and will probably stay with it. https://github.com/styler00dollar/VSGAN-tensorrt-docker/blob/a266579d6b8be6f872234bdbbd81c4ea6507d38c/Dockerfile#L168

Here I compile ffmpeg for bestsource. It only needs decoders since bestsource is only decoding. https://github.com/styler00dollar/VSGAN-tensorrt-docker/blob/a266579d6b8be6f872234bdbbd81c4ea6507d38c/Dockerfile#L348

Here I compile ffmpeg for lsmash; it needs a different ffmpeg version or it will fail to compile. https://github.com/styler00dollar/VSGAN-tensorrt-docker/blob/a266579d6b8be6f872234bdbbd81c4ea6507d38c/Dockerfile#L371

I even mentioned the error message in the Dockerfile. https://github.com/styler00dollar/VSGAN-tensorrt-docker/blob/a266579d6b8be6f872234bdbbd81c4ea6507d38c/Dockerfile#L367

Because of that, vapoursynth is built once for the ffmpeg binary. https://github.com/styler00dollar/VSGAN-tensorrt-docker/blob/a266579d6b8be6f872234bdbbd81c4ea6507d38c/Dockerfile#L52

And once for actually using vapoursynth. https://github.com/styler00dollar/VSGAN-tensorrt-docker/blob/a266579d6b8be6f872234bdbbd81c4ea6507d38c/Dockerfile#L629

I tried to keep my Docker steps separated, which isn't always easy, so that they can build in parallel and reduce build time. There is no deeper meaning behind using vapoursynth git master in one place and a git release in the other nowadays. I probably did that at some point due to compile errors on git master and never went back to git master for both.

You can access mounted folders; if your model is there, you can access it within Docker. I don't train VHS models, so I can't say much about that. You can use my code styler00dollar / Colab-traiNNer or muslll / neosr to train models.

Add more plugins here.

https://github.com/styler00dollar/VSGAN-tensorrt-docker/blob/a266579d6b8be6f872234bdbbd81c4ea6507d38c/Dockerfile#L662

And copy the .so file in the final step. https://github.com/styler00dollar/VSGAN-tensorrt-docker/blob/a266579d6b8be6f872234bdbbd81c4ea6507d38c/Dockerfile#L817
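Since a plugin has to be both built in the plugin stage and copied in the final step, a forgotten copy is an easy mistake. The following is only a rough sketch of a sanity check one could run inside the finished container, not something taken from the repository. The function name `missing_plugins`, the plugin directory and the file names are all hypothetical placeholders; use whatever destination the final `COPY` lines in the Dockerfile actually point to.

```python
from pathlib import Path
from typing import Iterable, List


def missing_plugins(plugin_dir: str, expected: Iterable[str]) -> List[str]:
    """Return the expected .so file names that are not present in plugin_dir."""
    directory = Path(plugin_dir)
    if not directory.is_dir():
        # Nothing was copied at all, so every expected plugin is missing.
        return sorted(expected)
    present = {p.name for p in directory.glob("*.so")}
    return sorted(name for name in expected if name not in present)


if __name__ == "__main__":
    # Hypothetical directory and file names, for illustration only.
    plugin_dir = "/path/to/vapoursynth/plugins"
    expected = ["libexampleplugin.so", "libanotherplugin.so"]
    for name in missing_plugins(plugin_dir, expected):
        print(f"not found in {plugin_dir}: {name}")
```

An empty result only means the files exist in that directory; it does not prove that VapourSynth can load them.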

abcnorio commented 7 months ago

Thanks, that's very helpful. I forked a VapourSynth Docker build found on GitLab (see my repo here on GitHub) and added some more plugins, so this looks good and I will try to add them here as well. Most plugins are needed for QTGMC/deinterlacing, plus some other useful stuff like ImageMagick or MKVToolNix.

Yes, I saw the error message regarding ffmpeg/lsmash, I just wanted to be sure.

abcnorio commented 7 months ago

...one question came up about the size of the pictures used to train/fine-tune upscaling models: the base is 512x512, correct? So one can use e.g. 'convert' from ImageMagick to create tiles from hi-res images and use them for training? Would that be a proper approach? That way one could use the full resolution without running out of VRAM/memory while training?

Or what is a good training resolution for an image? At the moment I use an RTX 3060 12GB, and there is a good chance an RTX 4080 16GB can be used in the future, but nothing above that.

thanks and best
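To make the tiling idea from the question concrete, here is a minimal sketch that only computes where the tiles would go. It takes the 512x512 size from the question as an assumption (the thread does not confirm it as a requirement), and the function name `tile_boxes` and its `stride` parameter are made up for illustration. The boxes use the (left, upper, right, lower) layout, so they could be handed to a cropping tool of your choice.

```python
from typing import List, Tuple

# (left, upper, right, lower) in pixels
Box = Tuple[int, int, int, int]


def tile_boxes(width: int, height: int, tile: int = 512, stride: int = 512) -> List[Box]:
    """Return crop boxes of size tile x tile covering a width x height image.

    Tiles start every `stride` pixels. If the last tile would run past the
    edge, it is shifted back so it ends flush with the edge (overlapping its
    neighbour) instead of being padded. Images smaller than one tile yield
    no boxes.
    """
    if tile <= 0 or stride <= 0:
        raise ValueError("tile and stride must be positive")
    if width < tile or height < tile:
        return []

    def starts(length: int) -> List[int]:
        positions = list(range(0, length - tile + 1, stride))
        last = length - tile
        if positions[-1] != last:
            positions.append(last)  # overlapping edge tile
        return positions

    return [(x, y, x + tile, y + tile) for y in starts(height) for x in starts(width)]


if __name__ == "__main__":
    boxes = tile_boxes(1920, 1080)
    print(len(boxes))   # 12 tiles: 4 columns x 3 rows
    print(boxes[0])     # (0, 0, 512, 512)
```

A `stride` smaller than `tile` gives overlapping tiles and therefore more training samples per image. Whether pre-cut tiles are needed at all depends on the training code, which this thread does not settle.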