styler00dollar / VSGAN-tensorrt-docker

Using VapourSynth with super resolution and interpolation models and speeding them up with TensorRT.
BSD 3-Clause "New" or "Revised" License

Out of memory Error when loading .engine upon starting. #22

Closed Anon1337Elite closed 1 year ago

Anon1337Elite commented 1 year ago

It doesn't even start. It immediately fails with an out-of-memory error when I try to use 4x-AnimeSharp.

1: [defaultAllocator.cpp::allocate::20] Error Code 1: Cuda Runtime (out of memory)
Requested amount of GPU memory (4541645312 bytes) could not be allocated. There may not be enough free memory for allocation to succeed.
2: [executionContext.cpp::ExecutionContext::409] Error Code 2: OutOfMemory (no further information)
pipe:: Invalid data found when processing input
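
For scale, the byte count in the log can be converted to GiB. A minimal sketch in Python; the variable names are illustrative and not part of TensorRT or this repository:

```python
# Size of the single allocation that TensorRT reported as failing
# (copied from the error log in this issue).
requested_bytes = 4541645312

# 1 GiB = 2**30 bytes.
requested_gib = requested_bytes / 2**30

print(f"{requested_gib:.2f} GiB")  # prints "4.23 GiB"
```

This single allocation has to fit into whatever VRAM is still free at that moment.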

styler00dollar commented 1 year ago

Get a better GPU, use a more lightweight model, lower the resolution, make sure that num_streams is not too high, or use tiling. Warnings can happen, but if the engine fails to build and you have no .engine file, you are running into hardware limitations and need to build at a lower resolution. If you do have a .engine file, then num_streams is most likely set too high. Do one of the five things suggested above. Building an fp16 engine may help as well. The model you are using is an ESRGAN model, and those are known to be memory hungry.
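
To illustrate the tiling suggestion: the frame is split into overlapping tiles that are processed one at a time, so each inference pass sees far fewer pixels than the full frame. A minimal sketch in plain Python, assuming a fixed tile size and overlap; `tile_boxes` and its parameters are hypothetical names for illustration, not this repository's tiling API:

```python
def tile_boxes(width, height, tile=256, overlap=16):
    """Return (left, top, right, bottom) boxes that together cover the frame.

    Hypothetical helper: neighbouring tiles share `overlap` pixels, and
    tiles at the right and bottom edges are clipped to the frame.
    """
    if tile <= overlap:
        raise ValueError("tile must be larger than overlap")
    step = tile - overlap  # distance between the origins of adjacent tiles
    boxes = []
    top = 0
    while True:
        bottom = min(top + tile, height)
        left = 0
        while True:
            right = min(left + tile, width)
            boxes.append((left, top, right, bottom))
            if right >= width:   # this row reached the right edge
                break
            left += step
        if bottom >= height:     # last row reached the bottom edge
            break
        top += step
    return boxes


boxes = tile_boxes(1920, 1080, tile=512, overlap=32)
print(len(boxes))   # 12 tiles (4 columns x 3 rows)
print(boxes[0])     # (0, 0, 512, 512)
print(boxes[-1])    # (1440, 960, 1920, 1080)
```

With these example values, each pass handles at most 512 x 512 pixels instead of 1920 x 1080. The overlap gives each tile some context at its borders, and the upscaled tiles are typically cropped or blended back together afterwards.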

Anon1337Elite commented 1 year ago

Thanks for the reply. Do you mind looking at FFmpeg? I don't know whether it is outdated or not a full build. Is there any way to update it from within the container with some command?

The problem is that libx265 currently doesn't support 10-bit for some reason, when it should. Here is a side-by-side picture of the build I have on Windows and the one in Docker. I think this might also be related to the other issue I posted.

[Screenshot: WindowsTerminal_tuxM4jdu98]

styler00dollar commented 1 year ago

I compile ffmpeg nearly weekly, and you can see how I compile it here. I guess I can recompile with -D HIGH_BIT_DEPTH:BOOL=ON later. Please open separate issues for separate problems.

Anon1337Elite commented 1 year ago

Sorry, I will open an issue about it. Please close it once it is updated. Thank you very much. :D

styler00dollar commented 1 year ago

One random thing I remembered: num_streams also influences VRAM usage. Keep it at 1 or 2 if you face issues. I am editing the comment above to add that.
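
As a rough way to reason about this, assume each stream needs about as much memory as the allocation that failed in the error log of this issue. That is an assumption for illustration only; actual per-stream usage depends on the engine. `max_streams` is a hypothetical helper, not part of the project:

```python
def max_streams(free_bytes: int, per_stream_bytes: int) -> int:
    """Largest stream count whose combined allocations fit into free_bytes."""
    if per_stream_bytes <= 0:
        raise ValueError("per_stream_bytes must be positive")
    return free_bytes // per_stream_bytes


# Bytes requested in the error log of this issue, used here as an
# assumed per-stream cost.
PER_STREAM = 4541645312

print(max_streams(8 * 2**30, PER_STREAM))   # 8 GiB free  -> 1
print(max_streams(24 * 2**30, PER_STREAM))  # 24 GiB free -> 5
```

Under that assumption, even 8 GiB of completely free VRAM leaves room for only one stream with this model and resolution.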

Anon1337Elite commented 1 year ago

Sorry for hijacking this thread again. I also wanted to mention that the parameters "bframes=8:psy-rd=1:aq-mode=3" are not working, although they should work with x265.

styler00dollar commented 1 year ago

ffmpeg -i input.mp4 -vcodec libx265 -x265-params "bframes=8:psy-rd=1:aq-mode=3" output.mp4 works fine.

x265 [info]: HEVC encoder version 3.5+86-6da609e41
x265 [info]: build info [Linux][GCC 12.2.1][64 bit] 8bit
x265 [info]: using cpu capabilities: MMX2 SSE2Fast LZCNT SSSE3 SSE4.2 AVX FMA3 BMI2 AVX2
x265 [info]: Main profile, Level-4 (Main tier)
x265 [info]: Thread pool created using 16 threads
x265 [info]: Slices                              : 1
x265 [info]: frame threads / pool features       : 4 / wpp(17 rows)
x265 [info]: Coding QT: max CU size, min CU size : 64 / 8
x265 [info]: Residual QT: max TU size, max depth : 32 / 1 inter / 1 intra
x265 [info]: ME / range / subpel / merge         : hex / 57 / 2 / 3
x265 [info]: Keyframe min / max / scenecut / bias  : 24 / 250 / 40 / 5.00
x265 [info]: Lookahead / bframes / badapt        : 20 / 8 / 2
x265 [info]: b-pyramid / weightp / weightb       : 1 / 1 / 0
x265 [info]: References / ref-limit  cu / depth  : 3 / off / on
x265 [info]: AQ: mode / str / qg-size / cu-tree  : 3 / 1.0 / 32 / 1
x265 [info]: Rate Control / qCompress            : CRF-28.0 / 0.60
x265 [info]: tools: rd=3 psy-rd=1.00 early-skip rskip mode=1 signhide tmvp
x265 [info]: tools: b-intra strong-intra-smoothing lslices=6 deblock sao
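
The `build info` line in this log ends in `8bit`, which suggests this x265 build was compiled for 8-bit output only, consistent with the 10-bit problem reported earlier in the thread. A small sketch for reading the supported bit depths out of such a line, assuming multi-depth builds list them joined by `+` (for example `8bit+10bit`); `x265_bit_depths` is a hypothetical helper:

```python
import re


def x265_bit_depths(log_text):
    """Return the bit depths named on the x265 'build info' log line.

    Hypothetical helper: only the text after the last ']' is inspected,
    so the '[64 bit]' platform tag is never mistaken for a bit depth.
    """
    for line in log_text.splitlines():
        if "build info" in line:
            tail = line.rsplit("]", 1)[-1]
            return [int(depth) for depth in re.findall(r"(\d+)bit", tail)]
    return []  # no build info line found


log = "x265 [info]: build info [Linux][GCC 12.2.1][64 bit] 8bit"
depths = x265_bit_depths(log)
print(depths)        # [8]
print(10 in depths)  # False
```

If a rebuild with HIGH_BIT_DEPTH enabled works as intended, the same check on the new log should report 10.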

Now please stop hijacking issues, since that makes it harder for others to find solutions. Referencing https://github.com/styler00dollar/VSGAN-tensorrt-docker/issues/23 since you asked the same thing there.