NevermindNilas / TheAnimeScripter

Welcome to TheAnimeScripter – the ultimate tool for Video Upscaling, Interpolating and many more. Available as a CLI, GUI and Adobe Extension.
GNU Affero General Public License v3.0

Interpolated video has incorrect frame sequence #46

Closed GoldJohnKing closed 2 months ago

GoldJohnKing commented 2 months ago

Hi, I am interpolating real-world footage, and the frame sequence of the output appears incorrect.

TAS version is v1.9.3.

main.exe --input %input_video% --output %output_video% --interpolate --interpolate_factor %interpolate_factor% --interpolate_method rife4.22-tensorrt --ensemble --encode_method nvenc_h265

I can confirm non-tensorrt rife behaves well.

Here is the interpolating result:

https://github.com/user-attachments/assets/510e88ea-5d93-4792-ac0b-30913486c3a9

Please pay attention to the frame index at the top left corner.

NevermindNilas commented 2 months ago

Hi, could you try the same settings, but instead of the nvenc encoder use a CPU encoder such as the default one, "x264"?

GoldJohnKing commented 2 months ago

Hi, I can confirm that setting the encoder to x264 also solves the issue.

However, with x265 the issue remains.

NevermindNilas commented 2 months ago

It's very likely a CUDA race condition. I will take a look. Could you share your system specs and the log.txt generated in {user}/Appdata/roaming/TheAnimeScripter, specifically from a run with the encode method you had issues with?

NevermindNilas commented 2 months ago

Update:

I have managed to reproduce this issue. It seems that high CPU usage somehow triggers a race condition within TAS. I don't experience any issues with x264 / nvenc h264 / qsv h264, but I see major degradation when using more 'advanced' encoders like x265 / nvenc h265 / qsv h265 and av1. This happens over time, once the CPU hits 100% load.

!FLASHING WARNING! https://github.com/user-attachments/assets/f5423b33-d471-4e11-9628-e88e771d6672

I will try to check whether this is specifically FFMPEG related and whether using an official release like 7.0.2 would work: Nope, still issues.

Trying 6.1.1: Nope, still issues.

Will try to play around with the commands in a bit and see if there's any magical ffmpeg flag that would help with this.

NevermindNilas commented 2 months ago

One potential workaround for now is using --custom_encoder and passing a low thread count such as -threads 2. This does seem to partially fix it, simply by reducing the amount of resources the 2nd FFMPEG instance (the writing instance) uses.

I will try to see if I can somehow hack together a way to use a single instance of FFMPEG for both reading and writing; that should potentially lower CPU usage even further.

NevermindNilas commented 2 months ago

The threads fix is not reliable.

Tested TAS 1.8.0, 1.8.3 and 1.9.0; all of these show the same pattern.

I tested upscaling with CUDA and TensorRT, and all seem fine with x265 / av1 / nvenc h265. Interpolation with CUDA and NCNN seems fine as well with x265 / av1 / nvenc h265.

As far as I can tell, the issue at hand is a race condition when TAS caches frames, specifically with TensorRT. It only gets triggered at very high CPU usage; more moderate CPU usage like 50-70% doesn't cause any issues.

Looking for a fix.

NevermindNilas commented 2 months ago

Replacing in-place operations like squeeze_ and mul_ with their out-of-place counterparts squeeze and mul seems to have fixed the issue for me with rife4.18-tensorrt and av1.

Testing x265 and NVENC with both rife4.18-tensorrt and rife4.22-tensorrt next.

https://github.com/user-attachments/assets/f59732c8-2399-4657-98c3-b89e5021ac89

NevermindNilas commented 2 months ago

Tested rife4.18-tensorrt, rife4.20-tensorrt and rife4.22-tensorrt with x265 / nvenc_h265 / av1 and qsv_h265. All seem to work fine now.

It seems to have been triggered specifically by the mul_ operator: at higher CPU usage it interfered with the torch CUDA stream, leading to CUDA race conditions. This is not an FFMPEG issue; it is likely either a PyTorch issue or my implementation of it.
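
For readers unfamiliar with why an in-place operator can corrupt frames that are consumed later, here is a minimal, pure-Python analogy. It is not TAS code and does not involve CUDA streams: the functions scale_in_place, scale_copy and run_pipeline, and the reused "work" buffer, are hypothetical names made up for illustration. It only shows the aliasing hazard in a deterministic form: an in-place operation hands the consumer the same buffer the producer keeps overwriting, while the copying variant hands over an independent result.

```python
from collections import deque


def scale_in_place(buf, factor):
    """Mutate buf itself and return it (loosely analogous to tensor.mul_)."""
    for i in range(len(buf)):
        buf[i] *= factor
    return buf


def scale_copy(buf, factor):
    """Return a new list and leave buf untouched (loosely analogous to tensor.mul)."""
    return [v * factor for v in buf]


def run_pipeline(scale):
    """Push three 'frames' through one reused working buffer.

    The deque stands in for a writer that only consumes frames after
    the producer has already moved on to later frames.
    """
    work = [0.0]        # hypothetical reusable working buffer
    pending = deque()   # frames waiting to be written
    for frame_index in (1, 2, 3):
        work[0] = float(frame_index)
        pending.append(scale(work, 10))
    # The consumer reads only after every frame has been queued.
    return [frame[0] for frame in pending]


print(run_pipeline(scale_copy))      # [10.0, 20.0, 30.0]
print(run_pipeline(scale_in_place))  # [30.0, 30.0, 30.0]
```

In the in-place run every queued entry is the same list object, so earlier frames are overwritten before they are read. The real bug only appeared under high CPU load, which suggests a timing-dependent version of this hazard rather than the deterministic one sketched here.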

Should be fixed with https://github.com/NevermindNilas/TheAnimeScripter/commit/3f2a843309455aeccc442106aec1307c48980216

A nightly build should trigger within a couple of hours at: https://github.com/NevermindNilas/TAS-Nightly You will then be able to test it yourself and let me know whether it fixed the issue for you.

Thanks for the report.

GoldJohnKing commented 2 months ago

Huge thanks for your effort!

I have tested rife4.22-tensorrt and span-tensorrt, and both work well on the latest nightly build.

However, maxxvit-tensorrt still triggers one flash-back frame at scene changes.

NevermindNilas commented 2 months ago

Hi,

That's on me, I accidentally removed a core piece of functionality of Rife TensorRT without noticing.

Command:

python .\main.py --input .\input\input.mp4 --interpolate --interpolate_method rife4.22-tensorrt --encode_method x265 --scenechange --scenechange_method maxxvit-tensorrt --outpoint 10

https://github.com/user-attachments/assets/30e6d113-242d-4e8e-b1b8-a06767345c12

A new TAS build should be triggered tomorrow morning. You can use rife4.22 in the meantime as a temporary workaround.