Closed jensdraht1999 closed 1 year ago
Something is no longer working the way it did on the day I posted. Setting the number of streams higher than the number of available threads really does not improve performance. Must look at: Spectre, Meltdown, Downfall.
@HolyWu
Another test:
18:32 with script3.py/script4.py/script5.py and 3 bat files. The video first gets split into three parts, then each part is interpolated, then the parts are merged. This was a naive approach, by the way; if the parts had been split equally, it might have been a little faster. num_streams was 5 for each script.
18:43 with script2.py with num_streams 12. Just scaling and merging audio and video together.
18:35 with script3.py/script4.py/script5.py and 3 bat files. The video first gets split into three parts, then each part is interpolated, then the parts are merged. This time the parts were split equally. num_streams was 5 for each script.
So this means that even if CUDA is utilized 100%, it does not get faster in any meaningful way.
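To put numbers on "not in any meaningful way", here is a small sketch that compares the three runs. It assumes the timings above are wall-clock durations in minutes:seconds; the labels and the helper `to_seconds` are just names I made up for this comparison.

```python
def to_seconds(mmss: str) -> int:
    """Convert a 'minutes:seconds' string to whole seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


# Timings copied from the three tests above (assumed to be mm:ss).
runs = {
    "3 scripts, naive split, 5 streams each": "18:32",
    "script2.py, 12 streams": "18:43",
    "3 scripts, equal split, 5 streams each": "18:35",
}

# Use the single-script run as the baseline.
baseline = to_seconds(runs["script2.py, 12 streams"])

relative_gain = {}
for label, mmss in runs.items():
    seconds = to_seconds(mmss)
    # Positive value = faster than the single-script run, in percent.
    relative_gain[label] = (baseline - seconds) / baseline * 100
    print(f"{label}: {seconds} s, {relative_gain[label]:+.1f}% vs single script")
```

Even the best split run is less than 1% faster than the single script, which is well within what I would expect from run-to-run noise.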
The video: 720p, 23:42 runtime, 23.974 FPS, upscaled x3. The hardware: i5-10500H laptop with 6 cores / 12 threads, Nvidia RTX 3060 with 6144 MB.
So the good news is that this is pretty much the limit of how fast it goes; it does not get any faster than this.
@HolyWu
I commented out:
and then set the number of streams to 24. It gave me 125-135 frames per second, instead of 115 frames per second with 12 streams. If I set it to 25 or 26 streams, it gets very slow, around 55 fps, because of the new CUDA fallback policy; without it, the program would literally have ended instead of running slowly.
My hardware: i5-10500H laptop with 6 cores / 12 threads, RTX 3060 with 6144 MB.
The video I tested (just the first minute): a 720p anime video.
Suggestion:
I will try to work out, with the help of AI coding tools, whether we can slowly increase the number of streams up to the point where the VRAM is full with a 10% margin, so that we get the biggest boost. I do not think we should set a default num_streams value, or at least it should be set to 1, because the best num_streams value may change with the resolution of the video.
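A rough sketch of what I mean by increasing the number of streams step by step. This is not the plugin's code: `pick_num_streams`, `measure_vram_mb` and `fake_measure` are hypothetical names, and the memory numbers in the toy measurement are made up purely so the example runs without a GPU. In practice the measurement would have to come from actually running with `n` streams and reading the VRAM usage.

```python
from typing import Callable


def pick_num_streams(
    measure_vram_mb: Callable[[int], float],
    total_vram_mb: float,
    margin: float = 0.10,
    max_streams: int = 64,
) -> int:
    """Return the largest stream count whose VRAM usage stays within budget.

    measure_vram_mb(n) is a hypothetical callback that reports how many MB
    of VRAM are in use when running with n streams.
    """
    budget_mb = total_vram_mb * (1.0 - margin)  # keep e.g. 10% free
    best = 1  # fall back to a single stream if nothing else fits
    for n in range(1, max_streams + 1):
        if measure_vram_mb(n) > budget_mb:
            break  # the previous count was the last one that fit
        best = n
    return best


def fake_measure(n: int) -> float:
    """Toy stand-in: fixed overhead plus a per-stream cost (made-up numbers)."""
    return 500.0 + 250.0 * n


print(pick_num_streams(fake_measure, total_vram_mb=6144))
```

The loop stops at the first stream count that goes over the budget, so it never reaches the point where the fallback policy would kick in and slow everything down.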
A 4K video has 9 times as many pixels as a 720p video, which means you cannot use "num_streams=24" there, because it would not fit into memory. A dynamic approach would be better.
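The arithmetic behind the factor of 9, plus a very rough first guess for the stream count at another resolution. This assumes the memory needed per stream grows linearly with the number of input pixels, which is only an approximation; `scaled_num_streams` is a hypothetical helper, not something the plugin provides.

```python
def scaled_num_streams(ref_streams: int, ref_res: tuple, target_res: tuple) -> int:
    """Scale a known-good stream count by the ratio of pixel counts.

    Assumes per-stream memory is proportional to width * height.
    """
    ref_pixels = ref_res[0] * ref_res[1]
    target_pixels = target_res[0] * target_res[1]
    # Round down and never go below one stream.
    return max(1, ref_streams * ref_pixels // target_pixels)


res_720p = (1280, 720)
res_4k = (3840, 2160)

pixel_ratio = (res_4k[0] * res_4k[1]) / (res_720p[0] * res_720p[1])
print(pixel_ratio)  # 9.0

# 24 streams worked for 720p on my 6144 MB card.
print(scaled_num_streams(24, res_720p, res_4k))  # 2
```

So under that assumption, the 24 streams that fit for 720p would shrink to about 2 for 4K, which is why a fixed default does not make much sense.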
I will close this issue, since this is just documentation for me, and for you if you think it is important.