carykh / jumpcutter

Automatically edits videos. Explanation here: https://www.youtube.com/watch?v=DQ8orIurGxw

Increase speed by parallel execution #16

Open balsoft opened 5 years ago

balsoft commented 5 years ago

We can speed up the process if we do all three steps at the same time: the first ffmpeg process produces the sound and the images (the sound should be in some sort of streaming container), the Python stuff does its job frame by frame and feeds those frames to the second ffmpeg. This shouldn't be too hard to implement with pipes.
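A minimal sketch of that piping, assuming a fixed, known frame size and frame rate; `keep_frame()` is a hypothetical stand-in for jumpcutter's actual loudness analysis, and only the video path is shown (audio would flow through a parallel pipe the same way):

```python
# Sketch only: decode raw frames from one ffmpeg, filter them in Python,
# and feed the kept frames straight into a second ffmpeg, with no
# intermediate image files written to disk.
import subprocess

WIDTH, HEIGHT, FPS = 1280, 720, 30      # assumed input properties
FRAME_BYTES = WIDTH * HEIGHT * 3        # rgb24: 3 bytes per pixel

def keep_frame(index: int) -> bool:
    """Hypothetical stand-in for the 'is this chunk loud enough?' decision."""
    return index % 2 == 0

decoder = subprocess.Popen(
    ["ffmpeg", "-i", "input.mp4",
     "-f", "rawvideo", "-pix_fmt", "rgb24", "-"],
    stdout=subprocess.PIPE)

encoder = subprocess.Popen(
    ["ffmpeg", "-y",
     "-f", "rawvideo", "-pix_fmt", "rgb24",
     "-s", f"{WIDTH}x{HEIGHT}", "-r", str(FPS), "-i", "-",
     "-c:v", "libx264", "output.mp4"],
    stdin=subprocess.PIPE)

i = 0
while True:
    frame = decoder.stdout.read(FRAME_BYTES)
    if len(frame) < FRAME_BYTES:
        break                           # end of stream
    if keep_frame(i):
        encoder.stdin.write(frame)      # frame goes straight to the encoder
    i += 1

encoder.stdin.close()
decoder.wait()
encoder.wait()
```

With this layout the decoding ffmpeg, the Python filter, and the encoding ffmpeg all run at the same time, and no intermediate frame files touch the disk.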

carykh commented 5 years ago

That's true, and I definitely think this would help more for really long videos. It might even work on livestreamed videos! (Although.... that doesn't really make sense since the timeline is cut.) Anyway, I'll definitely work on piping the processes together when I get some free time again.

(One thing to note is that the python stuff (analyzing the audio) happens much faster than ffmpeg. So as the parallelized code is running, there will essentially be the two ffmpeg processes churning away, and the Python process waiting for 95% of the time.)

abhiTronix commented 5 years ago

@balsoft @carykh I can parallelize this code by implementing this code in OpenCV with my Vidgear python library. I'm currently working on it!
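For comparison, a rough sketch of the same frame-by-frame idea in plain OpenCV (this is just `cv2`, not VidGear's API, and it ignores the audio track, which OpenCV alone does not carry):

```python
# Sketch only: decode frames in memory and write the kept ones straight to
# the output container, skipping the per-frame image files entirely.
import cv2

cap = cv2.VideoCapture("input.mp4")
fps = cap.get(cv2.CAP_PROP_FPS)
width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))

out = cv2.VideoWriter("output.mp4",
                      cv2.VideoWriter_fourcc(*"mp4v"),
                      fps, (width, height))

def keep_frame(index: int) -> bool:
    """Placeholder for the audio-threshold decision."""
    return index % 2 == 0

i = 0
while True:
    ok, frame = cap.read()
    if not ok:
        break
    if keep_frame(i):
        out.write(frame)                # no PNG/JPEG round-trip on disk
    i += 1

cap.release()
out.release()
```

As I understand it, VidGear layers threaded readers and FFmpeg-backed writers on top of this kind of loop, which is where the extra parallelism would come from.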

nevercast commented 5 years ago

I suspect this process, if anything, is IO-limited by writing images out to storage. Parallel execution won't help much if we are waiting for IO to complete. If we can write only to RAM, this may be a much faster process. Streaming may be the right move here.

balsoft commented 5 years ago

> I suspect this process, if anything, is IO-limited by writing images out to storage. Parallel execution won't help much if we are waiting for IO to complete. If we can write only to RAM, this may be a much faster process. Streaming may be the right move here.

Sorry to say, but you're probably wrong on two counts. 1) Parallel execution can massively speed up IO-heavy operations by executing many IO ops in parallel; modern SSDs and HDDs can handle many IO operations at the same time. 2) FFmpeg isn't that IO-limited; it's mostly encoding/decoding video streams from/into pictures and sound, which is a CPU/GPU-heavy task.

nevercast commented 5 years ago

I'm happy to be wrong in this case; that means more performance to gain. I can run tests later to get solid numbers on this (one way to do that is sketched at the end of this comment). My comment was speculation; my mistake if I gave the impression it was factual.

> Parallel execution can massively speed up IO-heavy operations by executing many IO ops in parallel; modern SSDs and HDDs can handle many IO operations at the same time.

Sure, but not concurrently. Only one block is written at a time, or is my knowledge dated?

> FFmpeg isn't that IO-limited; it's mostly encoding/decoding video streams from/into pictures and sound, which is a CPU/GPU-heavy task.

I suspected that the writing of frames was the slowest part, but I don't have evidence of this.
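A small write benchmark along these lines could produce those numbers. This is only a sketch (not from the thread); the frame size and worker count are arbitrary, and the OS page cache will flatter both results:

```python
# Sketch only: time N sequential frame writes vs. the same writes through a
# thread pool. Note that OS write caching can hide the true disk speed.
import os
import time
import tempfile
from concurrent.futures import ThreadPoolExecutor

N = 200
FRAME = os.urandom(1280 * 720 * 3)      # one fake rgb24 frame, ~2.6 MB

def write_frame(path):
    with open(path, "wb") as f:
        f.write(FRAME)

with tempfile.TemporaryDirectory() as tmp:
    start = time.perf_counter()
    for i in range(N):
        write_frame(os.path.join(tmp, f"seq_{i:06d}.raw"))
    sequential = time.perf_counter() - start

    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=8) as pool:
        list(pool.map(write_frame,
                      (os.path.join(tmp, f"par_{i:06d}.raw") for i in range(N))))
    threaded = time.perf_counter() - start

print(f"sequential: {sequential:.2f}s   threaded: {threaded:.2f}s")
```

Comparing the two times on the actual target drive would show whether parallel writes help at all there.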

balsoft commented 5 years ago

> Only one block is written at a time, or is my knowledge dated?

The matter is very complicated, but in general it's fine to assume that modern drives can do more than one page or block write at a time.

> I suspected that the writing of frames was the slowest part, but I don't have evidence of this.

Even if that is the case, piping will speed up the process by eliminating the write-to-disk step entirely.

nevercast commented 5 years ago

> The matter is very complicated, but in general it's fine to assume that modern drives can do more than one page or block write at a time.

Reasonable assumption

> Even if that is the case, piping will speed up the process by eliminating the write-to-disk step entirely.

That's what I'm going for: stream/pipe the whole process.

Tinguely commented 5 years ago

@carykh

In response to live-streaming:

As an aside to write speed, I think the benefit of a purely in-memory implementation is that you could then stitch in memory instead of offloading to ffmpeg. I'm going to rewrite this in Go; I'm pretty sure you could then perform this live with a reasonable buffer instead of having to parse and then restitch to view. A buffer would allow the process to run with the audio and video preprocessed a few seconds ahead. Since the video doesn't care about what happened before and after, you could apply an algorithm like this:

Define i = 0. For each chunk X (a predetermined length of audio equal to 1 frame), calculate the average audio level of X:

- If the average of X is below the threshold, remove the audio (it doesn't seem to be needed at 20x), and if i++ % 20 == 0, add the frame to the video buffer.
- Otherwise, apply the same logic but keep the audio: speed up the audio for the given frame and always add the sped-up audio to the buffer, and add the frame to the buffer only when the frame count % speed-up multiplier is 0.

(A rough Python sketch of this loop is at the end of this comment.)

The advantages of this approach are that it doesn't have to look ahead, and the image/audio can be separated into two streams and never has to be encoded (unless it's being saved, but it looks like you have that logic down). Another advantage is that it has no disk write ops, which will bottleneck the process, contrary to what's stated above. Hard drives have limited write speed, and writing a ton of images is going to take time. I don't understand why there are other comments stating that wouldn't be the case; any modern CPU is going to be able to rip through images faster than they can be written.
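A rough Python sketch of that loop, assuming each audio chunk is a NumPy array of samples covering exactly one frame and normalized to [-1, 1]; the names (`THRESHOLD`, `SILENT_SPEED`, `SOUNDED_SPEED`, `process_stream`) are made up for illustration, and the audio time-stretching step for sounded speeds other than 1 is left out:

```python
# Sketch only: a streaming filter that drops silent audio, keeps 1 in 20
# frames while silent, and passes frames and audio through while sounded.
import numpy as np

THRESHOLD = 0.05        # normalized loudness cutoff (made-up value)
SILENT_SPEED = 20       # keep 1 in 20 frames while silent
SOUNDED_SPEED = 1       # keep every frame while sounded

def process_stream(frames_and_audio):
    """frames_and_audio yields (frame, audio_chunk) pairs, one per frame.
    Yields (frame_or_None, audio_or_None): a None frame means 'skip this
    frame', a None audio chunk means 'no audio for this slot'."""
    i = 0
    for frame, audio in frames_and_audio:
        loud = np.abs(audio).max() > THRESHOLD
        if loud:
            # keep the audio regardless; keep the frame only on every
            # SOUNDED_SPEED-th iteration
            yield (frame if i % SOUNDED_SPEED == 0 else None), audio
        else:
            # drop the audio entirely; keep 1 frame in SILENT_SPEED
            yield (frame if i % SILENT_SPEED == 0 else None), None
        i += 1
```

The consumer would append non-None frames to the video buffer and non-None chunks to the audio buffer a few seconds ahead of playback, as described above.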