Breakthrough / PySceneDetect

:movie_camera: Python and OpenCV-based scene cut/transition detection program & library.
https://www.scenedetect.com/
BSD 3-Clause "New" or "Revised" License

[Feature Request] Support input via pipe #189

Open · adworacz opened this issue 4 years ago

adworacz commented 4 years ago

Description of Problem & Solution: It would be great if PySceneDetect could support input via pipe.

I've actually already proven that it's possible, and it's mostly just an API change; see here: https://github.com/master-of-zen/Av1an/issues/173#issuecomment-706457867

I used a named pipe in that example; a named pipe plus specifying the duration in frames was all that was needed to read a piped video stream in Y4M format.
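Roughly, the workaround looks like this with the current (0.5.x) Python API. This is only a sketch, not the exact code from the linked commit; the pipe path, input file, frame count, and framerate are all made-up values:

```python
import os
import subprocess

from scenedetect.detectors import ContentDetector
from scenedetect.frame_timecode import FrameTimecode
from scenedetect.scene_manager import SceneManager
from scenedetect.video_manager import VideoManager

PIPE_PATH = "/tmp/video.y4m"  # named pipe (made-up path)
TOTAL_FRAMES = 4800           # must be known up front (made-up value)
FPS = 24.0                    # assumed framerate of the piped stream

os.mkfifo(PIPE_PATH)

# ffmpeg decodes the source in the background and writes Y4M into the pipe;
# "input.mkv" is a placeholder for the real source file.
ffmpeg = subprocess.Popen(
    ["ffmpeg", "-y", "-i", "input.mkv", "-f", "yuv4mpegpipe", PIPE_PATH])

video_manager = VideoManager([PIPE_PATH])
scene_manager = SceneManager()
scene_manager.add_detector(ContentDetector())

# The key step: tell PySceneDetect how long the stream is, since the duration
# of a pipe cannot be queried the way a regular file's can.
video_manager.set_duration(duration=FrameTimecode(TOTAL_FRAMES, FPS))

video_manager.start()
scene_manager.detect_scenes(frame_source=video_manager)
print(scene_manager.get_scene_list(video_manager.get_base_timecode()))

ffmpeg.wait()
video_manager.release()
os.remove(PIPE_PATH)
```

The important part is supplying the duration up front; everything else is the usual VideoManager/SceneManager flow.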

The CLI could follow the standard pattern of `ffmpeg ... - | scenedetect -i -`, where `-` indicates that input should be read from stdin.

Media Examples: Any media will work, generally piped via ffmpeg using the `-f yuv4mpegpipe` format option.

Proposed Implementation: In addition to input files, PySceneDetect would support raw (or more likely Y4M) input streams instead of requiring an actual file.

This also makes it easier to work around the "multiple audio tracks are not supported" issue, and doesn't force the user to demux the video from the audio tracks (which can save a LOT of disk space for high-res content).

Alternative Solutions: Adding documentation that demonstrates the use of named pipes on multiple platforms (this means Unix AND Windows).

That alternative is okay, but being able to treat PySceneDetect as a sink in a pipe adds a lot of flexibility.

Breakthrough commented 4 years ago

Hey @adworacz;

This is amazing, thank you for finding this. Could you explain a bit more about how your solution works / point me to a specific commit? Is the issue here just that the duration needs to be specified?

Glad to know it works for the multiple audio track issue as well. I'm not that familiar with VapourSynth, but I'll definitely read up on it; it looks very useful.

My only concern is that, looking at the OpenCV API, there is no clear way for this to work without using a named pipe. I would love to support this natively, but it might require switching away from using OpenCV to do the video I/O. Do you know of any ways to use the OpenCV VideoCapture API to accomplish this?

Thank you!!!

adworacz commented 4 years ago

Questions from other thread:

> Hey @adworacz;
>
> A few questions for you:
>
> 1. Do you need to specify the duration because the stream is never ending? Or is it because OpenCV cannot find the duration of the named pipe?
> 2. In your example, does it actually output the entire video to the file, or does the file act like a stream/pipe itself?
>
> I'd love to support this, but I'm currently using OpenCV under the hood, which is why I ask. There doesn't seem to be a way to pass video data into the OpenCV VideoCapture API at a high level, but I can do some investigation to see if there's a painless way to modify its internals. My concern is that this might drive a fundamental change in how PySceneDetect does video I/O, towards using a lower-level library like ffmpeg directly. I'm willing to support that change, I just can't commit to any specific time for when I would be able to do it, unfortunately.
>
> Also, good catch with the progress bar - that definitely is a bug; it looks like I assume the video length is known. I need to add a mode for when the video length is unknown to avoid that issue. Do you think that's related to my first question?
>
> Thanks!
  1. "specify stream duration" - It does technically have an end, which programs like x264/x265 are able to handle just fine when piped to directly. I'm not sure if its a function of named pipes, but the only way I could get PyScenedetect to handle things properly is when I manually specified the number of frames. I suspect (as you call out) that OpenCV etc cannot determine the number of frames or EOF from the data sent over the named pipe alone.
  2. "entire file" - In the commit I linked above (which you're referring to here), its the latter. The file acts as a stream itself. I'm using a feature of Linux called "named pipes" (and likely Unix in general, and I've seen references to a similar feature on Windows, albeit it appears to be lacking from any API support in the Python core lib).
    1. It doesn't take much to get this off the ground. I bootstrapped the idea while messing around by using the mkfifo command on my Linux machine. It should be available on pretty much every distro.
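
For anyone unfamiliar with the mechanism, here is a minimal sketch (Python's `os.mkfifo` exposes the same facility as the `mkfifo` command; the path and data are made up):

```python
import os
import threading

FIFO = "/tmp/demo.fifo"  # made-up path

os.mkfifo(FIFO)

def writer():
    # Opening the pipe for writing blocks until a reader opens the other end;
    # nothing is stored on disk, the bytes flow straight through the kernel.
    with open(FIFO, "wb") as f:
        f.write(b"YUV4MPEG2 W640 H480 F24:1\n")  # e.g. a Y4M header, then frames

threading.Thread(target=writer).start()

# The reader addresses the pipe by an ordinary file path, but what it gets
# back is a stream, not a file with a known size.
with open(FIFO, "rb") as f:
    print(f.read())

os.remove(FIFO)
```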

Re: OpenCV API - I agree with your assessment. I did some of my own digging (like 10 minutes, so add salt to taste), and their API didn't seem built to easily accept something like a buffer or a stream of bits. So this may be a limitation of OpenCV more than anything else. I've no idea how they handle development conversations, so it may be worth opening a bug/feature request/issue on whatever public tracker they have.
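For reference, the high-level `cv2.VideoCapture` constructor only takes a filename/URL or a device index, which is why a buffer or stream of bits has no obvious entry point. A quick illustration, not an exhaustive survey of the API; the file names are placeholders:

```python
import cv2

# What cv2.VideoCapture accepts today:
cap_from_file = cv2.VideoCapture("video.mkv")  # a file path or URL (placeholder name)
cap_from_device = cv2.VideoCapture(0)          # a camera/device index

# What piped input would need, but which has no direct equivalent:
# cap_from_stream = cv2.VideoCapture(sys.stdin.buffer)  # not supported

# A named pipe only works because the FFmpeg backend treats the pipe's path
# like any other file path:
cap_from_pipe = cv2.VideoCapture("/tmp/video.y4m")  # hypothetical fifo path
```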

Re: Progress bar - yup, I'd wager that the progress bar code doesn't handle the case where the number of frames needs to be passed in. I haven't looked at the code, so I'm not sure how OpenCV reports how many frames have been processed.

Breakthrough commented 2 years ago

@adworacz in v0.6 there is now integration with PyAV, which allows passing a BytesIO object directly. That should make this feasible, and the case where the video duration can't be found is also handled much more gracefully. Unfortunately I don't have a milestone for this one yet, but there are a few tasks that would need to be done first.
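As a rough sketch of what that could look like, assuming the v0.6 PyAV backend class is `VideoStreamAv` and that it accepts a file-like object as its input; reading all of stdin into memory is only to keep the example short:

```python
import sys
from io import BytesIO

from scenedetect import SceneManager
from scenedetect.detectors import ContentDetector
from scenedetect.backends.pyav import VideoStreamAv

# e.g. ffmpeg -i input.mkv -f yuv4mpegpipe - | python detect.py
# A real implementation would want to stream rather than buffer everything.
data = BytesIO(sys.stdin.buffer.read())

video = VideoStreamAv(data)  # PyAV opens the file-like object directly
scene_manager = SceneManager()
scene_manager.add_detector(ContentDetector())
scene_manager.detect_scenes(video=video)

for start, end in scene_manager.get_scene_list():
    print(start.get_timecode(), end.get_timecode())
```

Something along these lines could then back the proposed `scenedetect -i -` CLI form.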

P.S. The PyAV backend also allows processing videos with multiple audio tracks without any issues :)

Edit: Related: PyAV-Org/PyAV#738