adworacz opened this issue 4 years ago (status: Open)
Hey @adworacz;
This is amazing, thank you for finding this. Could you explain a bit more about how your solution works / point me to a specific commit? Is the issue here just that the duration needs to be specified?
Glad to know it works for the multiple audio track issue as well. I'm not that familiar with VaporSynth but I'll definitely read up on it, looks very useful.
My only concern is that, looking at the OpenCV API, there is no clear way for this to work without using a named pipe. I would love to support this natively, but it might require switching away from using OpenCV to do the video I/O. Do you know of any ways to use the OpenCV VideoCapture API to accomplish this?
Thank you!!!
Questions from other thread:
Hey @adworacz;
A few questions for you:
Do you need to specify the duration because the stream is never ending? Or is it because OpenCV cannot find the duration of the named pipe?
In your example, does it actually output the entire video to the file, or does the file act like a stream/pipe itself?
I'd love to support this; I ask because PySceneDetect currently uses OpenCV under the hood. There doesn't seem to be a way to pass video data into the OpenCV VideoCapture API at a high level, but I can do some investigation to see if there's a painless way to modify its internals. My concern is that this might drive a fundamental change in how PySceneDetect does video I/O, moving to a lower-level library like ffmpeg directly. I'm willing to support that change, I just unfortunately can't commit to a specific timeline for when I would be able to do it.
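For what it's worth, the lower-level approach mentioned above can be sketched without OpenCV at all: spawn ffmpeg as a subprocess and slice fixed-size raw frames off its stdout pipe. The sketch below is illustrative only (the ffmpeg invocation in the comments is an assumption, not PySceneDetect code); the frame-slicing logic itself is self-contained and uses an in-memory stream:

```python
import io
import subprocess  # only needed for the real ffmpeg case shown in the comments

def iter_raw_frames(stream, width, height, channels=3):
    """Yield one raw frame (bytes) at a time from a binary stream.

    Works with any file-like object: a real file, sys.stdin.buffer,
    or the stdout pipe of an ffmpeg subprocess.
    """
    frame_size = width * height * channels
    while True:
        data = stream.read(frame_size)
        if len(data) < frame_size:  # EOF or truncated trailing frame
            return
        yield data

# In-memory stream standing in for a pipe; with a real video it would be
# something like (hypothetical invocation):
#   proc = subprocess.Popen(
#       ["ffmpeg", "-i", "input.mp4", "-f", "rawvideo", "-pix_fmt", "bgr24", "-"],
#       stdout=subprocess.PIPE)
#   frames = iter_raw_frames(proc.stdout, 1280, 720)
fake_pipe = io.BytesIO(b"\x00" * (4 * 4 * 3 * 2))  # two 4x4 BGR frames
frames = list(iter_raw_frames(fake_pipe, 4, 4))
```

The key point is that nothing here needs to seek: the reader only ever consumes bytes forward, which is exactly what a pipe provides.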
Also good catch with the progress bar - that definitely is a bug; it looks like I assume the video length is known. I need to add a mode for when the video length is unknown to avoid that issue. Do you think that's related to my first question?
Thanks!
I created the named pipe with the `mkfifo` command on my Linux machine. It should be available on pretty much every distro.

Re: OpenCV API - I agree with your assessment. I did some of my own digging (about 10 minutes, so add salt to taste), and their API didn't seem built to easily accept something like a buffer or a stream of bits. So this may be a limitation of OpenCV more than anything else. I've no idea how they handle development conversations, but it may be worth opening a bug/feature request on any public tracker they have.

Re: Progress bar - yup, I'd wager the progress bar code doesn't handle the case where the number of frames needs to be passed in. I haven't looked at the code, so I'm not sure how OpenCV reports how many frames have been processed.
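For reference, here is a minimal demonstration of the named-pipe approach on a POSIX system. A background thread stands in for the ffmpeg producer; the file names and the Y4M header bytes are illustrative:

```python
import os
import tempfile
import threading

# Create a named pipe (FIFO) -- equivalent to running `mkfifo` in a shell.
fifo_path = os.path.join(tempfile.mkdtemp(), "video.y4m")
os.mkfifo(fifo_path)

def writer():
    # Stands in for the producer side of the pipe, e.g.:
    #   ffmpeg -i input.mkv -f yuv4mpegpipe /path/to/video.y4m
    with open(fifo_path, "wb") as f:
        f.write(b"YUV4MPEG2 W640 H480 F25:1\n")

# Opening a FIFO for reading blocks until a writer connects,
# so the writer must run concurrently in a background thread.
t = threading.Thread(target=writer)
t.start()
with open(fifo_path, "rb") as f:
    header = f.readline()
t.join()
os.remove(fifo_path)
```

Note the blocking behavior: both ends of a FIFO block on `open()` until the other side connects, which is why a consumer like PySceneDetect must be started alongside (not after) the ffmpeg producer.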
@adworacz in v0.6 there is now integration with PyAV, which allows passing a `BytesIO` object directly. This should make this feasible now. It should also handle the case of not finding the video duration much more gracefully. Unfortunately I don't have a milestone for this one yet, but the tasks that would need to be done are:

- support reading input from stdin (e.g. `ffmpeg [...] | python test.py`)
- add a flag `--stdin` to enable this mode (technically cannot use `-` since that can be a filename)
- disable `split-video`, since a piped stream cannot be re-read
- use `min_scene_len` as a buffer; this won't work with two-pass algorithms like `detect-adaptive`, so need to error out for that (make a way for detectors to report if they are "online"/one-pass or "offline"/two-pass algorithms)
- disable `save-images`, and set `--no-images` on `export-html`
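One way for detectors to report whether they are one-pass, as suggested above, is a class attribute on a common base class that the CLI checks before accepting piped input. All names below are illustrative sketches, not PySceneDetect's actual API:

```python
class SceneDetector:
    """Hypothetical detector base class (real PySceneDetect classes differ)."""
    # True if the detector emits cuts in a single forward pass, which makes
    # it safe to run on a non-seekable stream such as stdin.
    is_online = True

class ContentDetector(SceneDetector):
    is_online = True   # one-pass: compares each frame to the previous one

class AdaptiveDetector(SceneDetector):
    is_online = False  # two-pass: revisits frame statistics, cannot stream

def check_stream_compatible(detectors):
    """Raise if any detector cannot run on a non-seekable input."""
    offline = [d.__class__.__name__ for d in detectors if not d.is_online]
    if offline:
        raise ValueError(
            "Cannot use with piped input: " + ", ".join(offline))

check_stream_compatible([ContentDetector()])  # accepted
try:
    check_stream_compatible([AdaptiveDetector()])
    rejected = False
except ValueError:
    rejected = True  # errors out, as the task list proposes
```

Keeping the flag on the class (rather than inspecting behavior at runtime) lets the CLI fail fast, before any frames are consumed from the pipe.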
P.S. The PyAV backend also allows processing videos with multiple audio tracks without any issues :)
Edit: Related: PyAV-Org/PyAV#738
Description of Problem & Solution: It would be great if PySceneDetect could support input via pipe.
I've actually already proven that it's possible, and is mostly just an API change, see here: https://github.com/master-of-zen/Av1an/issues/173#issuecomment-706457867
I used a named pipe in this example, but a named pipe and specifying duration in frames was all that was needed to read a piped video stream in Y4M format.
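Y4M works well for this because its header is plain text, so the stream parameters ffmpeg sends over the pipe can be recovered without a container parser. Below is a minimal sketch of parsing a YUV4MPEG2 header (error handling mostly omitted; the sample header bytes are illustrative of what `ffmpeg -f yuv4mpegpipe` emits):

```python
import io

def parse_y4m_header(stream):
    """Parse the YUV4MPEG2 stream header into a dict of parameters."""
    line = stream.readline().rstrip(b"\n").split(b" ")
    if line[0] != b"YUV4MPEG2":
        raise ValueError("not a Y4M stream")
    params = {}
    for token in line[1:]:
        tag, value = token[:1], token[1:]
        if tag == b"W":                     # frame width in pixels
            params["width"] = int(value)
        elif tag == b"H":                   # frame height in pixels
            params["height"] = int(value)
        elif tag == b"F":                   # framerate as "num:den"
            num, den = value.split(b":")
            params["fps"] = int(num) / int(den)
        elif tag == b"C":                   # chroma subsampling / colorspace
            params["colorspace"] = value.decode()
    return params

# Example header for a 1280x720, 29.97 fps stream:
header = io.BytesIO(b"YUV4MPEG2 W1280 H720 F30000:1001 Ip A1:1 C420mpeg2\n")
info = parse_y4m_header(header)
```

Each frame that follows is just `FRAME\n` plus raw planar pixel data of a fixed size, so once the header is read, frame count never needs to be known up front.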
The CLI could follow the standard pattern of `ffmpeg ... - | scenedetect -i -`, where `-` is used to indicate that input should be read from stdin.

Media Examples: Any media will work, generally piped via ffmpeg using the `-f yuv4mpegpipe` format flag.
Proposed Implementation: Aside from supporting input files, PySceneDetect will support raw (or more likely Y4M) input streams instead of requiring an actual file.
This also makes it easier to work around the "multiple audio tracks are not supported" issue, and doesn't force the user to demux the video from the audio tracks (which can save a LOT of disk space for high-res content).
Alternative Solutions: Adding documentation that demonstrates the use of named pipes on multiple platforms (this means Unix AND Windows).
That alternative is okay, but being able to treat PyScenedetect as a sink in a pipe creates a lot of flexibility.