cgohlke / vidsrc

Video frameserver for Numpy
https://pypi.org/project/vidsrc
BSD 3-Clause "New" or "Revised" License

Features, performance and comparison with other interfaces (MSMF, OpenCV) #1

Closed e-d-n-a closed 3 years ago

e-d-n-a commented 3 years ago

It's nice that this binding with a neat interface exists and is still maintained, but how does it compare to more modern options?

DirectShow seems pretty outdated now and, as I understand it, doesn't provide as much support for containers and codecs out-of-the-box as the more modern Media Foundation (MSMF) does.

I checked the corresponding Wikipedia articles to see what I should be able to read via this binding. I can only expect the DirectShow features, as they are entirely separate from the MSMF interfaces, right?

Features "DirectShow" [Wiki]:

Video-Containers: MPEG-1, WMV, AVI, ASF

Codecs: unclear; mostly added through 3rd-party extensions*? Filter/codec selection can become ambiguous when multiple alternatives are installed.

*"[...] Ogg Vorbis, Musepack, and AC3, and some codecs such as MPEG-4 Advanced Simple Profile, AAC, H.264, Vorbis and containers MOV, MP4 are available from 3rd parties like ffdshow, K-Lite, and CCCP"

Possible HEVC/H.265 codec from Microsoft for DirectX (?)

Features "Media Foundation" [Wiki]:

Video-Containers: 3GP, AVI, M4V, MJPEG, MOV, MP4, MP4V, WMV

Codecs: H.264, AAC-LC, HE-AAC v1, HE-AAC v2, MPEG-4 Part 2 ..., MJPEG, DV

So MSMF seems to have offered all the common formats out of the box, in an organized way, since Windows 7, while DirectShow probably needs 3rd-party codec pack(s) to support any common set of formats apart from WMV (and hopefully without conflicts!).
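
A minimal sketch of that comparison, using only the two container lists quoted above from Wikipedia. The variable names are made up for illustration, and the lists say nothing about what a particular machine actually has installed.

```python
# Container lists exactly as quoted above from the Wikipedia articles.
directshow_containers = {"MPEG-1", "WMV", "AVI", "ASF"}
msmf_containers = {"3GP", "AVI", "M4V", "MJPEG", "MOV", "MP4", "MP4V", "WMV"}

# Containers listed for Media Foundation but not for stock DirectShow.
only_msmf = sorted(msmf_containers - directshow_containers)

# Containers listed for stock DirectShow but not for Media Foundation.
only_directshow = sorted(directshow_containers - msmf_containers)

# Containers both lists have in common.
shared = sorted(msmf_containers & directshow_containers)

print("MSMF only:      ", only_msmf)
print("DirectShow only:", only_directshow)
print("Both:           ", shared)
```

Going by these lists alone, the overlap is just AVI and WMV.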

So in which scenarios does this binding still make sense today? Is it still here just because of its simplicity, even though it doesn't provide a complete solution? How do the frameworks differ in performance and reliability in practical applications? Any experience, hints, and recommendations you can share would be welcome, as I couldn't find any documentation beyond the code example.

I just want to figure out whether this exists only as a relic for backwards compatibility with solutions in the realm of DirectX (9).

OpenCV also offers a frame-based interface (but only sequential access, afaik) and connects to various backends on multiple platforms. See: System Overview (v4.5.1), List of Backends* (v4.5.1), Basic Python example code.

*including DirectShow, MSMF, FFMPEG, GStreamer etc.
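
For reference, a rough sketch of what backend selection plus a time-based seek could look like with OpenCV's Python bindings. It assumes the `opencv-python` package is installed; `read_frame_at` is a hypothetical helper name, and whether a given backend (for example `CAP_MSMF` or `CAP_DSHOW`) is available, or lands on exactly the requested frame, depends on the platform and the OpenCV build.

```python
def read_frame_at(path, seconds, backend="CAP_MSMF"):
    """Open `path` with the named OpenCV backend and return the frame
    decoded after seeking to `seconds` (as a NumPy array)."""
    import cv2  # assumption: opencv-python is installed

    # Backend constants such as cv2.CAP_MSMF, cv2.CAP_DSHOW, cv2.CAP_FFMPEG
    # are passed as the second argument (apiPreference).
    cap = cv2.VideoCapture(path, getattr(cv2, backend))
    if not cap.isOpened():
        raise OSError(f"could not open {path!r} with backend {backend}")
    try:
        # Seek by time (milliseconds) rather than by frame number.
        cap.set(cv2.CAP_PROP_POS_MSEC, seconds * 1000.0)
        ok, frame = cap.read()
        if not ok:
            raise OSError(f"could not read a frame at {seconds} s")
        return frame
    finally:
        cap.release()
```

Called as `read_frame_at("clip.mp4", 12.5)`, this would try Media Foundation first; passing `backend="CAP_FFMPEG"` switches to the FFMPEG backend if the build includes it.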

Would it be possible to adapt this kind of simple interface, with basic media info and random frame access, to the MSMF framework instead?

e-d-n-a commented 3 years ago

Supported Formats in DirectShow (MS Docs):

DirectShow is an open architecture, which means that it can support any format as long as there are filters to parse and decode it. The filters provided by Microsoft, either as redistributables through DirectShow or as Windows operating system components, provide default support for the following file and compression formats.

File types:

File Type | More Information
Advanced Systems Format (ASF), including Windows Media Audio (WMA) and Windows Media Video (WMV) | WM ASF Reader Filter, WM ASF Writer Filter
AIFF | WAVE Parser Filter
AU | WAVE Parser Filter
Audio-Video Interleaved (AVI) | AVI Mux Filter, AVI Splitter Filter
MIDI | MIDI Parser Filter, MIDI Renderer Filter
SND | -
WAV | WAVE Parser Filter

Compression formats:

Format | More Information
AAC | Microsoft MPEG-1/DD/AAC Audio Decoder
Cinepak | -
Digital Video (DV) | DV Video Decoder Filter, DV Video Encoder Filter
H.264 | Microsoft MPEG-2 Video Decoder
ISO MPEG-4 video version 1.0 | -
Microsoft MPEG-4 version 3 | -
MJPEG | MJPEG Compressor Filter, MJPEG Decompressor Filter
MPEG Audio Layer-3 (MP3) (decompression only) | -
MPEG-1 layer I and layer II audio | Microsoft MPEG-1/DD/AAC Audio Decoder, MPEG-1 Audio Decoder Filter
MPEG-1 video | MPEG-1 Video Decoder Filter, Microsoft MPEG-2 Video Decoder
MPEG-2 audio | Microsoft MPEG-2 Audio Encoder, Microsoft MPEG-2 Encoder
MPEG-2 video | Microsoft MPEG-2 Encoder, Microsoft MPEG-2 Video Decoder, Microsoft MPEG-2 Video Encoder

For information on the availability of particular third-party codecs for redistribution with DirectShow applications, contact the codec manufacturer.

cgohlke commented 3 years ago

Thanks for all the information. This package was developed in 2006, specifically to read AVI files compressed with the DivX/Xvid codec. The DirectShow IMediaDet interface is now deprecated and so is this package. I have no plan to support MSMF.

https://github.com/cgohlke/vidsrc/blob/5519758c9f9423146773771de6d1b630984b839f/vidsrc/vidsrc.cpp#L63-L67

e-d-n-a commented 3 years ago

So what is usually missing is mostly support for the following:

File Formats:

MP4
3GP(P)
MKV/WEBM
F4V/FLV
MTS/M2TS/TS
OGG/OGV
GIF
RM
VOB?

Codecs:

MPEG-4 Part 2?
HEVC (H.265)
VP8/VP9/AV1
Theora
RealVideo
VC-1
MJ2

(see Video file format @ Wiki for a list of common container formats and supported codecs)

WMPv12 - File Format List (probably using MSMF, see Wiki):

TV program recorded by Microsoft (*.dvr-ms;*.wtv)
Media playlist (*.asn;*.wan;*.m3u;*.wp,*.wyn;*.wmn;*.search-ms)
MIDI File (MIDI) (*.mid;*.rmi;*.midi)
MP4 Audio File (*.m4a)
MP4 Video File (*.mp4;*.m4v;*.mp4v;*.3gp;*.3gpp;*.3g2;*.3gp2)
MPEG-2 TS Video File (*.m2ts;*.m2t;*.mts;*.ts;*.tts)
QuickTime Video File (*.mov)
Video File (MPEG) (*.mpeg;*.mpg;*.mle;*.m2v;*.mod;*.mpar.mpg;*.ifor.vob)
Windows Media File (ASF) (*.asf;*.wm;*.wma;*.wmv;*.wmd)
Windows Audio File (WAV) (*.wav;*.snd;*.au;*.aif;*.aifc;*.aiff;*.wma;*.mp2;*.mp3;*.adts;*.adt;*.aac)
Windows Image File (JPG) (*.jpg;*.jpeg)
Windows Video File (AVI) (*.avi;*.wmv) 

WMPv12 - Video Codec List (via Help/About/Technical Support Information):

ICM | Microsoft RLE | MRLE | msrle32.dll | 6.1.7601.17514
ICM | Microsoft Video 1 | MSVC | msvidc32.dll | 6.1.7601.17514
ICM | Microsoft YUV | UYVY | msyuv.dll | 6.1.7601.17514
ICM | Intel IYUV Codec | IYUV | iyuv_32.dll | 6.1.7601.17514
ICM | Logitech Video (I420) | i420 | lvcodec2.dll | 13.51.823.0
ICM | Toshiba YUV Codec | Y411
ICM | Cinepak Codec by Radius | cvid | iccvid.dll | 1.10.0.13
DMO | Mpeg4s Decoder DMO | mp4s, MP4S, m4s2, M4S2, MP4V, mp4v, XVID, xvid, DIVX, DX50 | mp4sdecd.dll | 6.1.7601.19091
DMO | WMV Screen decoder DMO | MSS1, MSS2 | wmvsdecd.dll | 6.1.7601.19091
DMO | WMVideo Decoder DMO | WMV1, WMV2, WMV3, WMVA, WVC1, WMVP, WVP2 | wmvdecod.dll | 6.1.7601.19091
DMO | Mpeg43 Decoder DMO | mp43, MP43 | mp43decd.dll | 6.1.7601.19091
DMO | Mpeg4 Decoder DMO | MPG4, mpg4, mp42, MP42 | mpg4decd.dll | 6.1.7601.19091
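
A small sketch of how the DMO rows above could be used to check which listed decoder claims a given FourCC. The table is copied from the list above (one Windows 7 machine), the helper names are hypothetical, and the packed little-endian integer form is just one common way FourCC codes are passed around.

```python
# FourCC codes per decoder, copied from the DMO rows of the list above.
DMO_DECODERS = {
    "Mpeg4s Decoder DMO": ["mp4s", "MP4S", "m4s2", "M4S2", "MP4V", "mp4v",
                           "XVID", "xvid", "DIVX", "DX50"],
    "WMV Screen decoder DMO": ["MSS1", "MSS2"],
    "WMVideo Decoder DMO": ["WMV1", "WMV2", "WMV3", "WMVA", "WVC1",
                            "WMVP", "WVP2"],
    "Mpeg43 Decoder DMO": ["mp43", "MP43"],
    "Mpeg4 Decoder DMO": ["MPG4", "mpg4", "mp42", "MP42"],
}


def fourcc_to_str(code):
    """Unpack a little-endian packed FourCC integer into its 4 characters."""
    return int(code).to_bytes(4, "little").decode("ascii")


def find_decoder(fourcc):
    """Return the name of the listed decoder that claims `fourcc`, or None."""
    for name, codes in DMO_DECODERS.items():
        if fourcc in codes:
            return name
    return None


packed = int.from_bytes(b"XVID", "little")
print(fourcc_to_str(packed), "->", find_decoder(fourcc_to_str(packed)))
```

The lookup is case-sensitive on purpose, since the list itself registers upper- and lower-case variants separately.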

e-d-n-a commented 3 years ago

Thanks for all the information. This package was developed in 2006, specifically to read AVI files compressed with the DivX/Xvid codec. The DirectShow IMediaDet interface is now deprecated and so is this package. I have no plan to support MSMF.

https://github.com/cgohlke/vidsrc/blob/5519758c9f9423146773771de6d1b630984b839f/vidsrc/vidsrc.cpp#L63-L67

OK, sad to hear that. I stumbled upon this package because I'm curious about a simple (working) solution for accessing video frames of various video file formats from Python that also allows (fast) seeking. I figured this was a pretty outdated approach, but at least it was mentioned (in the issue below) that DirectShow is precise and probably fast for frame access.

OpenCV and maybe FFMPEG still seem to have issues providing precise and fast access to video frames in some cases (B-frames, variable FPS, certain containers, DTS-based seeking).
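
To illustrate why B-frames and DTS-based seeking can be a problem, here is a toy sketch with made-up timestamps (not taken from any real file). Packets are stored in decode (DTS) order, but frames are shown in presentation (PTS) order, so the n-th packet is not necessarily the n-th displayed frame.

```python
from collections import namedtuple

# kind: frame type, dts: decode timestamp, pts: presentation timestamp
Packet = namedtuple("Packet", "kind dts pts")

# Hypothetical stream in decode order: the P-frame must be decoded
# before the two B-frames that reference it, although it is shown last.
decode_order = [
    Packet("I", -1, 0),
    Packet("P", 0, 3),
    Packet("B", 1, 1),
    Packet("B", 2, 2),
]

# What the viewer sees: the same packets sorted by presentation time.
display_order = sorted(decode_order, key=lambda p: p.pts)


def display_index_of(packet_index):
    """Map a position in decode order to its position in display order."""
    return display_order.index(decode_order[packet_index])


print([p.kind for p in decode_order])   # order in the file
print([p.kind for p in display_order])  # order on screen
print(display_index_of(1))              # where the 2nd packet is shown
```

A reader that counts packets or seeks by DTS would treat the P-frame as frame 1 here, although it is displayed last.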

https://github.com/opencv/opencv/issues/9053 (open)

I couldn't find a Python package that offers direct bindings to MSMF (besides OpenCV). It's also not clear to me right now how to use OpenCV with FFMPEG support without having to compile it myself. (Does your OpenCV package for Python on Windows have FFMPEG support, btw?)

Are you aware of other Python packages for (time-based) video frame access that are not based on OpenCV, FFMPEG or GStreamer, or that at least offer simple (static) bindings to the avXXX DLLs (avformat, avcodec) on Windows?

It's probably wise to look only for time-based (PTS) access, as frame numbering is often neither standardized nor precise. I also noticed that DirectShow is actually time-based, as can be seen here:
https://docs.microsoft.com/en-us/windows/win32/directshow/imediadet-getbitmapbits
https://docs.microsoft.com/en-us/windows/win32/directshow/imediadet-enterbitmapgrabmode
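
A minimal sketch of the difference, assuming the presentation timestamps of all frames are already known as a sorted list of seconds (the values below are invented for a variable-frame-rate clip, and `frame_at_time` is a hypothetical helper):

```python
from bisect import bisect_right


def frame_at_time(pts_list, t):
    """Return the index of the frame on screen at time `t` (seconds),
    i.e. the last frame whose timestamp is <= t."""
    i = bisect_right(pts_list, t) - 1
    if i < 0:
        raise ValueError("t is before the first frame")
    return i


# Invented timestamps: mostly 25 fps, but with a gap after the third frame.
pts = [0.0, 0.04, 0.08, 0.20, 0.24]

t = 0.21
by_time = frame_at_time(pts, t)  # lookup by actual timestamps
by_fps = int(t * 25)             # naive "time * nominal fps" numbering

print(by_time, by_fps)
```

With a constant frame rate both numbers would agree; with the gap they do not, and the naive index even points past the end of this five-frame list.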

e-d-n-a commented 3 years ago

Well, silly me, I should have read a bit further through the OpenCV issue, as I have now stumbled upon

ffms2 – Python bindings for FFmpegSource

which seems to offer exactly the solution I was looking for! :D

cgohlke commented 3 years ago

Glad you found a more modern library.