PyAV-Org / PyAV

Pythonic bindings for FFmpeg's libraries.
https://pyav.basswood-io.com/
BSD 3-Clause "New" or "Revised" License

how to enable multithreaded decoding #232

Closed crackwitz closed 7 years ago

crackwitz commented 7 years ago

Hi!

I had previously picked up a precompiled PyAV for Windows via Anaconda (I think), which did seem to multithread the decoding (demux+decode), because I got ~150 fps on some Full HD camcorder footage. ffmpeg also does multithreaded decoding by default (ffmpeg -i foo.vid -f null - gets me ~220 fps).

However, now that I've built PyAV myself for both plain CPython 2.7 and 3.6, it does not seem to use any threads for decoding (I get ~40-60 fps depending on what else I do in the loop). When I inspect the Python process, it has only a main thread, so it isn't merely limited to a single worker thread; it has no worker threads at all.

What am I doing wrong? Do I need to tell PyAV something in the Python script, or during the build?

I've attached a build log (setup.py install), in case that helps.

build-py3.log.txt
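
The frame rates quoted above can be reproduced with a small timing loop. This is a minimal sketch, assuming PyAV is importable as av; measure_fps and decode_video_frames are hypothetical helper names, not part of PyAV. It times only demux+decode, so anything else done in the loop would lower the number.

```python
import time


def measure_fps(frames):
    """Consume an iterable of frames and return (frame_count, frames_per_second)."""
    start = time.perf_counter()
    count = 0
    for _ in frames:
        count += 1
    elapsed = time.perf_counter() - start
    # Guard against a zero interval on very short inputs.
    fps = count / elapsed if elapsed > 0 else float("inf")
    return count, fps


def decode_video_frames(path):
    """Yield decoded frames of the first video stream (assumes PyAV is installed)."""
    import av  # imported lazily so measure_fps works without PyAV

    container = av.open(path)
    try:
        for frame in container.decode(video=0):
            yield frame
    finally:
        container.close()
```

Usage would look like measure_fps(decode_video_frames("foo.vid")), where foo.vid is the placeholder file name from the ffmpeg command above.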

mikeboers commented 7 years ago

I wonder if it is the change in context, or that I just did some rather substantial changes to the transcoding pipeline.

Could you try your test with a self-built PyAV, but from back at v0.3.3?

crackwitz commented 7 years ago

Self-built v0.3.3 (and v0.3.2) didn't behave any differently.

I'd guess that some condition for threading support is not satisfied in my build setup. I just need to figure out which.

Does this stack trace of the thread, taken at some point within the decode cycle, help?

ntoskrnl.exe!memset+0x64a
ntoskrnl.exe!KeWaitForMultipleObjects+0xd52
ntoskrnl.exe!KeWaitForMutexObject+0x19f
ntoskrnl.exe!_misaligned_access+0xbd4
ntoskrnl.exe!_misaligned_access+0x186d
ntoskrnl.exe!IoFreeErrorLogEntry+0x287
avcodec-57.dll!avpriv_exif_decode_ifd+0x7985d
avcodec-57.dll!avpriv_exif_decode_ifd+0xc657c
avcodec-57.dll!avpriv_exif_decode_ifd+0xc7aa0
avcodec-57.dll!avpriv_exif_decode_ifd+0xcde43
avcodec-57.dll!avcodec_decode_video2+0x210
avcodec-57.dll!avcodec_decode_audio4+0xac0
avcodec-57.dll!avcodec_send_packet+0xac
context.cp36-win_amd64.pyd!PyInit_context+0x5d0e
context.cp36-win_amd64.pyd!PyInit_context+0x39f4
stream.cp36-win_amd64.pyd!PyInit_stream+0x28af
stream.cp36-win_amd64.pyd!PyInit_stream+0x2afa
python36.dll!PyCFunction_Call+0xd7
packet.cp36-win_amd64.pyd+0x39df
packet.cp36-win_amd64.pyd!PyInit_packet+0x24b7
packet.cp36-win_amd64.pyd!PyInit_packet+0x271a
python36.dll!PyCFunction_FastCallDict+0x305
python36.dll!PyObject_GenericGetAttr+0xa3
python36.dll!PyEval_EvalFrameDefault+0x3bf
python36.dll!PyErr_Occurred+0x1aa
python36.dll!PyEval_EvalCodeEx+0x8e
python36.dll!PyEval_EvalCode+0x2d
python36.dll!PyArena_Free+0xa7
python36.dll!PyRun_FileExFlags+0xb5
python36.dll!PyRun_SimpleFileExFlags+0x231
python36.dll!PyRun_AnyFileExFlags+0x63
python36.dll!Py_hashtable_size+0x5140
python36.dll!Py_FatalError+0x1ed4a
python.exe+0x126d
kernel32.dll!BaseThreadInitThunk+0xd
ntdll.dll!RtlUserThreadStart+0x21

mikeboers commented 7 years ago

To confirm, that stack trace is from a recent commit, right? Otherwise I wouldn't expect to see PyAV appearing to call avcodec_send_packet.

Nothing stands out to me. To be honest, I've not developed on Windows in ages, and I'm not really sure how the Windows build works and what it may or may not do with regard to threads. 😟

mikeboers commented 7 years ago

Summoning @caspervdw ...

crackwitz commented 7 years ago

I'm not sure exactly which commit that trace was from, but this one is from the most recent commit of this repo on GitHub:

ntoskrnl.exe!memset+0x64a
ntoskrnl.exe!KeWaitForMultipleObjects+0xd52
ntoskrnl.exe!KeWaitForMutexObject+0x19f
ntoskrnl.exe!_misaligned_access+0xbd4
ntoskrnl.exe!_misaligned_access+0x186d
ntoskrnl.exe!IoFreeErrorLogEntry+0x287
avcodec-57.dll!avpriv_exif_decode_ifd+0x7ab37
avcodec-57.dll!avpriv_exif_decode_ifd+0xc657c
avcodec-57.dll!avpriv_exif_decode_ifd+0xc7aa0
avcodec-57.dll!avpriv_exif_decode_ifd+0xcde43
avcodec-57.dll!avcodec_decode_video2+0x210
avcodec-57.dll!avcodec_decode_audio4+0xac0
avcodec-57.dll!avcodec_send_packet+0xac
context.cp36-win_amd64.pyd!PyInit_context+0x5d0e
context.cp36-win_amd64.pyd!PyInit_context+0x39f4
stream.cp36-win_amd64.pyd!PyInit_stream+0x28af
stream.cp36-win_amd64.pyd!PyInit_stream+0x2afa
python36.dll!PyCFunction_Call+0xd7
packet.cp36-win_amd64.pyd+0x39df
packet.cp36-win_amd64.pyd!PyInit_packet+0x24b7
packet.cp36-win_amd64.pyd!PyInit_packet+0x271a
python36.dll!PyCFunction_FastCallDict+0x305
python36.dll!PyObject_GenericGetAttr+0xa3
python36.dll!PyEval_EvalFrameDefault+0x3bf
python36.dll!PyErr_Occurred+0x1aa
python36.dll!PyEval_EvalCodeEx+0x8e
python36.dll!PyEval_EvalCode+0x2d
python36.dll!PyArena_Free+0xa7
python36.dll!PyRun_FileExFlags+0xb5
python36.dll!PyRun_SimpleFileExFlags+0x231
python36.dll!PyRun_AnyFileExFlags+0x63
python36.dll!Py_hashtable_size+0x5140
python36.dll!Py_FatalError+0x1ed4a
python.exe+0x126d
kernel32.dll!BaseThreadInitThunk+0xd
ntdll.dll!RtlUserThreadStart+0x21

I'm not sure what Windows or MSVC needs either. If anyone else has an idea, I'd be grateful. Otherwise, I might eventually find some time and dig into this...

caspervdw commented 7 years ago

I also don't know what to expect from MSVC/Windows with respect to multithreading.

However, the difference between the Anaconda version and the self-built version is surprising. The Anaconda build is performed automatically on Appveyor, which uses Windows in combination with the correct MSVC versions. In case you are interested, see the conda-forge build recipe of PyAV. The ffmpeg "recipe" is also at conda-forge, but it basically just mirrors the Zeranoe (MinGW-w64) builds. We haven't figured out how to build ffmpeg using MSVC.

Are you 100% sure about the functional difference between v0.3.3 from Anaconda (conda install -c conda-forge av=0.3.3) and your self-built version?

As an intermediate step, you could use precisely the same ffmpeg version as the Anaconda build does (2.8.6 shared+dev packages from Zeranoe) and then compile PyAV pointing to the correct directory using e.g. the --ffmpeg-dir=C:\ffmpeg option.

crackwitz commented 7 years ago

I'm positive I've had a package of PyAV that decoded my material at 150 fps using much of my CPU, and the self-built PyAV using the 3.3.1 ffmpeg binaries is not even coming close to that, and not using any threads except the main thread.

I suspect it might be the ffmpeg version difference. I have 3.3.1 binaries, while the build script you linked asks for 2.8.6. I would have tried that version if Zeranoe still had builds from that time. I think I'll scrape them out of the conda package (which has 2.8.6).

It might be worthwhile to investigate the API changes from ffmpeg 2.8 to 3.3. There must be something that either fails to configure multithreading, or isn't nudged the right way.
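
When comparing two builds like this, it helps to confirm which FFmpeg libraries the av module actually loaded. This is a minimal sketch, assuming a PyAV release that exposes the av.library_versions mapping; format_versions and linked_ffmpeg_versions are hypothetical helper names. The values are libav* library versions (for example, libavcodec major version 57, as in avcodec-57.dll in the traces above), not the ffmpeg release number.

```python
def format_versions(versions):
    """Render a {library: (major, minor, micro)} mapping as sorted text lines."""
    return [
        "{} {}".format(name, ".".join(str(part) for part in version))
        for name, version in sorted(versions.items())
    ]


def linked_ffmpeg_versions():
    """Return the library versions PyAV reports (assumes PyAV is installed)."""
    import av  # imported lazily so format_versions works without PyAV

    return dict(av.library_versions)
```

Printing "\n".join(format_versions(linked_ffmpeg_versions())) under each build would show whether the two environments really load different library versions.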

crackwitz commented 7 years ago

OK, so I scraped the 2.8.6-5 DLLs from the ffmpeg conda-forge package, built PyAV git head against them, put the scraped DLLs in the path, and it decodes using multiple threads at 165 fps.

So the difference must come from some change in ffmpeg between 2.8.6 and 3.3.1.

I think creating a new issue along the lines of "update ffmpeg bindings" would be nice. I consider this issue "solved" :)
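
For readers on a later PyAV release: threading can be requested explicitly from the script. This is a minimal sketch, assuming a PyAV version whose codec context exposes the thread_type and thread_count attributes; whether the versions discussed in this thread had them is not established here. enable_threaded_decoding and decode_with_threads are hypothetical helper names.

```python
def enable_threaded_decoding(codec_context, thread_type="AUTO", thread_count=0):
    """Request threaded decoding on a codec context before decoding starts.

    thread_count=0 is assumed to mean "let FFmpeg choose the number of threads".
    """
    codec_context.thread_type = thread_type
    codec_context.thread_count = thread_count
    return codec_context


def decode_with_threads(path):
    """Yield frames of the first video stream with threading requested.

    Assumes PyAV is installed and that the attributes above are supported.
    """
    import av  # imported lazily so the helper above works without PyAV

    container = av.open(path)
    try:
        stream = container.streams.video[0]
        # Configure threading before the first frame is decoded.
        enable_threaded_decoding(stream.codec_context)
        for frame in container.decode(stream):
            yield frame
    finally:
        container.close()
```

The settings are applied before the first call to decode because threading options generally have to be in place before the decoder is opened.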