Set of Python bindings to C++ libraries which provides full HW acceleration for video decoding, encoding and GPU-accelerated color space and pixel format conversions
Apache License 2.0
makefromDevicePtrUint8 returns inaccurate data when running in multiple processes #546
I am using PyNvDecoder to decode video and makefromDevicePtrUint8() to wrap the decoded frames as torch.Tensor objects. This works fine in a single process, but produces strange, apparently corrupted frames when running in multiple processes. After some investigation, we found that makefromDevicePtrUint8 sometimes returns inaccurate data when running in multiple processes.
Specifically, when downloading a surface from GPU to CPU, makefromDevicePtrUint8().cpu().numpy() and PySurfaceDownloader produce different results, and the result from makefromDevicePtrUint8 is the corrupted one.
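To quantify the discrepancy described above, the two host-side copies of a frame can be compared byte by byte. The following is only a minimal sketch and not part of the original report: `compare_frames` is a hypothetical helper, and it assumes both buffers have already been converted to `bytes` with the same layout and no row padding.

```python
from typing import Optional, Tuple


def compare_frames(
    a: bytes, b: bytes, width: int
) -> Tuple[int, Optional[Tuple[int, int]]]:
    """Compare two host copies of the same frame (hypothetical helper).

    Returns the number of differing bytes and the (row, col) position of
    the first difference, or None if the buffers are identical.
    Assumes both buffers are tightly packed rows of `width` bytes.
    """
    if width <= 0:
        raise ValueError("width must be positive")
    if len(a) != len(b):
        raise ValueError(f"size mismatch: {len(a)} vs {len(b)} bytes")

    mismatches = 0
    first = None
    for offset, (x, y) in enumerate(zip(a, b)):
        if x != y:
            mismatches += 1
            if first is None:
                # Convert the flat byte offset into a row/column position.
                first = divmod(offset, width)
    return mismatches, first


# Synthetic stand-ins for the two downloads of one frame.
reference = bytes(range(12))   # e.g. the PySurfaceDownloader result
suspect = bytearray(reference)  # e.g. the makefromDevicePtrUint8 result
suspect[5] ^= 0xFF              # simulate two corrupted bytes
suspect[9] ^= 0xFF

count, first_diff = compare_frames(reference, bytes(suspect), width=4)
print(count, first_diff)
```

In the scenario from this issue, `a` would presumably be the bytes of the array filled by PySurfaceDownloader and `b` the bytes of makefromDevicePtrUint8().cpu().numpy(), for example via NumPy's `tobytes()`. Logging the mismatch count and first differing position per frame and per process may help show whether the corruption is localized or affects whole frames.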
environment:
reproduce code:
sample video: samplevideo