chenc29xpeng closed this issue 3 weeks ago
Hello @chenc29xpeng,
thanks for creating the issue. The file you linked uses the H.264 codec, which is not lossless. H.264 decoders have certain liberties when it comes to the final result, which is why you see different results from different decoders. OpenCV uses the same decoder as FFmpeg under the hood, which is why their results are identical. DALI uses the NVIDIA Video Codec SDK to decode videos, which is a slightly different implementation.
@awolant Thanks for your quick reply. But the same problem occurs when I use an H.265 file. The H.265 codec is lossless, so in this case, which image is correct after decoding: the one from opencv or the one from DALI? h265 mp4: https://drive.google.com/file/d/191-cqQPCkpDoVITA9jujqts7lryij0f7/view?usp=drive_link
Hello @chenc29xpeng H.265 isn't lossless either - there are no lossless video codecs in common use. Both H.264 and H.265 are predictive codecs - a part of the codec operates in the loop, where the previous decoded frame is used to decode the next one. That part has strictly defined output - but it's not the final result of decoding. The final RGB image is obtained by applying chroma upsampling, color space conversion and possibly some postprocessing filters. It's that part that may be different across implementations. Also, I would be very cautious assuming that a particular implementation (especially OpenCV) is somehow the "correct" one based on popularity.
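To illustrate the color space conversion step mentioned above, here is a minimal sketch that converts the same decoded YCbCr sample to RGB with two different standard coefficient sets (BT.601 and BT.709, limited range). The function name `ycbcr_to_rgb` and the sample values are made up for illustration, and nothing here claims which matrix OpenCV or DALI actually applies; it only shows that an identical YUV sample can legitimately map to different RGB values.

```python
def ycbcr_to_rgb(y, cb, cr, kr, kb):
    """Convert one 8-bit limited-range YCbCr sample to 8-bit RGB.

    kr and kb are the luma coefficients of the chosen standard.
    """
    kg = 1.0 - kr - kb
    # Normalize limited-range codes: Y in [16, 235], Cb/Cr in [16, 240].
    yn = (y - 16) / 219.0
    pb = (cb - 128) / 224.0
    pr = (cr - 128) / 224.0
    r = yn + 2.0 * (1.0 - kr) * pr
    b = yn + 2.0 * (1.0 - kb) * pb
    # Green follows from the luma definition Y = kr*R + kg*G + kb*B.
    g = (yn - kr * r - kb * b) / kg
    # Scale to 8 bits, round, and clamp to the valid range.
    return tuple(min(255, max(0, round(255.0 * c))) for c in (r, g, b))


BT601 = (0.299, 0.114)     # (kr, kb) for BT.601
BT709 = (0.2126, 0.0722)   # (kr, kb) for BT.709

pixel = (120, 90, 200)     # hypothetical (Y, Cb, Cr) sample
rgb_601 = ycbcr_to_rgb(*pixel, *BT601)
rgb_709 = ycbcr_to_rgb(*pixel, *BT709)
max_diff = max(abs(a - b) for a, b in zip(rgb_601, rgb_709))
print(rgb_601, rgb_709, max_diff)
```

Neutral gray samples (Cb = Cr = 128) come out identical under both matrices, so differences of this kind show up mainly in saturated regions of the frame.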
@mzient So what you mean is that it is normal for the pixel values to be different after decoding between opencv and dali? Sorry, I need to confirm this conclusion.
Hi @chenc29xpeng,
it is normal for the pixel values to be different after decoding between opencv and dali? Sorry, I need to confirm this conclusion.
What the standard defines is what the raw YUV output should look like. The conversion from YUV to RGB (interpolation and conversion to a different color space) is subject to numerical differences, and different interpolation methods may be used to improve the perceived quality of the produced images. So it is expected that OpenCV/FFmpeg and DALI (which uses NVDEC under the hood) can yield different, yet still valid, results.
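As a rough illustration of the interpolation point, the sketch below upsamples one row of subsampled chroma in two ways: by repeating samples and by linear interpolation. Both helper names and the sample row are hypothetical, and the actual filters used by FFmpeg or NVDEC are not claimed here; the point is only that two reasonable upsampling choices give different chroma values before the RGB conversion even starts.

```python
def upsample_nearest(row):
    """Double the length of a chroma row by repeating each sample."""
    return [v for v in row for _ in range(2)]


def upsample_linear(row):
    """Double the length of a chroma row with linear interpolation.

    Each input sample produces two outputs weighted 3/4 toward itself and
    1/4 toward its left or right neighbor; edges are clamped.
    """
    out = []
    n = len(row)
    for i, v in enumerate(row):
        left = row[max(i - 1, 0)]
        right = row[min(i + 1, n - 1)]
        out.append(round(0.75 * v + 0.25 * left))
        out.append(round(0.75 * v + 0.25 * right))
    return out


chroma = [100, 200, 100, 100]   # hypothetical subsampled chroma row
near = upsample_nearest(chroma)
lin = upsample_linear(chroma)
max_diff = max(abs(a - b) for a, b in zip(near, lin))
print(near)
print(lin)
print("max abs difference:", max_diff)
```

On flat regions the two methods agree; they diverge around color edges, which is where per-pixel comparisons between decoders tend to show the largest gaps.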
@JanuszL Thank you for your thoughtful explanation, I got the answer I wanted.
Describe the question.
Description
I'm doing lossless decoding of mp4 to images. I want to know if DALI's GPU-based video decoding is lossy. I compared the image pixels decoded by opencv and DALI and found that the two are not the same; the maximum difference is 43. Why is there a difference? By the way, I think the image decoded by opencv is lossless, because I compared the pixel values of the images decoded by ffmpeg-python and opencv, and they are exactly the same. Can anyone help check the code or give a reasonable explanation?
Video
The video I used is the official example: https://github.com/NVIDIA/DALI_extra/blob/main/db/optical_flow/sintel_trailer/sintel_trailer_short.mp4
Code