OpenVisualCloud / SVT-HEVC

SVT HEVC encoder. Scalable Video Technology (SVT) is a software-based video coding technology that is highly optimized for Intel® Xeon® processors. Using the open source SVT-HEVC encoder, it is possible to spread video encoding processing across multiple Intel® Xeon® processors to achieve a real advantage of processing efficiency.

AVX512_changes #478

Open deeptiag1 opened 4 years ago

deeptiag1 commented 4 years ago

Signed-off-by: deeptiag1 deepti.aggarwal@intel.com

1480c1 commented 4 years ago

@deeptiag1, you may want to check your git config. It seems you might have misspelled your email, unless your email domain is intel.comm

tianjunwork commented 4 years ago

Hi @deeptiag1, since the AVX512 code doesn't execute when running the encoder, our CI can't validate your code. Could you let us know what tests you ran and the results? Thank you very much for your contribution. Could you also give this PR a more specific title? Thank you.

deeptiag1 commented 4 years ago

Hi @tianjunwork, enable the AVX512 flag in EbDefinitions and then run the same tests as one would run for any build. Let me know if all tests pass.

intelmark commented 4 years ago

From @hassount 3/17/2020:

From the results below, it seems that we need to disable the optimization for the LumaInterpolationFilterOneDOutRawHorizontal_AVX512 kernel and make sure that the documentation contains a clear reference to the gcc version requirement when VNNI is on. After that, I don't see any blockers to this PR moving forward, assuming functional testing is passing.

The current PR shows:

    #ifdef VNNI_SUPPORT
    #ifndef NON_AVX512_SUPPORT
    #define LumaInterpolationFilterOneDOutRawHorizontal LumaInterpolationFilterOneDOutRawHorizontal_AVX512
    #else
    #define LumaInterpolationFilterOneDOutRawHorizontal LumaInterpolationFilterOneDOutRawHorizontal_SSSE3
    #endif

But isn't VNNI a subset of AVX512? Are all combinations of the VNNI_SUPPORT and AVX512_SUPPORT defines possible?

Also, there doesn't seem to be any mention of the gcc requirements for compiling the VNNI-specific code anywhere in the PR.

deeptiag1 commented 4 years ago

Defining VNNI_SUPPORT in EbDefinitions enables the VNNI code committed in this PR.

Perf data:

Encoder mode: 0
reference_avx512: branch vnni_changes_1 @ d92a7c4
vnni_avx512: reference_avx512 + VNNI_SUPPORT + LumaInterpolationFilterOneDOutRawHorizontal_SSSE3

| Build | Speed |
| --- | --- |
| reference_avx512 | 0.95 fps |
| vnni_avx512 | 0.96 fps |

Speed gain: 1%

Encoder mode: 0
reference_avx2: branch vnni_changes_1 @ d92a7c4 + NON_AVX512_SUPPORT
vnni_avx2: reference_avx2 + VNNI_SUPPORT + LumaInterpolationFilterOneDOutRawHorizontal_SSSE3

| Build | Speed |
| --- | --- |
| reference_avx2 | 0.96 fps |
| vnni_avx2 | 0.97 fps |

Speed gain: 1%

intelmark commented 4 years ago

@deeptiag1 Regarding the performance data: great to see the speedup, but what size video was used, and what command line was it run with?