apache / incubator-gluten

Gluten is a middle layer responsible for offloading JVM-based SQL engines' execution to native engines.
https://gluten.apache.org/
Apache License 2.0
1.22k stars, 437 forks

[VL] Clean up some legacy code related to USE_AVX512 #7956

Closed by PHILO-HE 4 days ago

PHILO-HE commented 1 week ago

What changes were proposed in this pull request?

This is legacy code inherited from Gazelle. It is also not the right place for this compiler setting if we want it applied when compiling all native code.

github-actions[bot] commented 1 week ago

Thanks for opening a pull request!

Could you open an issue for this pull request on Github Issues?

https://github.com/apache/incubator-gluten/issues

Then could you also rename commit message and pull request title in the following format?

[GLUTEN-${ISSUES_ID}][COMPONENT]feat/fix: ${detailed message}

PHILO-HE commented 1 week ago

@zhouyuan

FelixYBW commented 6 days ago

Let's add -mno-avx512f -mbmi2 to the compile flags, to align with Velox.

surnaik commented 5 days ago

Currently we're passing -march=native, which doesn't seem to respect -mno-avx512f.

surnaik commented 5 days ago

Using -march=native generates ASM code with vmovdqu8 instructions when the build CPU supports AVX512, which makes the binary incompatible with older CPUs and forces the build machine and the run machine to be of the same type. I explicitly passed -mno-avx512f -mbmi2, but it had no effect: AVX512 instructions were still present in the ASM code. Any idea why -mno-avx512f isn't respected?

PHILO-HE commented 4 days ago

> Using -march=native generates ASM code with vmovdqu8 instructions when the build CPU supports AVX512, which makes the binary incompatible with older CPUs and forces the build machine and the run machine to be of the same type. I explicitly passed -mno-avx512f -mbmi2, but it had no effect: AVX512 instructions were still present in the ASM code. Any idea why -mno-avx512f isn't respected?

@surnaik, -mno-avx512f only disables a subset of AVX512 instructions. I recommend dropping -march=native when cross-compiling and setting the correct target CPU for the compiler. For example, you can set -march=haswell if your machines are Haswell. If your cluster has diverse CPU architectures, you can just use a generic setting for the compiler. Alternatively, to make the binary fully optimized for newer CPU architectures, I think you have to build Gluten separately with a different -march for each CPU architecture, instead of compiling once for universal use.

FelixYBW commented 3 days ago

> @surnaik, -mno-avx512f only disables a subset of AVX512 instructions. I recommend dropping -march=native when cross-compiling and setting the correct target CPU for the compiler. For example, you can set -march=haswell if your machines are Haswell. If your cluster has diverse CPU architectures, you can just use a generic setting for the compiler. Alternatively, to make the binary fully optimized for newer CPU architectures, I think you have to build Gluten separately with a different -march for each CPU architecture, instead of compiling once for universal use.

-mno-avx512f should take effect after -march=native.