mlc-ai / relax

Apache License 2.0

[LLVM] Expose Host CPU Feature Detection #229

Closed junrushao closed 1 year ago

junrushao commented 1 year ago

A small script that exposes host CPU name, target triple and features:

```python
import tvm


def main():
    get_default_target_triple = tvm._ffi.get_global_func("tvm.codegen.llvm.GetDefaultTargetTriple")
    get_process_triple = tvm._ffi.get_global_func("tvm.codegen.llvm.GetProcessTriple")
    get_host_cpu_name = tvm._ffi.get_global_func("tvm.codegen.llvm.GetHostCPUName")
    get_host_cpu_features = tvm._ffi.get_global_func("tvm.codegen.llvm.GetHostCPUFeatures")
    target_triple = get_default_target_triple()
    process_triple = get_process_triple()
    host_cpu_name = get_host_cpu_name()
    host_cpu_features = get_host_cpu_features()
    print("target_triple: {}".format(target_triple))
    print("process_triple: {}".format(process_triple))
    print("host_cpu_name: {}".format(host_cpu_name))
    print("host_cpu_features:")
    for name, value in host_cpu_features.items():
        print(" {}: {}".format(name, bool(value)))


if __name__ == "__main__":
    main()
```

Output (AMD CPU):

```
target_triple: x86_64-unknown-linux-gnu
process_triple: x86_64-unknown-linux-gnu
host_cpu_name: znver2
host_cpu_features:
 xsaveopt: True
 tsxldtrk: False
 sse: True
 movdiri: False
 mmx: True
 pku: False
 amx-int8: False
 amx-tile: False
 rdpid: True
 avx512vbmi2: False
 cmov: True
 widekl: False
 f16c: True
 bmi: True
 gfni: False
 avx512cd: False
 movdir64b: False
 rdseed: True
 clwb: True
 avx512er: False
 avx512f: False
 sse4.2: True
 avxifma: False
 sse2: True
 avx512vp2intersect: False
 prfchw: True
 avx512pf: False
 vaes: False
 waitpkg: False
 amx-bf16: False
 prefetchi: False
 uintr: False
 fxsr: True
 bmi2: True
 lzcnt: True
 avx512vbmi: False
 avx512bf16: False
 prefetchwt1: False
 xsaves: True
 movbe: True
 rtm: False
 pclmul: True
 hreset: False
 sahf: True
 fma4: False
 xop: False
 vpclmulqdq: False
 sgx: False
 avx512vnni: False
 popcnt: True
 xsavec: True
 aes: True
 avx512vpopcntdq: False
 kl: False
 avx512bitalg: False
 xsave: True
 avxvnni: False
 raoint: False
 clflushopt: True
 sse4a: True
 avx512bw: False
 cx16: True
 avxvnniint8: False
 amx-fp16: False
 cldemote: False
 rdrnd: True
 ptwrite: False
 rdpru: True
 avx: True
 adx: True
 avx512vl: False
 pconfig: False
 shstk: False
 64bit: True
 crc32: True
 sha: True
 cmpccxadd: False
 tbm: False
 serialize: False
 mwaitx: True
 avx512ifma: False
 avx512fp16: False
 clzero: True
 avx2: True
 cx8: True
 fma: True
 lwp: False
 enqcmd: False
 wbnoinvd: True
 sse4.1: True
 avx512dq: False
 ssse3: True
 fsgsbase: True
 invpcid: False
 sse3: True
 avxneconvert: False
```

Note that LLVM doesn't guarantee that automatic feature detection always succeeds. For newer CPU models and older LLVM builds in particular (e.g. M2 CPU + LLVM 16), the result is usually inaccurate. When CPU feature detection fails, we print a warning message and return an empty dict instead.
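
Since an empty dict means "detection failed" rather than "no features", callers may want to tell those two cases apart. Below is a minimal sketch of one way to do that. It takes a plain mapping shaped like the result of `get_host_cpu_features()`; the helper names `has_feature` and `pick_vector_isa`, and the fallback policy itself, are hypothetical and not part of this PR.

```python
from typing import Mapping, Optional


def has_feature(features: Mapping[str, bool], name: str) -> Optional[bool]:
    """Return True/False if `name` was reported, or None if it is unknown.

    An empty mapping is treated as "detection failed", so every query
    returns None instead of a misleading False.
    """
    if not features or name not in features:
        return None
    return bool(features[name])


def pick_vector_isa(features: Mapping[str, bool]) -> str:
    """Hypothetical policy: use the widest ISA that was positively detected,
    and fall back to a conservative default when nothing is known."""
    if has_feature(features, "avx512f"):
        return "avx512"
    if has_feature(features, "avx2"):
        return "avx2"
    return "generic"
```

With the znver2 output above this would select `"avx2"`; with an empty dict it would select `"generic"`.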

To properly detect CPU features on a MacBook (macOS), the system-provided commands below are the most accurate:

sysctl -a machdep.cpu
sysctl -a hw.optional
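
If the `sysctl` output needs to be consumed from Python, a small parser is enough. The sketch below assumes the usual macOS `name: value` line format. The helper names `parse_sysctl` and `query_sysctl` are hypothetical, and the sample keys in the usage note are only illustrative.

```python
import subprocess
from typing import Dict


def parse_sysctl(text: str) -> Dict[str, str]:
    """Parse `name: value` lines into a dict, skipping lines without a colon."""
    result = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            result[key.strip()] = value.strip()
    return result


def query_sysctl(prefix: str) -> Dict[str, str]:
    """Run `sysctl -a <prefix>` (macOS only) and parse its output."""
    completed = subprocess.run(
        ["sysctl", "-a", prefix],
        check=True,
        capture_output=True,
        text=True,
    )
    return parse_sysctl(completed.stdout)
```

For example, `parse_sysctl("hw.optional.neon: 1\nmachdep.cpu.brand_string: Apple M2\n")` yields a dict with the two keys mapped to `"1"` and `"Apple M2"`. On a Mac, `query_sysctl("hw.optional")` would run the second command listed above.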

On Linux, it is usually recommended to query the kernel directly via:

cat /proc/cpuinfo
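
The same information can be read programmatically. The sketch below collects the feature flags of the first CPU entry, assuming the `flags` line found on x86 kernels (or `Features` on ARM kernels). The helper names `parse_cpuinfo_flags` and `read_host_flags` are hypothetical.

```python
from typing import Set


def parse_cpuinfo_flags(text: str) -> Set[str]:
    """Return the flag set from the first `flags` / `Features` line, if any."""
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() in ("flags", "Features"):
            return set(value.split())
    return set()


def read_host_flags(path: str = "/proc/cpuinfo") -> Set[str]:
    """Read and parse the flags of the current machine (Linux only)."""
    with open(path, "r", encoding="utf-8") as f:
        return parse_cpuinfo_flags(f.read())
```

For example, `"avx2" in read_host_flags()` checks for AVX2 on the current machine. Note that the kernel's flag names do not always match LLVM's spelling (e.g. `sse4_2` in `/proc/cpuinfo` versus `sse4.2` in the output above), so a direct comparison needs a name mapping.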