change build_tensor logic and unique dtype

bytedance / ByteMLPerf

AI Accelerator Benchmark focuses on evaluating AI Accelerators from a practical production perspective, including the ease of use and versatility of software and hardware.

https://bytemlperf.ai/

Apache License 2.0

188 stars, 50 forks

change build_tensor logic and unique dtype #54

Closed: HanTengfei99 closed this 6 months ago

HanTengfei99 commented 6 months ago

Add a layernorm op
Remove bf16 from the unique op's dtypes, since it is unsupported on CUDA
Add more shapes for the gemm ops
Refactor the memory bandwidth computation logic for various ops
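
For readers unfamiliar with what a per-op memory bandwidth computation typically looks like, here is a minimal sketch for a gemm op. It is an assumption-based illustration, not the code from this PR: the helper name `gemm_memory_bandwidth_gbps`, the `DTYPE_BYTES` table, and the choice to count each operand as read or written exactly once are all hypothetical.

```python
# Hypothetical sketch; names and the byte-counting model are assumptions,
# not taken from the ByteMLPerf source.

# Size in bytes of one element for a few common dtypes.
DTYPE_BYTES = {"float32": 4, "float16": 2, "bfloat16": 2, "int8": 1}


def gemm_memory_bandwidth_gbps(m, k, n, dtype, latency_us):
    """Estimate achieved memory bandwidth for C[m, n] = A[m, k] @ B[k, n].

    Assumes A and B are each read once and C is written once,
    all in the same dtype.
    """
    elem_size = DTYPE_BYTES[dtype]
    bytes_read = (m * k + k * n) * elem_size   # inputs A and B
    bytes_written = m * n * elem_size          # output C
    total_bytes = bytes_read + bytes_written
    seconds = latency_us * 1e-6                # microseconds -> seconds
    return total_bytes / seconds / 1e9         # bytes/s -> GB/s


if __name__ == "__main__":
    bw = gemm_memory_bandwidth_gbps(1024, 1024, 1024, "float16", latency_us=100.0)
    print(f"{bw:.2f} GB/s")
```

Under these assumptions, a 1024 x 1024 x 1024 float16 gemm measured at 100 microseconds moves about 6.3 MB and yields roughly 62.9 GB/s. A real benchmark may count bytes differently, for example when the output dtype differs from the input dtype.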