bytedance / ByteMLPerf

AI Accelerator Benchmark focuses on evaluating AI Accelerators from a practical production perspective, including the ease of use and versatility of software and hardware.
https://bytemlperf.ai/
Apache License 2.0
188 stars, 50 forks

change build_tensor logic and unique dtype #54

Closed: HanTengfei99 closed this 6 months ago

HanTengfei99 commented 6 months ago
  1. Add a layernorm op.
  2. Remove bf16 from the unique op's dtypes, since it is unsupported on CUDA.
  3. Add more shapes for the gemm ops.
  4. Refactor the memory bandwidth computation logic for various ops.
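
For reviewers unfamiliar with item 4, here is a rough sketch of how a per-op memory bandwidth figure could be derived for gemm: count the bytes read and written, then divide by the measured latency. This is not the code from this PR. The names `DTYPE_BYTES`, `tensor_bytes`, and `gemm_memory_bandwidth_gbps` are hypothetical, and the model assumes each tensor is moved exactly once (no cache reuse, no padding).

```python
from math import prod

# Bytes per element for a few common dtypes (illustrative subset, an assumption).
DTYPE_BYTES = {"float32": 4, "float16": 2, "bfloat16": 2, "int8": 1}


def tensor_bytes(shape, dtype):
    """Size in bytes of a dense tensor with the given shape and dtype."""
    return prod(shape) * DTYPE_BYTES[dtype]


def gemm_memory_bandwidth_gbps(m, k, n, dtype, latency_us):
    """Estimate achieved memory bandwidth (GB/s) for C[m, n] = A[m, k] @ B[k, n].

    Assumes A and B are each read once and C is written once.
    """
    read_bytes = tensor_bytes((m, k), dtype) + tensor_bytes((k, n), dtype)
    write_bytes = tensor_bytes((m, n), dtype)
    total_bytes = read_bytes + write_bytes
    seconds = latency_us * 1e-6
    return total_bytes / seconds / 1e9


if __name__ == "__main__":
    # Hypothetical measurement: a 1024 x 1024 x 1024 fp16 gemm taking 100 us.
    bw = gemm_memory_bandwidth_gbps(1024, 1024, 1024, "float16", 100.0)
    print(f"{bw:.2f} GB/s")
```

Keeping the byte counting in one helper means other ops such as layernorm only need to list their own input and output shapes.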