Import matmul/conv/attention tests from iree/tests/e2e/

ScottTodd commented 3 months ago

Could start with what already exists, including the C++ binary files and CMake build system:

Or... run the generator scripts and jump straight to a prototype replacement test runner (pytest? other cmake functions?)

The test generators seem to be using mostly upstream dialects (e.g. linalg.conv_2d_nchw_fchw). The build system usage does some fancy things with flags though... should watch out for layering in there. For example:

  COMPILER_FLAGS
    "--iree-opt-data-tiling=false"
    "--iree-llvmcpu-enable-scalable-vectorization"
    "--iree-llvmcpu-target-triple=aarch64-unknown-unknown"
    "--iree-preprocessing-pass-pipeline=builtin.module\(util.func\(iree-preprocessing-transpose-matmul-pass{input=lhs}\)\)"
  LABELS
    "requires-arm-sme"
  TARGET_CPU_FEATURES_VARIANTS
    "arm_64:sme:+sve,+sme"

See this related thread on Discord: https://discord.com/channels/689900678990135345/1270451599231156266
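
One way to keep flags like these out of the CMake layering is to store them in a plain-text flagfile next to the test case. The sketch below is only an illustration: the helper names `write_flagfile` and `read_flagfile` are hypothetical, the one-flag-per-line layout is an assumption, and how a given tool would consume the file is not shown. It also assumes the backslashes in the snippet above are CMake escaping only, so the parentheses are written unescaped.

```python
from pathlib import Path

# Flags copied from the CMake snippet above, with the CMake-only
# backslash escapes removed (assumption: they are not part of the flag).
COMPILER_FLAGS = [
    "--iree-opt-data-tiling=false",
    "--iree-llvmcpu-enable-scalable-vectorization",
    "--iree-llvmcpu-target-triple=aarch64-unknown-unknown",
    "--iree-preprocessing-pass-pipeline="
    "builtin.module(util.func(iree-preprocessing-transpose-matmul-pass{input=lhs}))",
]


def write_flagfile(path: Path, flags: list[str]) -> Path:
    """Writes one flag per line (hypothetical layout) and returns the path."""
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text("\n".join(flags) + "\n")
    return path


def read_flagfile(path: Path) -> list[str]:
    """Reads flags back, skipping blank lines."""
    return [line.strip() for line in path.read_text().splitlines() if line.strip()]
```

A test runner could then pass the file contents (or the file itself, if the tool supports that) to the compiler, and labels such as "requires-arm-sme" could be tracked separately from the flags.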

ScottTodd commented 3 months ago

Can also draw on code from https://github.com/nod-ai/rocm-gemm-benchmark/. That has test case generators and checked in .mlir files.

ScottTodd commented 3 months ago

Current plan (cc @erman-gurses):

Land https://github.com/iree-org/iree/pull/17751 for interim coverage of attention on CPU (and possibly ROCm)
Fork C++ and CMake matmul test code to this repo (including all transitive deps, a new CMake project that depends on the runtime, etc.)
Fork C++ and CMake conv and attention test code to this repo
Prototype a new test structure (check in generated files, rework CMake/ctest project or use pytest, etc.)

ScottTodd commented 3 months ago

Finding lots to clean up while refactoring the matmul tests. Debating starting fresh or cleaning up in-place.

Notes/ideas:

[ ] https://github.com/iree-org/iree-test-suites/pull/14 (run the generator offline and check in the test case files)
[ ] https://github.com/iree-org/iree-test-suites/pull/15
[ ] Remove the concept of "sizes" from the generator. Generate one file per test case with only a single function / matmul shape.
[ ] Maintain a list of interesting problem sizes sourced from general sweeps of common sizes and specific sizes used in popular workloads, like in https://github.com/nod-ai/rocm-gemm-benchmark/blob/main/gemmbench/problems.py
[ ] Come up with some directory structure that keeps the number of files / test cases per folder below roughly 100-1000, as needed. Maybe group by LHS/RHS data type, with deeper subfolders for transpose variants?
[ ] Drop the infer_acc_type helper function and make all calls into the generator fully explicit about what data types they want to use for the LHS, RHS, and accumulator
[ ] Decide on continuing to use iree_generated_e2e_runner_test or writing something new (pytest?). Start with manual testing and see what is needed from the build system / test runner. Compiler and runtime flagfiles could be used instead of plumbing strings through CMake functions 🤔
[ ] Merge _matmul.mlir and _calls.mlir into a single file for each test case? One entry function to run arbitrary inputs -> outputs, a second function to run with generated inputs and checking provided by the test module? Check optional symbol resolution...
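
A few of the ideas above (one file per test case, fully explicit LHS/RHS/accumulator types, folders grouped by data type) could be prototyped roughly as follows. This is a minimal sketch, not the existing generator: `MatmulCase`, its fields, the file naming scheme, and the example problem sizes are all hypothetical, and the emitted function is a plain `linalg.matmul` with no generated inputs or result checking.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class MatmulCase:
    """One test case: a single matmul shape with explicit element types."""

    lhs_type: str
    rhs_type: str
    acc_type: str  # always stated by the caller, never inferred
    m: int
    n: int
    k: int

    @property
    def name(self) -> str:
        types = f"{self.lhs_type}_{self.rhs_type}_{self.acc_type}"
        return f"matmul_{self.m}x{self.n}x{self.k}_{types}"

    def relative_path(self) -> Path:
        # Group by LHS/RHS data type to bound the number of files per folder.
        return Path(f"{self.lhs_type}_{self.rhs_type}") / f"{self.name}.mlir"

    def to_mlir(self) -> str:
        lhs = f"tensor<{self.m}x{self.k}x{self.lhs_type}>"
        rhs = f"tensor<{self.k}x{self.n}x{self.rhs_type}>"
        acc = f"tensor<{self.m}x{self.n}x{self.acc_type}>"
        return (
            f"func.func @{self.name}(%lhs: {lhs}, %rhs: {rhs}, %acc: {acc}) -> {acc} {{\n"
            f"  %result = linalg.matmul ins(%lhs, %rhs : {lhs}, {rhs})"
            f" outs(%acc : {acc}) -> {acc}\n"
            f"  return %result : {acc}\n"
            f"}}\n"
        )


# Placeholder sizes for illustration only; a real list would come from
# sweeps and popular workloads as described above.
CASES = [
    MatmulCase("f16", "f16", "f32", m=64, n=128, k=32),
    MatmulCase("i8", "i8", "i32", m=1, n=1, k=1),
]


def write_cases(root: Path, cases: list[MatmulCase]) -> list[Path]:
    """Writes one .mlir file per case under root and returns the paths."""
    written = []
    for case in cases:
        path = root / case.relative_path()
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(case.to_mlir())
        written.append(path)
    return written
```

Transpose variants could add one more folder level under the data type folder, and a runner (pytest or otherwise) could discover test cases by globbing for `*.mlir` under the root.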

iree-org / iree-test-suites

Import matmul/conv/attention tests from iree/tests/e2e/ #2