ROCm / Tensile

Stretching GPU performance for GEMMs and tensor contractions.
MIT License

Generalized profiling scripts #2011

Closed bstefanuk closed 2 months ago

bstefanuk commented 3 months ago

Summary:

Profiling Tensile's applications, e.g. TensileCreateLibrary, is crucial for understanding the runtime statistics needed to make meaningful build-time improvements. However, depending on the parameters provided, these runs can take hours, which makes it desirable to run nightly profiles under various constraints, including OS, branch, number of jobs, and target architecture.

Outcomes:

The scripts added in this PR systematize job scheduling so that any user can set up nightly profiling runs.

Testing and Environment:

Functional testing only via script invocations.

Usage documentation

$ ./scripts/init-cron-nightly.sh --help
Set up cron table for nightly Tensile profiling reports

Usage: ./scripts/init-cron-nightly.sh [--build-id=<id>] [--tensile-path=<path>]

Parameters:
  --build-id: The target docker build ID [default: 14543]
  --tensile-path: Path to root directory of Tensile [default: /home/bstefanu/dev/Tensile]

Example:
  ./scripts/init-cron-nightly.sh --build-id=14354 --tensile-path='path/to/tensile'
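
For readers unfamiliar with cron tables, the following is a minimal sketch of the kind of entry a nightly setup could register. It is not taken from init-cron-nightly.sh: the 02:00 schedule, the log file name nightly-profile.log, and the choice to call profile-tcl.sh directly are all assumptions made for illustration. Only the --build-id flag and the paths come from the help text.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Values mirror the documented default and example; the schedule is an assumption.
build_id=14543
tensile_path='path/to/tensile'
schedule='0 2 * * *'   # assumed: every day at 02:00

# Hypothetical cron entry: change into the Tensile root, run the profiling
# script, and append stdout and stderr to an assumed log file name.
cron_line="${schedule} cd ${tensile_path} && ./scripts/profile-tcl.sh --build-id=${build_id} >> nightly-profile.log 2>&1"

# Print the entry only. Installing it would mean appending it to the output of
# `crontab -l` and piping the result to `crontab -`, which is not done here.
printf '%s\n' "${cron_line}"
```

Printing the line rather than installing it keeps the sketch free of side effects; the real script may build and register its entry differently.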

$ ./scripts/profile-tcl.sh --help
Run grid-based profiling analysis for TensileCreateLibrary under variable inputs

Usage: ./scripts/profile-tcl.sh --build-id=<build-id> [--branch=<branch>] [--arch=<arch>] [--compiler=<compiler>]

Parameters:
  --build-id: The target docker build ID
  --branch: The target branch [default: develop]
  --arch: Target Gfx architecture(s) [default: gfx900]
  --compiler: HIP-enabled compiler (must be in PATH) [default: amdclang++]

Example:
  ./scripts/profile-tcl.sh --build-id=12345 --branch=develop --arch='gfx90a'

Dependencies:
  docker: Docker is implicitly called and may install images
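
"Grid-based" here suggests sweeping over combinations of inputs. The sketch below shows one way such a sweep could be expressed as a dry run that only prints run-tcl.sh command lines (that script is documented later in this description). The job counts and the pairing of architectures are assumed values, and nothing here is taken from profile-tcl.sh itself; only the flags and paths come from the help text.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Assumed grid values, for illustration only.
jobs_grid=(8 16)
arch_grid=(gfx900 gfx90a)

# Build one command line per (arch, jobs) combination.
commands=()
for arch in "${arch_grid[@]}"; do
  for jobs in "${jobs_grid[@]}"; do
    commands+=("./scripts/run-tcl.sh --tensile-path=/mnt/host/Tensile --logic-path=/mnt/host/Logic --jobs=${jobs} --arch=${arch}")
  done
done

# Dry run: print each command instead of executing it.
printf '%s\n' "${commands[@]}"
```

With two architectures and two job counts, the grid yields four command lines. Each run may take hours, so the grid size is the main lever on total nightly runtime.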

$ ./scripts/run-tcl.sh --help
Run TensileCreateLibrary with timestamped log and build directory

Usage: ./scripts/run-tcl.sh --tensile-path=<tensile-path> --logic-path=<logic-path> --jobs=<jobs> [--arch=<arch>] [--compiler=<compiler>]

Parameters:
  --tensile-path: Path to root directory of Tensile
  --logic-path: Path to directory containing logic files
  --jobs: Number of concurrent processes to use
  --arch: Target Gfx architecture(s) [default: gfx900]
  --compiler: HIP-enabled compiler (must be in PATH) [default: amdclang++]

Example:
  ./scripts/run-tcl.sh --tensile-path=/mnt/host/Tensile --logic-path=/mnt/host/Logic --jobs=16
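
The help text says run-tcl.sh uses a timestamped log and build directory. As a minimal sketch of that idea, and not the script's actual implementation, the snippet below derives both names from a single timestamp so that repeated nightly runs do not overwrite each other. The helper name make_timestamped_build_dir, the build- prefix, and the timestamp format are hypothetical.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper, not taken from run-tcl.sh.
# Creates <base_dir>/build-<timestamp> and prints its path.
make_timestamped_build_dir() {
  local base_dir="$1"
  local stamp
  stamp="$(date +%Y-%m-%dT%H-%M-%S)"
  local build_dir="${base_dir}/build-${stamp}"
  mkdir -p "${build_dir}"
  printf '%s\n' "${build_dir}"
}

# Use a temporary directory so the sketch leaves the working tree untouched.
base="$(mktemp -d)"
build_dir="$(make_timestamped_build_dir "${base}")"
log_file="${build_dir}.log"

# Stand-in for the real TensileCreateLibrary invocation.
echo "TensileCreateLibrary would run here" > "${log_file}" 2>&1

echo "build dir: ${build_dir}"
echo "log file:  ${log_file}"
```

Sharing one timestamp between the directory and the log makes it easy to match a log to the build output it describes.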