benchmark fix

Benchmark

New Feature

This PR introduces new benchmark testing parameters, including:

level
- Type: str
- Description: Marks the level of the benchmark.
- Available levels:
  - comprehensive (default): Comprehensive testing.
  - core: Core testing.
warmup
- Type: int
- Description: The number of warm-up iterations.
- Default value: DEFAULT_WARMUP_COUNT = 1000
iter
- Type: int
- Description: The number of benchmark iterations.
- Default value: DEFAULT_ITER_COUNT = 100
query
- Description: Indicates that the benchmark will only query properties without executing the full benchmark logic.
- Default: Not set.
record
- Type: str
- Description: Specifies the format of the output data.
- Available options:
  - none (default)
  - log: Logs output in JSON format.
dtype
- Type: list[str]
- Description: Specifies the data types for benchmark testing. Available dtypes can be listed using pytest --help.
- Available data types:
  - torch.float16, torch.float32, torch.bfloat16, torch.int16, torch.int32, torch.bool, torch.complex64
metric
- Type: list[str]
- Description: Specifies the metrics covered by the benchmark test.
- Available metrics:
  - latency, speedup, tflops, latency_base, accuracy, utilization
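
As a rough illustration of how parameters like these could be exposed through pytest, here is a minimal conftest.py sketch built on pytest's `pytest_addoption` hook and `parser.addoption`. The flag spellings (`--level`, `--warmup`, `--iter`, `--query`, `--record`, `--dtype`, `--metric`) and the help strings are assumptions made for this example; the PR's actual option names and registration code may differ.

```python
# conftest.py (illustrative sketch; flag names are assumptions, not the PR's code)

DEFAULT_WARMUP_COUNT = 1000
DEFAULT_ITER_COUNT = 100

LEVELS = ["comprehensive", "core"]
RECORD_FORMATS = ["none", "log"]
DTYPES = [
    "torch.float16", "torch.float32", "torch.bfloat16", "torch.int16",
    "torch.int32", "torch.bool", "torch.complex64",
]
METRICS = [
    "latency", "speedup", "tflops", "latency_base", "accuracy", "utilization",
]


def pytest_addoption(parser):
    """Register the benchmark parameters as pytest command-line options."""
    parser.addoption("--level", default="comprehensive", choices=LEVELS,
                     help="Benchmark level.")
    parser.addoption("--warmup", type=int, default=DEFAULT_WARMUP_COUNT,
                     help="Number of warm-up iterations.")
    parser.addoption("--iter", type=int, default=DEFAULT_ITER_COUNT,
                     help="Number of benchmark iterations.")
    # Not set by default: only query properties, skip the full benchmark.
    parser.addoption("--query", action="store_true", default=False,
                     help="Only query benchmark properties.")
    parser.addoption("--record", default="none", choices=RECORD_FORMATS,
                     help="Output format; 'log' writes JSON records.")
    # Repeatable options collect into lists, matching the list[str] type.
    parser.addoption("--dtype", action="append", default=None, choices=DTYPES,
                     help="Data type to benchmark (repeatable).")
    parser.addoption("--metric", action="append", default=None, choices=METRICS,
                     help="Metric to collect (repeatable).")
```

Under these assumed names, an invocation might look like `pytest --level core --warmup 100 --iter 50 --dtype torch.float16 --metric latency`, and the registered options would then appear in the output of `pytest --help`.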

The PR also makes several structural design adjustments:

Added the BenchmarkMetrics abstraction
- Represents the benchmark information to be recorded for specific operations at specific sizes and data types.
Added the BenchmarkResult abstraction
- Represents all test results for a specific operation on specific hardware and at a specified benchmark level.
Adjusted the design of the Benchmark structure
- Replaced the function-level benchmark written per operator with a class-level benchmark per operator category, so default benchmark parameters can be configured in one place and then inherited or overridden.
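
To make the relationship between these three pieces concrete, here is a minimal sketch using dataclasses. Only the names BenchmarkMetrics, BenchmarkResult, and Benchmark come from this PR; every field, the `ReductionBenchmark` subclass, and the `DEFAULT_*` class attributes are hypothetical and chosen for illustration only.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class BenchmarkMetrics:
    """Measurements for one operation at one input size and dtype (fields assumed)."""
    shape: Tuple[int, ...]
    dtype: str
    latency: Optional[float] = None       # operator under test
    latency_base: Optional[float] = None  # reference implementation
    speedup: Optional[float] = None       # assumed here: latency_base / latency


@dataclass
class BenchmarkResult:
    """All results for one operation on one device at one benchmark level."""
    op_name: str
    device: str
    level: str
    metrics: List[BenchmarkMetrics] = field(default_factory=list)


class Benchmark:
    """Class-level benchmark shared by a category of operators."""
    DEFAULT_METRICS = ["latency", "speedup"]
    DEFAULT_DTYPES = ["torch.float16", "torch.float32", "torch.bfloat16"]

    def __init__(self, op_name, metrics=None, dtypes=None):
        self.op_name = op_name
        # Per-instance arguments win; otherwise fall back to class defaults.
        self.metrics = metrics or list(self.DEFAULT_METRICS)
        self.dtypes = dtypes or list(self.DEFAULT_DTYPES)


class ReductionBenchmark(Benchmark):
    """Hypothetical category: overrides one default, inherits the rest."""
    DEFAULT_METRICS = ["latency", "speedup", "tflops"]


bench = ReductionBenchmark("sum")
result = BenchmarkResult(op_name=bench.op_name, device="cuda", level="core")
result.metrics.append(
    BenchmarkMetrics(shape=(1024, 1024), dtype="torch.float16",
                     latency=0.5, latency_base=1.0, speedup=2.0)
)
```

The point of the class-level design shows in `ReductionBenchmark`: a whole operator category changes a default by overriding one class attribute, while individual benchmarks can still pass their own `metrics` or `dtypes`.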

Previously, test inputs were produced by combining a specific batch with an optional size list and an optional dtype list. This combinatorial scheme could not express every input scenario. It has been replaced by a more abstract input_generator: a default generator is provided, and benchmarks with special input requirements can override it directly.
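
The override pattern described above can be sketched as follows. This is a simplified stand-in rather than the PR's code: apart from the input_generator name, the class attributes, the `MatmulBenchmark` subclass, and the dictionary layout are assumptions, and inputs are described by shape and dtype strings instead of real tensors so the example runs without torch.

```python
from itertools import product


class Benchmark:
    """Simplified stand-in for a class-level benchmark (names assumed)."""
    DEFAULT_SHAPES = [(1024,), (1024, 1024)]
    DEFAULT_DTYPES = ["torch.float16", "torch.float32"]

    def input_generator(self):
        """Default generator: one input per (shape, dtype) combination."""
        for shape, dtype in product(self.DEFAULT_SHAPES, self.DEFAULT_DTYPES):
            yield {"shape": shape, "dtype": dtype}


class MatmulBenchmark(Benchmark):
    """Hypothetical special case: two operands with compatible shapes."""
    MNK_SIZES = [(64, 32, 128), (256, 256, 256)]

    def input_generator(self):
        # A plain shape x dtype product cannot express the constraint that
        # the inner dimensions must match, so the generator is overridden.
        for (m, n, k), dtype in product(self.MNK_SIZES, self.DEFAULT_DTYPES):
            yield {"lhs_shape": (m, k), "rhs_shape": (k, n), "dtype": dtype}


default_inputs = list(Benchmark().input_generator())
matmul_inputs = list(MatmulBenchmark().input_generator())
```

Because callers only iterate over `input_generator()`, the rest of the benchmark does not need to know whether the inputs come from the default combinatorial scheme or from a custom generator.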

FlagOpen / FlagGems