PaddlePaddle / Paddle

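PArallel Distributed Deep LEarning: Machine Learning Framework from Industrial Practice (PaddlePaddle 『飞桨』 core framework: high-performance single-machine and distributed training and cross-platform deployment for deep learning & machine learning)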
http://www.paddlepaddle.org/
Apache License 2.0
22.3k stars 5.62k forks

Question about timing a single CINN operator #69304

Open wangzy0327 opened 2 weeks ago

wangzy0327 commented 2 weeks ago

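Describe the Bug

Paddle version: release/v2.6.0
GPU: Nvidia V100

After building Paddle, I ran Paddle's operator unit tests and found that the measured time for the CINN operator is very long. It appears to include other costs rather than only the CINN operator's execution time. How can this test case measure only the CINN operator's execution time?

Single-operator test case: https://github.com/PaddlePaddle/Paddle/blob/release/2.6/test/cinn/ops/test_abs_op.py

Starting from the code at that link, I added the following timing code: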

import time

    def build_paddle_program(self, target):
        x = paddle.to_tensor(self.x_np, stop_gradient=True)
        # Record the start time
        start_time = time.time()
        out = paddle.abs(x)
        end_time = time.time()
        # Compute the elapsed time
        execution_time = end_time - start_time
        print(f"Paddle Execution time: {execution_time:.6f} seconds")

        self.paddle_outputs = [out]

    def build_cinn_program(self, target):
        builder = NetBuilder("identity")
        x = builder.create_input(
            self.nptype2cinntype(self.case["x_dtype"]),
            self.case["x_shape"],
            "x",
        )
        out = builder.abs(x)

        prog = builder.build()

        # Record the start time
        start_time = time.time()

        res = self.get_cinn_output(prog, target, [x], [self.x_np], [out])

        end_time = time.time()
        # Compute the elapsed time
        execution_time = end_time - start_time

        print(f"CINN Execution time: {execution_time:.6f} seconds")

        self.cinn_outputs = [res[0]]

Test results

Current Paddle device : gpu:0

Running TestAbsOpShape.TestAbsOpShape0: {'x_shape': [1], 'x_dtype': 'float32'}
Paddle Execution time: 0.164408 seconds
CINN Execution time: 1.507690 seconds
Current Paddle device : gpu:0

Running TestAbsOpShape.TestAbsOpShape1: {'x_shape': [1024], 'x_dtype': 'float32'}
Paddle Execution time: 0.000076 seconds
CINN Execution time: 0.756306 seconds
Current Paddle device : gpu:0

Running TestAbsOpShape.TestAbsOpShape2: {'x_shape': [1, 2048], 'x_dtype': 'float32'}
Paddle Execution time: 0.000171 seconds
CINN Execution time: 0.722324 seconds
Current Paddle device : gpu:0

Running TestAbsOpShape.TestAbsOpShape3: {'x_shape': [1, 1, 1], 'x_dtype': 'float32'}
Paddle Execution time: 0.000140 seconds
CINN Execution time: 0.760242 seconds
Current Paddle device : gpu:0

Running TestAbsOpShape.TestAbsOpShape4: {'x_shape': [32, 64], 'x_dtype': 'float32'}
Paddle Execution time: 0.000108 seconds
CINN Execution time: 0.765401 seconds
Current Paddle device : gpu:0

Running TestAbsOpShape.TestAbsOpShape5: {'x_shape': [16, 8, 4, 2], 'x_dtype': 'float32'}
Paddle Execution time: 0.000079 seconds
CINN Execution time: 0.788273 seconds
Current Paddle device : gpu:0

Running TestAbsOpShape.TestAbsOpShape6: {'x_shape': [16, 8, 4, 2, 1], 'x_dtype': 'float32'}
Paddle Execution time: 0.000115 seconds
CINN Execution time: 0.731257 seconds
Current Paddle device : gpu:0

Running TestAbsOpDtype.TestAbsOpDtype0: {'x_shape': [32, 64], 'x_dtype': 'int32'}
Paddle Execution time: 0.000089 seconds
CINN Execution time: 0.721283 seconds
Current Paddle device : gpu:0

Running TestAbsOpDtype.TestAbsOpDtype1: {'x_shape': [32, 64], 'x_dtype': 'int64'}
Paddle Execution time: 0.000092 seconds
CINN Execution time: 0.764283 seconds
Current Paddle device : gpu:0

Running TestAbsOpDtype.TestAbsOpDtype2: {'x_shape': [32, 64], 'x_dtype': 'float16', 'max_relative_error': 0.001}
Paddle Execution time: 0.000149 seconds
CINN Execution time: 0.727614 seconds
Current Paddle device : gpu:0

Running TestAbsOpDtype.TestAbsOpDtype3: {'x_shape': [32, 64], 'x_dtype': 'float32'}
Paddle Execution time: 0.000118 seconds
CINN Execution time: 0.769592 seconds
Current Paddle device : gpu:0

Running TestAbsOpDtype.TestAbsOpDtype4: {'x_shape': [32, 64], 'x_dtype': 'float64'}
Paddle Execution time: 0.000132 seconds
CINN Execution time: 0.726039 seconds

Finished running test_abs_op.py.

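The time measured for Paddle's library call differs greatly from the time measured for CINN. How should the CINN operator's time be measured within this unit test? Thanks.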
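The figures in the log above can be summarised with a few lines of Python. This is only a reading aid: the numbers are copied from the log, and the `summarize` helper is a hypothetical name introduced here. It separates the first test case from the later ones, since the first case may also pay one-time start-up costs.

```python
import statistics

# Timings in seconds, copied in order from the log above.
paddle_s = [0.164408, 0.000076, 0.000171, 0.000140, 0.000108, 0.000079,
            0.000115, 0.000089, 0.000092, 0.000149, 0.000118, 0.000132]
cinn_s = [1.507690, 0.756306, 0.722324, 0.760242, 0.765401, 0.788273,
          0.731257, 0.721283, 0.764283, 0.727614, 0.769592, 0.726039]


def summarize(samples):
    """Split off the first sample and summarize the remaining ones."""
    first, later = samples[0], samples[1:]
    return {
        "first": first,
        "later_mean": statistics.mean(later),
        "later_min": min(later),
        "later_max": max(later),
    }


paddle_summary = summarize(paddle_s)
cinn_summary = summarize(cinn_s)

for name, summary in (("Paddle", paddle_summary), ("CINN", cinn_summary)):
    print(
        f"{name}: first={summary['first']:.6f}s "
        f"later mean={summary['later_mean']:.6f}s "
        f"range=[{summary['later_min']:.6f}, {summary['later_max']:.6f}]s"
    )
```

After the first case, the CINN timings stay within a narrow band regardless of shape or dtype. One plausible interpretation (an assumption, not something the log proves) is that a fixed per-case cost dominates the measurement rather than the kernel itself. The sub-millisecond Paddle timings may likewise reflect only an asynchronous kernel launch, because no device synchronization happens before `end_time` is taken.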

Additional Supplementary Information

No response

wangzy0327 commented 2 weeks ago

@wanghuancoder Could you take a look at this?

warrentdrew commented 2 weeks ago

Hello, you can measure the time of the `execute` call inside the `build_and_get_output` method: https://github.com/PaddlePaddle/Paddle/blob/0a4a6b0b8ef0b0ffd76806605a90e41a622c28cf/paddle/cinn/pybind/frontend.cc#L243
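The general idea behind this suggestion, timing only the execution step and not the build step around it, can be sketched in plain Python. This is a minimal sketch under stated assumptions, not Paddle's or CINN's API: `time_steady_state` and `FakeCompiledOp` are hypothetical names, and the fake op uses `time.sleep` to stand in for an expensive one-time build and a cheap execute. The optional `sync` argument is where a device synchronization callable would go on a GPU (for example `paddle.device.cuda.synchronize`, assuming it is available in your build).

```python
import statistics
import time
from typing import Callable, Optional


def time_steady_state(
    run: Callable[[], object],
    warmup: int = 3,
    repeat: int = 20,
    sync: Optional[Callable[[], None]] = None,
) -> dict:
    """Time `run` after warm-up calls; all results are in seconds."""

    def timed_call() -> float:
        if sync is not None:
            sync()  # drain pending device work before starting the clock
        start = time.perf_counter()
        run()
        if sync is not None:
            sync()  # wait for asynchronous work launched by `run`
        return time.perf_counter() - start

    for _ in range(warmup):
        timed_call()  # warm-up results are discarded
    samples = [timed_call() for _ in range(repeat)]
    return {
        "median_s": statistics.median(samples),
        "min_s": min(samples),
        "samples": samples,
    }


class FakeCompiledOp:
    """Hypothetical stand-in: expensive one-time build, cheap execute."""

    def build(self) -> None:
        time.sleep(0.05)  # pretend compilation cost

    def execute(self) -> None:
        time.sleep(0.001)  # pretend kernel cost


op = FakeCompiledOp()

# Timing build and execute together, as in the test case above.
t0 = time.perf_counter()
op.build()
op.execute()
combined_s = time.perf_counter() - t0

# Timing only the repeated execute step.
stats = time_steady_state(op.execute, warmup=2, repeat=10)
print(f"build + execute: {combined_s:.6f} s")
print(f"execute only (median): {stats['median_s']:.6f} s")
```

A helper like this only applies where the build and execute steps can be called separately from Python. In the test case from this issue both happen inside a single `build_and_get_output` call, so isolating `execute` would likely require adding timing in the C++ code at the linked location.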