ReinForce-II closed 3 years ago
benchmark: super_resolution_10
5900X:
before:
Constant-10 default 10 0.000(us)
Conv-10 default 20 1496819.700(us)
Relu-10 default 15 964.000(us)
Reshape-10 default 10 124.800(us)
Transpose-10 default 5 6619.800(us)
after:
Constant-10 default 10 0.200(us)
Conv-10 default 20 692877.650(us)
Relu-10 default 15 795.733(us)
Reshape-10 default 10 129.200(us)
Transpose-10 default 5 6785.200(us)

M1:
before:
Constant-10 default 10 0.000(us)
Conv-10 default 20 650797.950(us)
Relu-10 default 15 411.600(us)
Reshape-10 default 10 37.600(us)
Transpose-10 default 5 3822.400(us)
after:
Constant-10 default 10 0.000(us)
Conv-10 default 20 640919.200(us)
Relu-10 default 15 368.800(us)
Reshape-10 default 10 37.200(us)
Transpose-10 default 5 3711.400(us)

RK3399, A72:
before:
Constant-10 default 2 0.500(us)
Conv-10 default 4 4831419.250(us)
Relu-10 default 3 2834.000(us)
Reshape-10 default 2 504.000(us)
Transpose-10 default 1 18659.000(us)
after:
Constant-10 default 2 1.000(us)
Conv-10 default 4 2608700.000(us)
Relu-10 default 3 2887.333(us)
Reshape-10 default 2 525.000(us)
Transpose-10 default 1 18725.000(us)

RK3399, A53:
before:
Constant-10 default 2 1.000(us)
Conv-10 default 4 15421956.500(us)
Relu-10 default 3 6136.667(us)
Reshape-10 default 2 971.000(us)
Transpose-10 default 1 47123.000(us)
after:
Constant-10 default 2 0.500(us)
Conv-10 default 4 7815257.500(us)
Relu-10 default 3 6194.667(us)
Reshape-10 default 2 964.500(us)
Transpose-10 default 1 47420.000(us)
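To make the effect on Conv-10 easier to compare across platforms, the timings above can be reduced to a before/after ratio. The snippet below is only an illustrative sketch: the dictionary layout and the `speedup` helper are made up for this example and are not part of the project, and the numbers are copied from the profiler output as posted.

```python
# Conv-10 timings in microseconds, copied from the profiler output above.
# Each entry is (before, after). The dict layout is an assumption for
# illustration only.
conv_us = {
    "5900X": (1496819.700, 692877.650),
    "M1": (650797.950, 640919.200),
    "RK3399 A72": (4831419.250, 2608700.000),
    "RK3399 A53": (15421956.500, 7815257.500),
}


def speedup(before_us, after_us):
    """Return how many times faster 'after' is than 'before'."""
    return before_us / after_us


# Hypothetical helper result: platform name -> speedup factor.
speedups = {name: speedup(b, a) for name, (b, a) in conv_us.items()}

for name, factor in speedups.items():
    print(f"{name}: {factor:.2f}x")
```

By this measure Conv-10 is roughly 2.16x faster on the 5900X, about 1.85x on the A72 and 1.97x on the A53, and essentially unchanged (about 1.02x) on the M1. The other operators stay within a few percent of their earlier timings, apart from Relu-10, which drops by roughly 10 to 20 percent on the 5900X and M1.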
share data across kernels.