XiaobingSuper closed 6 years ago
@zxb1489479870 First, make sure the following rules make sense to you:
Then measure the speedup ratio of the parallel version over the serial version while varying the tensor size. The threshold is chosen according to where that speedup appears. Our optimization has been merged into the master branch, so it is best to check the code in my PR.
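To make the procedure above concrete, here is a minimal sketch of how a threshold could be read off from timing measurements. It is not the code from the PR: the helper names `speedup` and `pick_threshold`, the `min_speedup` parameter, and the rule "smallest size from which every larger measured size is faster in parallel" are all assumptions made for illustration, and the numbers in the usage part are placeholders, not measured results.

```python
from typing import Iterable, Optional, Tuple

# (tensor size in elements, serial time in seconds, parallel time in seconds)
Measurement = Tuple[int, float, float]


def speedup(serial_s: float, parallel_s: float) -> float:
    """Speedup of the parallel run over the serial run (> 1 means parallel wins)."""
    return serial_s / parallel_s


def pick_threshold(
    measurements: Iterable[Measurement], min_speedup: float = 1.0
) -> Optional[int]:
    """Hypothetical helper: return the smallest measured size from which
    every larger measured size reaches at least `min_speedup`.

    Returns None if the largest measured size does not reach `min_speedup`.
    All times are assumed to be positive.
    """
    threshold: Optional[int] = None
    for size, serial_s, parallel_s in sorted(measurements):
        if speedup(serial_s, parallel_s) >= min_speedup:
            # Start (or continue) a run of sizes where parallel pays off.
            if threshold is None:
                threshold = size
        else:
            # Parallel lost at this size, so any earlier candidate is discarded.
            threshold = None
    return threshold


if __name__ == "__main__":
    # Placeholder timings for illustration only (not real benchmark data).
    placeholder = [
        (1_000, 1.0, 2.0),
        (10_000, 1.0, 1.25),
        (100_000, 4.0, 2.0),
        (1_000_000, 40.0, 5.0),
    ]
    print(pick_threshold(placeholder))
```

In practice the timings would come from running the same operation once on the serial path and once on the parallel path for each tensor size, repeating each run several times to reduce noise.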
I want to test the performance with different OMP thresholds. I changed the OMP_OVERHEAD_THRESHOL values in [THTensorApply.hpp](https://github.com/pytorch/pytorch/blob/master/aten/src/TH/generic/THTensorApply.hpp) to force the serial or the parallel path and ran a test on a Platinum 8180, but I found only a small difference between serial and parallel. Can you tell me how you set the OMP threshold to get your results? Thanks!