Closed by WrathfulSpatula 3 years ago
(Sorry for the careless branching.)
Long ago, I noticed that QEngineOCL::ParSum() was slower when parallelized if there weren't enough summand terms. This is no longer the case, since ParallelFor() now sets a better threshold for parallelism internally and executes in serial if the summand count is below that threshold. I also check for the same threshold within these two methods before attempting to dispatch in parallel.
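For illustration only, here is a minimal sketch of the threshold check described above. It is not the actual ParallelFor() or ParSum() code: `ThresholdedSum`, `kParallelThreshold`, and the threshold value are hypothetical names and numbers chosen for the example, and the dispatch uses plain `std::async` as a stand-in for the real work scheduler.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <future>
#include <thread>
#include <vector>

// Hypothetical cutoff (assumption): below this many summands, threads cost more than they save.
constexpr std::size_t kParallelThreshold = 4096U;

// Sum term(0) .. term(count - 1), going parallel only when there are enough summands.
double ThresholdedSum(std::size_t count, const std::function<double(std::size_t)>& term,
    std::size_t threshold = kParallelThreshold)
{
    const unsigned hw = std::thread::hardware_concurrency();
    const std::size_t workers = hw ? hw : 1U;

    // Serial fallback: too few summands, or no extra hardware threads available.
    if ((count < threshold) || (workers < 2U)) {
        double total = 0.0;
        for (std::size_t i = 0U; i < count; ++i) {
            total += term(i);
        }
        return total;
    }

    // Parallel path: one contiguous chunk of indices per worker.
    const std::size_t chunk = (count + workers - 1U) / workers;
    std::vector<std::future<double>> partials;
    for (std::size_t begin = 0U; begin < count; begin += chunk) {
        const std::size_t end = std::min(begin + chunk, count);
        partials.push_back(std::async(std::launch::async, [begin, end, &term]() {
            double partial = 0.0;
            for (std::size_t i = begin; i < end; ++i) {
                partial += term(i);
            }
            return partial;
        }));
    }

    // Combine the per-worker partial sums on the calling thread.
    double total = 0.0;
    for (auto& partial : partials) {
        total += partial.get();
    }
    return total;
}
```

A caller only passes the summand count and a function that returns each term; whether the sum runs in serial or in parallel is decided inside the helper, which is the point of putting the threshold check there.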
I reconsidered: the parallelization is less important than the memory usage and redundancy.
There doesn't seem to be any obvious downside to parallelizing sum-over-amplitude methods, when they contain enough summands to benefit from it.