Open adarshyoga opened 1 year ago
The slowdown may be related to kernel launch overhead in the JitKernel
custom dispatcher class. The overhead is especially noticeable with small problem sizes. The experimental.dispatcher.KernelDispatcher
fixes the launch overhead.
Can you please reevaluate with the new dispatcher?
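To illustrate why a fixed launch cost matters most at small problem sizes, here is a rough sketch under a simple linear cost model (time per launch = fixed overhead + elements x per-element cost). The function names and the numbers below are assumptions made up for illustration; they are not measured from JitKernel or KernelDispatcher and do not use any numba-dpex API.

```python
import time


def overhead_fraction(n_elements, launch_overhead_s, per_element_s):
    """Share of one launch's time spent in fixed overhead (linear cost model)."""
    compute_s = n_elements * per_element_s
    return launch_overhead_s / (launch_overhead_s + compute_s)


def time_per_call(fn, *args, repeats=100):
    """Average wall-clock seconds per call of fn(*args) over `repeats` calls."""
    start = time.perf_counter()
    for _ in range(repeats):
        fn(*args)
    return (time.perf_counter() - start) / repeats


# Hypothetical values, chosen only to show the trend.
LAUNCH_OVERHEAD_S = 100e-6  # assumed fixed cost per kernel launch
PER_ELEMENT_S = 1e-9        # assumed compute cost per element

if __name__ == "__main__":
    for n in (2**10, 2**16, 2**20, 2**26):
        frac = overhead_fraction(n, LAUNCH_OVERHEAD_S, PER_ELEMENT_S)
        print(f"n={n:>10d}  overhead share={frac:6.1%}")
```

When reevaluating, `time_per_call` could be pointed at the same kernel under each dispatcher and at several problem sizes; if launch overhead is the cause, the gap should shrink as the problem size grows.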
The blackscholes numpy implementation in dpbench is ~26X slower than the corresponding kernel and prange implementations.
How to reproduce: 1) Follow the instructions to set up dpbench. 2) Run blackscholes:
python -c "import dpbench; dpbench.run_benchmark(\"black_scholes\")"