IntelPython / numba-dpex

Data Parallel Extension for Numba
https://intelpython.github.io/numba-dpex/
Apache License 2.0

Lower than expected performance in blackscholes numpy implementation #974

Open adarshyoga opened 1 year ago

adarshyoga commented 1 year ago

The blackscholes numpy implementation in dpbench is ~26X slower than the corresponding kernel and prange implementations.

How to reproduce:
1) Follow the instructions to set up dpbench.
2) Run the blackscholes benchmark: python -c "import dpbench; dpbench.run_benchmark(\"black_scholes\")"

diptorupd commented 9 months ago

The slowdown may be related to kernel launch overhead in the JitKernel custom dispatcher class. That overhead is especially noticeable with small problem sizes. The experimental.dispatcher.KernelDispatcher fixes the launch overhead.

Can you please reevaluate with the new dispatcher?
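
One way to check whether a fixed per-call cost explains the gap is to time the same function at several problem sizes and fit a line, time = overhead + per_element * size. The sketch below is only an illustration under stated assumptions: `time_call` and `fit_overhead` are hypothetical helper names, not part of numba-dpex or dpbench, and `np.exp` is a stand-in for the real benchmark function. It uses plain NumPy so that it runs anywhere.

```python
import time

import numpy as np


def time_call(fn, n, repeats=5):
    """Best wall-clock time in seconds of fn(x) on an array of n elements."""
    x = np.random.default_rng(0).random(n)
    fn(x)  # warm-up call, so one-time costs such as JIT compilation are excluded
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn(x)
        best = min(best, time.perf_counter() - start)
    return best


def fit_overhead(sizes, times):
    """Least-squares fit of times ~ overhead + per_element * size."""
    sizes = np.asarray(sizes, dtype=float)
    times = np.asarray(times, dtype=float)
    # np.polyfit returns coefficients from the highest degree down: [slope, intercept]
    per_element, overhead = np.polyfit(sizes, times, 1)
    return overhead, per_element


if __name__ == "__main__":
    # np.exp is a placeholder; swap in the function whose launch cost is in question.
    sizes = [1_000, 10_000, 100_000, 1_000_000]
    times = [time_call(np.exp, n) for n in sizes]
    overhead, per_element = fit_overhead(sizes, times)
    print(f"estimated fixed overhead per call: {overhead:.3e} s")
    print(f"estimated cost per element:        {per_element:.3e} s")
```

If the fitted overhead is large compared with per_element * size at the benchmark's problem size, launch overhead is a plausible cause. Repeating the measurement with the new dispatcher should then show a smaller intercept.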