IntelPython / sdc

Numba extension for compiling Pandas data frames, Intel® Scalable Dataframe Compiler
https://intelpython.github.io/sdc-doc/
BSD 2-Clause "Simplified" License

Optimize getitem operations by checking for same indexes #800

Closed kozlov-alexey closed 4 years ago

kozlov-alexey commented 4 years ago

Some performance results (from laptop):

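With fix:

| name | nthreads | type | size | median | min | max | compile | boxing |
|---|---|---|---|---|---|---|---|---|
| DataFrame.getitem_filter_by_value | 1 | Python | 10000000 | 0.328003 | 0.318 | 0.357003 | NaN | NaN |
| DataFrame.getitem_filter_by_value | 1 | SDC | 10000000 | 0.34 | 0.287 | 0.404 | 0.715081 | 1.126521 |
| DataFrame.getitem_filter_by_value | 2 | SDC | 10000000 | 0.205 | 0.194 | 0.243 | 0.666005 | 0.958931 |
| DataFrame.getitem_filter_by_value | 4 | SDC | 10000000 | 0.154 | 0.128 | 0.176 | 0.616491 | 0.935159 |

Without fix (on master):

| name | nthreads | type | size | median | min | max | compile | boxing |
|---|---|---|---|---|---|---|---|---|
| DataFrame.getitem_filter_by_value | 1 | Python | 10000000 | 0.314004 | 0.311996 | 0.318 | NaN | NaN |
| DataFrame.getitem_filter_by_value | 1 | SDC | 10000000 | 3.748 | 3.618 | 4.133 | 0.731427 | 0.853143 |
| DataFrame.getitem_filter_by_value | 2 | SDC | 10000000 | 3.158 | 3.113 | 3.632 | 0.78739 | 0.859766 |
| DataFrame.getitem_filter_by_value | 4 | SDC | 10000000 | 3.06 | 3.007 | 3.454 | 0.813918 | 0.958313 |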
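
For some reason, on nnlmlp01 (but not on ansatclx1004 or my laptop) SDC is 2 times slower than Python on a single thread in the same test. This needs further investigation.

With fix on nnlmlp01:

| name | nthreads | type | size | median | min | max | compile | boxing |
|---|---|---|---|---|---|---|---|---|
| DataFrame.getitem_filter_by_value | 1 | Python | 10000000 | 0.268062 | 0.267576 | 0.270783 | NaN | NaN |
| DataFrame.getitem_filter_by_value | 1 | SDC | 10000000 | 0.361877 | 0.361485 | 0.362171 | 0.612621 | 1.002072 |
| DataFrame.getitem_filter_by_value | 2 | SDC | 10000000 | 0.213841 | 0.212518 | 0.215923 | 0.672137 | 1.019482 |
| DataFrame.getitem_filter_by_value | 4 | SDC | 10000000 | 0.120625 | 0.11833 | 0.125019 | 0.622494 | 1.026524 |
| DataFrame.getitem_filter_by_value | 8 | SDC | 10000000 | 0.075155 | 0.074595 | 0.075756 | 0.624533 | 1.027116 |
| DataFrame.getitem_filter_by_value | 16 | SDC | 10000000 | 0.059031 | 0.058086 | 0.075269 | 0.612236 | 1.051869 |
| DataFrame.getitem_filter_by_value | 28 | SDC | 10000000 | 0.054463 | 0.051837 | 0.056121 | 0.608134 | 1.074058 |
| DataFrame.getitem_filter_by_value | 56 | SDC | 10000000 | 0.060003 | 0.056724 | 0.062026 | 0.644352 | 1.094665 |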
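
For readers unfamiliar with the idea in the PR title, here is a minimal sketch in plain pandas of what "checking for same indexes" can mean for a boolean-mask getitem such as `df[df.A > value]`. It is not the SDC implementation: the function name `filter_rows` and the sample data are hypothetical, and the slow path here (reindexing with `False` as the fill value) is a simplification chosen for illustration. The point is only that when the mask and the frame share an index, the mask can be applied positionally and the alignment step can be skipped.

```python
import numpy as np
import pandas as pd


def filter_rows(df, mask):
    """Boolean-mask row filter with a fast path for identical indexes.

    Hypothetical helper for illustration only; not part of SDC or pandas.
    """
    if mask.index is df.index or mask.index.equals(df.index):
        # Fast path: same index, so the mask is already in row order
        # and can be applied positionally without any alignment.
        return df.iloc[mask.to_numpy()]
    # Slow path: align the mask to the frame's index first.
    # Labels missing from the mask are treated as False (an assumption
    # made for this sketch).
    aligned = mask.reindex(df.index, fill_value=False)
    return df.iloc[aligned.to_numpy()]


# Small made-up frame, in the spirit of a filter-by-value benchmark.
df = pd.DataFrame({"A": np.arange(10), "B": np.arange(10) * 2.0})
mask = df.A > 5

fast = filter_rows(df, mask)             # mask shares df's index: fast path
slow = filter_rows(df, mask.iloc[::-1])  # reordered mask index: slow path
```

Both calls select the same rows; the difference is only in how much work is done before the mask is applied, which is the kind of cost the "without fix" timings above suggest was dominating.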