scikit-learn / scikit-learn

scikit-learn: machine learning in Python
https://scikit-learn.org
BSD 3-Clause "New" or "Revised" License
60.16k stars 25.41k forks source link

Split binary_tree.pxi into binary_tree.pyx + binary_tree.pxd and import rather than include #6409

Open raghavrv opened 8 years ago

raghavrv commented 8 years ago

There is a pxi file at neighbors/binary_tree.pxi

I may be wrong here, but isn't importing a pyx rather than including the pxi a better/cleaner way? Or is it just the same? (If so is there a specific reason why that won't work here?)

Also the way the current cythonize.py is setup, it will track for changes in pyx/pxd files alone and recythonize them if changed.

If the binary_tree.pxi is to be untouched, we need a minor hack to make it compile the kd_tree.pyx and ball_tree.pyx whenever the binary_tree.pxi is modified. (Also see #6254)

If the binary_tree.pxi can be split into binary_tree.pyx + binary_tree.pxd, I'd like to take it up as I am trying to learn cython. (For fiddling with the sklearn/tree module)

cc: @jakevdp @amueller @ogrisel @MechCoder

jakevdp commented 8 years ago

When I wrote this code, I initially started with a pyx/pxd solution, but I found that the standard inheritence scheme for some reason led to massive slowdowns in queries, because of the details of cython's vtable lookups IIRC. That was several years ago, so Cython may have improved things since then.