pydata / numexpr

Fast numerical array expression evaluator for Python, NumPy, Pandas, PyTables and more
https://numexpr.readthedocs.io/en/latest/user_guide.html
MIT License
2.23k stars 210 forks source link

Build system seems not to detect the change of interp_body.cpp #427

Closed 27rabbitlt closed 10 months ago

27rabbitlt commented 1 year ago

HI guys~ I'm here again.

I'm currently figure out the inner magic of Numexpr with output debugging method (I added many fprintf(stdout, xxxx)).

I found when I changed interp_body.cpp, which is included by interpreter.cpp, the build system didn't detect this change and failed to re-compile interpreter.cpp.

Did I miss anything or the build system intended to do so? Any suggestions or instructions are appreciated!

Thanks to all you guys in advance.

robbmcleod commented 1 year ago

This might be an unintended consequence of the shift to using the pyproject.toml modernized setup. You could try adding it into setup.py in the list of source files:

sources = ['numexpr/interpreter.cpp',
           'numexpr/module.cpp',
           'numexpr/numexpr_object.cpp']

If that works for you I can look at patching it or you could send me a pull request.

A trivial work-around is to delete the build directory each time. Since it takes 6 s to recompile on my machine, you aren't losing much time. E.g.

rm -Rf build & pip install -e .

I'm not sure what your motivation is for trying to understand the interpreter, but please realize the person who wrote the original code is no longer on the internet. I should point out the numexpr3 branch I worked on in 2017 might also be of interest. I don't have the time to continue that project, but it's a complete re-write on the Python side. There's still lots of regressions compared to 2.8, however, especially data reductions like .sum() or .mean().

27rabbitlt commented 1 year ago

So much thanks! I will try it as soon as possible. The working environment is in the company so currently I don't have the access to the server.

Yes, you are right, I can remove build directory each time, and another work-around I found useful is to add a harmless blank space in interpreter.cpp each time 😄 It won't take long, and I can have a small rest during the compilation.

I have a strong interest in figuring out how these scientific computation libraries implemented. But Numpy is toooooo complicated and XTensor doesn't support multi-threading well (and the code is a bit difficult for me to understand). I found that Numexpr is powerful with multi-threads and JIT, using simple code (relatively speaking). I wish I can understand the whole project and try to contribute to this community.

I didn't know there was a branch numexpr3, I will check it out.

Thanks a lot again!

27rabbitlt commented 1 year ago

@robbmcleod Hi~ I came with my testing. It doesn't work if we just simply add "interp_body.cpp" into source because in that case setuptools will try to compile interp_body.cpp. But interp_body.cpp is actually not a complete C++ source file, instead, it is included by interpreter.cpp. So there would be compilation error if we try to compile itnerp_body.cpp itself.

It seems that setuptools also can't detect the include dependencies, because if we make changes in *.hpp, it will not re-compile either, however, it should do.

One way to fix this problem is to add all these files including *.cpp and *.hpp into source list in setup.py, and add special MACRO to let them skip compilation.

Take interp_body.cpp as an example, we can add these lines:

#ifdef INCLUDE_FLAG

/* the original code in interp_body.cpp */

#endif

And we add this MACRO INCLUDE_FLAG at the beginning of interpreter.cpp:

#define INCLUDE_FLAG

/* the original code in interpreter.cpp */

In this way, we could ensure that if we make changes to interp_body.cpp, the whole project will be re-compiled; and we won't compile inter_body.cpp itself because the code is in the #ifdef statement and will be skipped when just compiling itself.

This method works in my environment, but it's a bit ugly. What do you think of it?

robbmcleod commented 1 year ago

FWIW, the original setup.py was custom and I believe it ran a clean before each build. Since it's gone now we're seeing this behavior now. It seems to be the three offending files are: operations.hpp, functions.hpp, and interp_body.cpp. These are the files that are being "cut and paste" by the preprocessor into various source files.

Since this "cut and paste" of code already very macro heavy adding what's effectively an include guard isn't out-of-line as a solution. We can wait a couple days to think about it, but if a better solution doesn't present itself I think your proposal is fine.

27rabbitlt commented 1 year ago

@robbmcleod Thanks~ I need to re-think this for some days too, the solution is super ugly.

27rabbitlt commented 10 months ago

seems that no good solutions here unless develop numexpr 3.0. closed