Closed JulienBalianSonos closed 1 year ago
Hey! Is there a chance this will be merged soon? It was open for quite some time and we have actually hit this issue just now. Would be great to fix!
Hey! Is there a chance this will be merged soon? It was open for quite some time and we have actually hit this issue just now. Would be great to fix!
The reason this has not been merged is that I had some questions regarding the supposed fix to which I did not get an answer. Specifically, why is there a need for a class to be used in the code, can it be simplified to a function to provide the numpy import? It is not clear to me when that piece of code needs to run to be in the right time. Also another point that's not clear is why install_requires>=1.18 while setup_requires<=1.18? Isn't the result going to be that it will only work with numpy version 1.18? It seems quite a bit arbitrary to me. If you can provide answers for those and make it clear why exactly the request is as it is, then we can proceed with it.
Thanks for getting back to me! Let me try to answer those questions, although after taking a second look, I must say I'm a bit confused as well. I'm looking at this wrt line numbers: https://github.com/KhronosGroup/NNEF-Tools/commit/5caf8501c559bd823f53e1ba9d1e5d23e7533e75
why is there a need for a class to be used in the code, can it be simplified to a function to provide the numpy import
My best guess is that the author was inspired by this SO answer to a similar issue: https://stackoverflow.com/a/62724864/10243384.
The reason to use a class is to defer the execution of the import numpy
to a later time. This is done via a bit hacky way as it relies include_dirs
apparently stringifying inputs at a later point. With this implementation, the class instance is created when creating the module
object in line 29/36. But, and here's my guess, that object is cast to string (thus invoking the __str__
method) much later in the process, after numpy
is installed. It can not be just a function because a function would be executed right away when creating the module
object, causing an error due to missing numpy
.
Perhaps a bit better, albeit not necessarily less cryptic, solution would be one of: https://stackoverflow.com/a/21621689/10243384 or https://stackoverflow.com/a/60740731/10243384. It at least seems like those would use tools designed for this purpose.
Also another point that's not clear is why install_requires>=1.18 while setup_requires<=1.18?
This one, I don't know. I don't have experience with projects compiling C++ and Python code together and I think this might hide some specific things related to version 1.18. numpy 1.19.0 seems to have deprecated support for Python < 3.6 (source), which might be the reason why installation is locked to under 1.18 - the intent might have been to provide wider compatibility.
I'm not sure how a package like this would behave if it was built against one version of numpy and executed against another or whether that is even what would happen here. My intuition would be that those versions should matching. It's a good practice to add an upper-bound to supported package version and carefully update it. E.g. this could say numpy ~= 1.24.0
, which would allow versions >=1.24.0
and <1.25.0
, or numpy~1.24
which would allow versions >=1.24.0
and <2.0.0
. Depending on what's your intent and philosophy, I think this should rather specify minimal version of numpy that this code can run with and then an upper bound of either current minor or major version of numpy. I think this should be clear. If your code is using functions introduced in a certain numpy version, it will not work with previous versions of numpy so the lower boundary should be fixed to that. The upper boundary is more about stability and comfort level. You can test your code against current version of numpy and you can trust future versions of numpy will not introduce breaking changes. They shouldn't, but they might. And then you can have users complaining your package doesn't work with numpy version 1.xx.xx. Breaking changes are expected with a major version upgrade though, so numpy 2.0.0, if ever released, is likely to break compatibility. It'd be advised then to at least limit the supported version to <2.0.0.
Let me know what do you think. I'm happy to open a PR with different changes if we can agree on how this should look like. The reason I'm interested in this issue is that we are using a bit more advanced package manager - Poetry - which, unlike pip, is building the package egg in isolation. So it requires a proper listing of all dependencies in the setup.py
file. Right now we have a workaround for this, and I'm happy to contribute a fix if you think this is worthwhile.
My best guess is that the author was inspired by this SO answer to a similar issue: Thanks for that link, that does explain it. So I think given that we can leave it as it is and accept it.
I am also okay with leaving in install_requires>=1.18
, that seems to make sense. However, as I have read, setup_requires
is already a deprecated keyword argument, and as we don't quite understand its purpose (and I could not find clear documentation of it either), we could start with leaving that one out and see if it works. How about that?
Sounds good, I'll try to check it out on Friday and submit a new PR if it looks like things are working properly
Hello, I think @mt-krainski did a great job to summarize the contribution I tried to make.
The reason this has not been merged is that I had some questions regarding the supposed fix to which I did not get an answer.
@gyenesvi sorry, I didn't notice you tried to reach me for explanation, by curiosity what was the medium ?
Indeed setup_requires
seems officially deprecated ( I didn't notice since pip and poetry seems to still comply to it today), apparently the right way to do things those days are with additional: pyproject.toml and build-system.requires
(but I did not test this new solution).
EDIT: @gyenesvi given your last comment, maybe what confuse you is that there are 1 dependency list per distinct steps: one for the building of the package and one for the usage of the package (both need to include numpy in this case because you link against this lib at build time).
I am happy if @mt-krainski takes over on this. Have a nice day :) .
I didn't notice you tried to reach me for explanation, by curiosity what was the medium ?
I started a review for the MR (see above), I guessed you should have been notified by that.
EDIT: @gyenesvi given your last comment, maybe what confuse you is that there are 1 dependency list per distinct steps: one for the building of the package and one for the usage of the package (both need to include numpy in this case because you link against this lib at build time).
Yes, that's what I have been guessing as well, that those dependencies are for different phases. By the name I guessed that setup_requires
is for the setup itself to run, but that's exactly what this hack with the class is doing, making numpy accessible for the setup phase, at the only point where it's needed, no? So maybe it's not needed to use the setup_requires
flag in the end.
Anyway, I am fine if @mt-krainski tests it, and we can proceed from there.
I started a review for the MR (see above), I guessed you should have been notified by that.
Sadly I do not see any code comment or anything between last commit 3 mo ago and @mt-krainski first comment. Maybe not publishing/finishing the review do prevent me from seeing/being notified by it.
but that's exactly what this hack with the class is doing, making numpy accessible for the setup phase, at the only point where it's needed, no?
All of what I describe below is what happen with current PR, not the new pyproject ways to do thing
The problem here is the import of NumPy, and how setuptools works (and it's underlying std lib: distutils).
Let's imagine we start with a clean environment. This environment only contains setuptools (which is assumed to be almost mandatory package for setup, hence always installed in fresh environments).
In order to run build process (I only highlight few key steps for our concern):
It imports the setup module
Load & evaluate all module components and run the setup function that declare among other things build dependencies.
Once those infos are collected, an object of get_numpy_include
as been instantiated in our extension referenced in ext_modules, but not evaluated as a string (so at this stage NumPy import is not yet called)
it installs build dependencies which from my understanding is kind of a map of
package id with version of install_requires
, updated by setup_requires
package id and versions.
(you can convince yourself of this by removing the setup_requires
and trying an install, it should works by linking to last numpy version, then doing the same with setup_require
, and raising ValueError(numpy.__version__)
within the __str__
of get_numpy_include
it should show 1.18.0
)
and only then it starts to do the build properly speaking,
within which we build the ext_modules (cf: run method of build_ext )
which leads to forced evaluation of the include_dirs
of the Extension
'_nnef' as strings (resolving the include of numpy)
-> you can be convinced of this last step by the look of the stack trace of ValueError
I proposed to set as an experiment in previous bullet-point:
File "/home/epi/.pyenv/versions/3.7-dev/envs/nnef/lib/python3.7/site-packages/setuptools/command/build_ext.py", line 87, in run
_build_ext.run(self)
File "/home/epi/.pyenv/versions/3.7-dev/lib/python3.7/distutils/command/build_ext.py", line 340, in run
self.build_extensions()
File "/home/epi/.pyenv/versions/3.7-dev/lib/python3.7/distutils/command/build_ext.py", line 449, in build_extensions
self._build_extensions_serial()
File "/home/epi/.pyenv/versions/3.7-dev/lib/python3.7/distutils/command/build_ext.py", line 474, in _build_extensions_serial
self.build_extension(ext)
File "/home/epi/.pyenv/versions/3.7-dev/envs/nnef/lib/python3.7/site-packages/setuptools/command/build_ext.py", line 208, in build_extension
_build_ext.build_extension(self, ext)
File "/home/epi/.pyenv/versions/3.7-dev/lib/python3.7/distutils/command/build_ext.py", line 534, in build_extension
depends=ext.depends)
File "/home/epi/.pyenv/versions/3.7-dev/lib/python3.7/distutils/ccompiler.py", line 566, in compile
depends, extra_postargs)
File "/home/epi/.pyenv/versions/3.7-dev/lib/python3.7/distutils/ccompiler.py", line 341, in _setup_compile
pp_opts = gen_preprocess_options(macros, incdirs)
File "/home/epi/.pyenv/versions/3.7-dev/lib/python3.7/distutils/ccompiler.py", line 1076, in gen_preprocess_options
pp_opts.append("-I%s" % dir)
File "/home/epi/SONOS/src/NNEF-Tools/parser/python/setup.py", line 35, in __str__
raise ValueError(numpy.__version__)
ValueError: 1.18.0
So this is why I used this string trick.
About the 2 other solutions for the same problem listed before by @mt-krainski | Perhaps a bit better, albeit not necessarily less cryptic, solution would be one of: https://stackoverflow.com/a/21621689/10243384 or https://stackoverflow.com/a/60740731/10243384. It at least seems like those would use tools designed for this purpose.
On why, we link against an old NumPy version the answer is related to the idea "not robustly" tested though, of backward compatible versions so that when you distribute your built package (by example from your local or global PyPI server) to different final clients, they may have varying upper or equal version of NumPy without need to rebuild package (no CPP dev tooling needs for them)
Also, the upper bound for NumPy should be set under 2.0, this is a very valid point of @mt-krainski.
Hey all, sorry, I wasn't able to sit down to this yet and I'm going for holidays tomorrow, I'll be out for around a week. I can get back to this after I'm back - I'd hope to provide some input by April 21. Alternatively, feel free to proceed without me!
@gyenesvi, I opened a new PR for this change: https://github.com/KhronosGroup/NNEF-Tools/pull/164 please let me know if you have any questions.
Dropping this in favor of #164.
Why
nnef
python package installation breaks with "missing numpy" failure when using this package as dependency without numpy preinstalled (which is problematic for venvs).We also add explicit NumPy dependency, since NumPy usage is not limited to installation.