EricR86 closed 3 years ago
The Genomedata Docker testing/build image needs to be updated to include the Python development packages.
That's awesome.
The extension must be compiled currently with the additional -UNDEBUG to keep current assert macros from compiling out since they wrap function calls with side effects that also have return values.
I think using assert() for this purpose was a mistake (my mistake, obviously). See #47.
Instead of adding -UNDEBUG, can you #undef NDEBUG at the top of the C file? I think that will be safer and more local.
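For reference, a build-level flag like the one discussed above is commonly set through the extension definition in setup.py. The following is a minimal sketch, assuming setuptools is used; the module name and source path are hypothetical placeholders, not taken from this PR.

```python
from setuptools import Extension

# Hypothetical extension definition; the name and source path are
# illustrative only and do not come from the Genomedata repository.
extension = Extension(
    name="_c_load_data_example",
    sources=["src/example_load_data.c"],
    # Each entry here is passed to the compiler as -U<macro>, so this
    # requests -UNDEBUG and keeps assert() active in the compiled code.
    undef_macros=["NDEBUG"],
)
```

By contrast, placing #undef NDEBUG before the first include of assert.h in the C source has the same effect on that file without depending on how the build is invoked, which is the locality argument made above.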
It is possible to get this simple Python extension working in both Python 2 and 3, but as it stands it does not work under Python 2. See https://docs.python.org/2/howto/cporting.html?highlight=pymoduledef
I've disabled the Python 2 build, for now, to verify everything else works.
After responses to feedback and a decision on Python version portability I will clean up.
Python 2 has been EOL for 15 months. Avoiding this extra work is a good enough reason to toss Python 2 compatibility aside at this point.
Done with my replies.
I will remove Python 2 from the Genomedata docker build image and update the regression test environment accordingly.
This PR is now ready for review @michaelmhoffman
Of note, I've bumped the minimum Python 3 version to 3.7, as it is the minimum version supported by the current NumPy.
The Python 3 C API has been stable from 3.2 onwards and the previous minimum version we supported was 3.4.
This PR replaces genomedata-load-data, previously a C binary built through distutils, with a Python script that calls an entry point in a Python extension implemented as a C shared library.
This has some immediate benefits:
- pip install -e . works and enables debugging on the extension as well.
- wheels which should simplify and speed up installation.

Some important notes:
- genomedata-load-data is slightly different in formatting and help text. It also currently does not have -? as an option for --help, while the previous binary version did.
- The extension must be compiled currently with the additional -UNDEBUG to keep current assert macros from compiling out since they wrap function calls with side effects that also have return values.
- genomedata-load-data and as such the C extension function call reads directly from stdin and was marked with an underscore to signal that it's generally not a good Python function to call.

Profiling
[Profiling output: Python C extension; C binary]
There does not seem to be a significant change in performance from a cursory profiling test.
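On the missing -? help alias noted above: if the Python script parses its arguments with argparse (an assumption; the description does not say), the alias could be restored roughly as follows. This is a minimal sketch, not the PR's actual code, and build_parser is a hypothetical helper name.

```python
import argparse


def build_parser():
    """Build a parser that accepts -h, -? and --help as help flags."""
    # Disable the automatic -h/--help so the help action can be
    # registered manually with the extra -? alias.
    parser = argparse.ArgumentParser(
        prog="genomedata-load-data",
        add_help=False,
    )
    parser.add_argument(
        "-h", "-?", "--help",
        action="help",
        help="show this help message and exit",
    )
    return parser


if __name__ == "__main__":
    build_parser().parse_args()
```

Any of the three flags then prints the usage text and exits with status 0, matching the behaviour described for the previous binary.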