hpyproject / numpy-hpy

The fundamental package for scientific computing with Python.
https://numpy.org
Other
20 stars 1 forks source link

Building with PyPy #15

Open mattip opened 11 months ago

mattip commented 11 months ago

When building, I get lots of warnings about using forbidden API. I think this is because the code builds with HPY_ABI_UNIVERSAL. Shouldn't it use HPY_ABI_HYBRID until the migration is complete?

mattip commented 11 months ago

Hmm. Tracing where this is defined, it seems it is here https://github.com/hpyproject/numpy-hpy/blob/a1f0ad065b0ed391c13a8ba4d70a1378cb422da7/numpy/distutils/command/build_ext.py#L84

What defines self.distribution.hpy_abi?

mattip commented 11 months ago

Maybe there should be a command line option to override it?

mattip commented 11 months ago

There is supposed to be a command line option:

python setup.py build_ext --help | grep hpy
Running from numpy source directory.
/tmp/venv3-hpy/lib/pypy3.9/site-packages/setuptools/installer.py:27: SetuptoolsDeprecationWarning: setuptools.installer is deprecated. Requirements should be satisfied by a PEP 517 installer.
  warnings.warn(
  --hpy-abi             Specify the HPy ABI mode (default: universal)
  --hpy-no-static-libs  Compile context and extra sources with extension

but it doesn't work:

python setup.py build_ext --hpy-abi hybrid
...
running build_ext
error: error in command line: command 'new_build_ext_hpy' has no such option 'hpy_abi'
mattip commented 11 months ago

I can work around this by forcing DEFAULT_HPY_ABI = 'hybrid' in the installed hpy (<base pypy>/lib/pypy3.9/hpy/devel/__init__.py). But now I hit a compilation error:

INFO: compile options: '-D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE=1 -D_LARGEFILE64_SOURCE=1 -DNPY_NO_DEPRECATED_API=0 -DHPY -DHPY_ABI_HYBRID -DHPY_ABI_CPYTHON -Inumpy/random -Inumpy/random/src -I/home/matti/oss/pypy3.9-hpy/lib/pypy3.9/hpy/devel/include -Inumpy/core/include -Ibuild/src.linux-x86_64-3.9/numpy/core/include/numpy -Ibuild/src.linux-x86_64-3.9/numpy/distutils/include -Inumpy/core/src/common -Inumpy/core/src -Inumpy/core -Inumpy/core/src/npymath -Inumpy/core/src/multiarray -Inumpy/core/src/umath -Inumpy/core/src/npysort -Inumpy/core/src/_simd -I/tmp/venv3-hpy/include -I/home/matti/oss/pypy3.9-hpy/include/pypy3.9 -Ibuild/src.linux-x86_64-3.9/numpy/core/src/common -Ibuild/src.linux-x86_64-3.9/numpy/core/src/npymath -c'
extra options: '-U__GNUC_GNU_INLINE__ -std=c99 -msse -msse2 -msse3'
INFO: gcc: /home/matti/oss/pypy3.9-hpy/lib/pypy3.9/hpy/devel/src/runtime/argparse.c
INFO: gcc: numpy/random/_generator.c
INFO: gcc: /home/matti/oss/pypy3.9-hpy/lib/pypy3.9/hpy/devel/src/runtime/buildvalue.c
INFO: gcc: /home/matti/oss/pypy3.9-hpy/lib/pypy3.9/hpy/devel/src/runtime/format.c
In file included from /home/matti/oss/pypy3.9-hpy/lib/pypy3.9/hpy/devel/src/runtime/argparse.c:138:
/home/matti/oss/pypy3.9-hpy/lib/pypy3.9/hpy/devel/include/hpy.h:38:6: error: #error "Conflicting macros are defined: HPY_ABI_CPYTHON and HPY_ABI_HYBRID"
   38 | #    error "Conflicting macros are defined: HPY_ABI_CPYTHON and HPY_ABI_HYBRID"
mattip commented 11 months ago

The HPY_ABI_CPYTHON macro is unconditionally defined here: https://github.com/hpyproject/numpy-hpy/blob/a1f0ad065b0ed391c13a8ba4d70a1378cb422da7/numpy/core/setup.py#L751

Is that on purpose?

mattip commented 11 months ago

Also this line is tripping up the build, it should be 0.9.0 not a git ref https://github.com/hpyproject/numpy-hpy/blob/a1f0ad065b0ed391c13a8ba4d70a1378cb422da7/pyproject.toml#L9

mattip commented 11 months ago

I pushed some fixes in 23f4a1898d, building succeeds but import still fails on PyPy

mattip commented 11 months ago

Something is off with the interaction of HPyGlobal_Store and HPyField_Load on PyPy, it seems the global store is not preventing objects from being garbage collected. This is with my latest commit. Maybe we could simplify the code, and not both create a global and also store the same handle as a field on the dtype->singleton?

mattip commented 10 months ago

In order for the PyPy GC to recognize HPyFields, the call to HPyField_Store and HPyField_Load must be also given the HPyObject holding the relevant storage struct. But in this code the HPyField is in the dt_slots struct held by PyArray_DTypeMeta *new_dtype_data and not on new_dtype_data itself. This is the HPyField that later causes HPyField_Load to fail. https://github.com/hpyproject/numpy-hpy/blob/23f4a1898d37af3dd3525144eb800c64abfdb495/numpy/core/src/multiarray/dtypemeta.c#L750-L757

I found this by adding debug code inside HPyField_Store that checks that the storage pointer is reasonably "close" to the HPyField value (after discussion with @antocuni).

I do not see a way that PyPy can chase down the HPyField hidden in the void *slots struct. I think that for the HPy port we must embed the struct inside the PyArray_DTypeMeta. https://github.com/hpyproject/numpy-hpy/blob/23f4a1898d37af3dd3525144eb800c64abfdb495/numpy/core/include/numpy/experimental_dtype_api.h#L161-L171.

antocuni commented 10 months ago

In order for the PyPy GC to recognize HPyFields, the call to HPyField_Store and HPyField_Load must be also given the HPyObject holding the relevant storage struct. But in this code the HPyField is in the dt_slots struct held by PyArray_DTypeMeta *new_dtype_data and not on new_dtype_data itself. This is the HPyField that later causes HPyField_Load to fail.

does this mean that the numpy code is buggy and that it happens to work "by chance" on CPython and GraalPy, but it actually violates the contract? If so, then the ideal solution would be to implement a relevant check in the debug mode, if it's possible.

cfbolz commented 10 months ago

The docs say this:

HPyFields should be used ONLY in parts of memory which is known to the GC, e.g. memory allocated by HPy_New

Which is not the case in the example here, the memory is allocated by MEM_MALLOC. So either the docs are wrong or the code. However, HPy_New isn't documented at all so far, so the current restrictions are quite unclear to me.

mattip commented 10 months ago

The code is "wrong", but useful, and it works on HPy + CPython and on HPy + GraalPython.

antocuni commented 10 months ago

ok, I had a deeper look at the code. I confirm that the code is "wrong", i.e. does not respect the HPy contract, at least in the way that we/I designed it. It's totally possible that the docs are not complete or clear, but then we should fix the docs.

It's also possible that the current HPy API is not powerful enough to cover this case. But I need to think more about this because there is still something not 100% clear in my mind

antocuni commented 10 months ago

ok, I'm confused now. Looking at the code, I retract what I said earlier:

I confirm that the code is "wrong", i.e. does not respect the HPy contract, at least in the way that we/I designed it.

I think that the code is correct, because HPy_VISIT correctly reports the &dtslots->castingimpls. I think that the docs are wrong, it should be fine to put an HPyField inside a malloc()ed memory as long as it is reachable from a GC-managed memory and tp_traverse knows about it, which is the case.

So, the summary is: I don't know what's going on :) Could be a numpy bug, an hpy bug or a pypy bug :(

mattip commented 9 months ago

Closing. There were a lot of problems with PyPy, I think I have gotten most of them. There is a missing tp_traverse on the New_PyArrayDescr_spec (which derives from the New_PyArrayDescr_spec_prototype), since PyArray_Descr has HPyFields, but that is a different issue.