hpyproject / hpy

HPy: a better API for Python
https://hpyproject.org
MIT License
1.02k stars 52 forks source link

Emit HPy ABI version into the binary and check it when loading a module #387

Closed steve-s closed 1 year ago

steve-s commented 1 year ago

371 "Introduce the Hybrid ABI" introduced:

This PR adds: 1) another constant HPY_ABI_VERSION_MINOR (see below) 2) when compiling HPy extension: add exported symbols with values of HPY_ABI_VERSION and HPY_ABI_VERSION_MINOR == ABI versions of HPy that compiled the extension 3) when loading HPy extension: first read those symbols and compare them to its own versions

Details:

When compiling an extension HPy_MODINIT macro creates exported symbols:

export uint32_t get_required_hpy_major_version_mymodule() { return HPY_ABI_VERSION; } // expands to 0 at the moment
export uint32_t get_required_hpy_minor_version_mymodule() { return HPY_ABI_VERSION_MINOR; } // expands to 0 at the moment

The suggestion is that we have two ABI versions. Major versions are ABI incompatible, higher minor version is ABI compatible to lower minor versions, i.e., when adding something at the end of HPyContext => minor++. This is useful, because when checking compatibility one can do something like expected_major == actual_major && expected_minor <= actual_minor.

I expect that we will likely change minor version often, OTOH, introducing ABI breaking changes will happen rarely. Note that with this scheme, one HPy implementation can support multiple ABI versions and the only thing we cannot ever change are the names and types of the two exported symbols. We can turn HPy upside down and move from handles to reference counting and still stay binary compatible.

Question is if we want to encode also minor version in the ABI tag? CPython has abi3 tag, but it also does binary compatible additions. This is solved by also adding Python version, so cp37-abi3 is tag for stable ABI with all the functions available in Python 3.7 stable ABI, it can run on any Python 3.7+. IIRC the cp37 part is sometimes dropped and it's just abi3?

When loading an extension:

I was re-reading this article: https://blog.trailofbits.com/2022/11/15/python-wheels-abi-abi3audit/ and it seems to me that having the ABI version at two places (inside the extension as exported symbols and in the extension filename) is actually a good thing that would prevent the security issues (segfaults) when something goes wrong during packaging/distribution like it happens with CPython stable ABI.

mattip commented 1 year ago

... it seems to me that having the ABI version at two places (inside the extension as exported symbols and in the extension filename) is actually a good thing

It took me a while to understand this, and now I agree it would be a good thing. Would it help non-c-based languages to export functions int get_required_hpy_major_version_mymodule(), int get_required_hpy_minor_version_mymodule() rather than as constants?

steve-s commented 1 year ago

Would it help non-c-based languages to export functions int get_required_hpy_major_version_mymodule(), int get_required_hpy_minor_version_mymodule() rather than as constants?

Absolutely. I missed the non-c languages argument when thinking about constants vs some getter functions.

steve-s commented 1 year ago

No merge conflict, green gates, review comment fixed. This is ready for final review & merge. The open question whether we want to include minor ABI version in the tag can be addressed in another PR.

/cc @antocuni, @mattip

mattip commented 1 year ago

Thanks @steve-s

fangerer commented 1 year ago

AFAIK, this also closes issue #374