indygreg / PyOxidizer

A modern Python application packaging and distribution tool
Mozilla Public License 2.0
5.49k stars 239 forks source link

PyO3 support #324

Open davidhewitt opened 4 years ago

davidhewitt commented 4 years ago

A PyO3 maintainer here unfamiliar with PyOxidizer, just discovered it this afternoon. πŸ‘‹

Looks like a really cool project! We reasonably often have pain with users who want to embed Python in Rust programs; there's all kinds of subtleties around static linking etc. So it'd be great to understand a little bit about what would be necessary for PyOxidizer to work with PyO3 projects (if it doesn't already).

I'd be very happy to help out with any implementation work needed and also add some notes to the PyO3 guide / add an example in the PyO3 repo.

Also, while reading your docs I noticed this section: https://pyoxidizer.readthedocs.io/en/stable/status.html?highlight=pyo3#pyo3

PyO3

PyO3 are alternate Rust bindings to Python from rust-cpython, which is what pyembed currently uses.

The PyO3 bindings seem to be ergonomically better than rust-cpython. PyOxidizer may switch to PyO3 someday.

I'd be very happy to answer any questions you have about using PyO3 in pyembed & help with the implementation if you're interested in switching. IMO helping with downstream usage from time to time gives me ideas how to improve the ergonomics in PyO3!

indygreg commented 4 years ago

πŸ‘‹!

It should be possible for binaries created with PyOxidizer to load extensions built against PyO3. But it all comes down to how things are linked. As I'm sure you know, this can be a thorny problem space. If extensions are built in a way that is compatible with PyPI supported binary wheels, things should just work. But if we have to recompile from source to get static linking, I expect this to be a minefield. (The Python extension module building ecosystem doesn't really account for the possibility of statically linked extensions, for better or worse.)

PyOxidizer's run-time component (the pyembed crate) currently uses the cpython + python3-sys crates. I would love to move these to PyO3 at some point. I just bumped the minimum Rust version requirement, unblocking us from adopting PyO3.

I was actually planning to spend a few hours looking at things and open an issue against PyO3 listing all the changes we'd need. But you seem to have beat me to it :)

I'm unsure what the extensive list of changes to PyO3 I'll need is. But I do know there will be changes to build.rs to facilitate the linking control PyOxidizer needs. There are also likely missing CPython bindings that will need to be added. https://github.com/dgrunwald/rust-cpython/pulls?q=is%3Apr+author%3Aindygreg shows all the PRs I made to rust-cpython. https://github.com/dgrunwald/rust-cpython/pull/193 is the main one for build.rs linking control. And the bindings to the PEP-587 initialization APIs are also critical. Most of my PRs there are in support of PyOxidizer.

One bridge that needs to be crossed at some point is better support for cross-compiling. The way that both rust-cpython and PyO3 sniff Python interpreter features in their build.rs is invoking a Python interpreter and running some Python code. To support cross-compiling (which PyOxidizer's internals are designed to support), we can't do that. I'd like to see a way to specify some kind of static file that build.rs can read to provide answers to all the questions it seeks answers to. That, or a myriad of Cargo features to do the same. If this feature isn't available, PyOxidizer could compile a host native binary that reacts to the PYTHON_SYS_EXECUTABLE -c <code> invocations the build.rs perform. But this is super hacky. I'd much prefer emitting a static JSON/TOML/etc file and setting an environment variable to tell build.rs pick up Python interpreter metadata from here instead of running an interpreter.

If I had to start from scratch, I'd probably abandon the myriad of Cargo features I added to python3-sys/cpython and go with a single feature to opt in to a build.rs run mode where all Python interpreter config is obtained from a static file at a well-defined location or defined in an environment variable. IMO this is the cleanest approach, as it doesn't require polluting the Cargo features space with a bunch of features that most end-users don't care about and will realistically only confuse regular users.

Hope this helps!

davidhewitt commented 4 years ago

What an interesting response, sorry for my slow reply! Let me respond to some pieces.

If extensions are built in a way that is compatible with PyPI supported binary wheels, things should just work.

Great! I'm going to try to add a PyOxidizer example to the PyO3 repo sometime soon. Expect questions πŸ˜„

PyOxidizer's run-time component (the pyembed crate) currently uses the cpython + python3-sys crates. I would love to move these to PyO3 at some point. I just bumped the minimum Rust version requirement, unblocking us from adopting PyO3.

Whew I see you bumped to 1.45 - I just did the same in PyO3 - that's lucky! And awesome that you're interesting in trying out PyO3; would it be helpful if I opened some PRs to PyOxidizer to do this (no guarantees on it happening fast though πŸ˜›)? Would you want pyembed to support both cpython and pyo3, or migrate it completely?

bindings to the PEP-587 initialization APIs are also critical.

By complete fluke I just merged these the other day (untested though as yet) https://github.com/PyO3/pyo3/pull/1247

One bridge that needs to be crossed at some point is better support for cross-compiling. The way that both rust-cpython and PyO3 sniff Python interpreter features in their build.rs is invoking a Python interpreter and running some Python code.

We recently added a method to PyO3's build.rs for cross-compiling where it sniffs the sysconfig data file instead, so that it doesn't need to invoke a Python interpreter. I think this only works for unix targets though (we had someone who wanted to build an android project).

If I had to start from scratch, I'd probably abandon the myriad of Cargo features I added to python3-sys/cpython and go with a single feature to opt in to a build.rs run mode where all Python interpreter config is obtained from a static file at a well-defined location or defined in an environment variable.

I'd be very interested to hear more about this. It's not exactly your use case, but there's some open questions at the moment about how we configure the minimum / max python version when building portable abi3 wheels. We're dangerously close to merging a swathe of features for specifying the ABI3 version, though I am also wondering whether an environment variable would be a better idea... https://github.com/PyO3/pyo3/pull/1263 https://github.com/PyO3/pyo3/issues/1195

As another thought, I've been wondering about splitting PyO3's build.rs logic into a separate crate, which might allow downstream crates to use the config at their build time too. Not sure whether that'd be interesting for PyOxidizer?

indygreg commented 4 years ago

Whew I see you bumped to 1.45 - I just did the same in PyO3 - that's lucky! And awesome that you're interesting in trying out PyO3; would it be helpful if I opened some PRs to PyOxidizer to do this (no guarantees on it happening fast though πŸ˜›)? Would you want pyembed to support both cpython and pyo3, or migrate it completely?

I would almost certainly migrate to pyo3 completely and drop support for cpython: there's just too much code to maintain compatibility with both.

If you want to submit some PRs, I won't stop you. But there's a lot of code that needs changed and I likely won't merge/push anything to the main branch until the cpython -> pyo3 migration is complete. We're likely looking at a long-running branch with several dozen commits to perform the port.

I think a good place to start would be getting the various python3_sys:: references (usually aliased as pyffi in pyembed) working with pyo3 equivalent. From there, pick a random file defining code available to Python. python_resource_types.rs and python_resource_collector.rs seem like good candidates. The code in importer.rs is much more gnarly.

By complete fluke I just merged these the other day (untested though as yet) PyO3/pyo3#1247

\o/

We recently added a method to PyO3's build.rs for cross-compiling where it sniffs the sysconfig data file instead, so that it doesn't need to invoke a Python interpreter. I think this only works for unix targets though (we had someone who wanted to build an android project).

The sysconfig file is gnarly enough that I'm a bit skeptical on parsing it from Rust, especially for advanced use cases. The direction I took in https://github.com/indygreg/python-build-standalone was to populate the distribution's JSON descriptor file with all sysconfig values.

I'd be very interested to hear more about this. It's not exactly your use case, but there's some open questions at the moment about how we configure the minimum / max python version when building portable abi3 wheels. We're dangerously close to merging a swathe of features for specifying the ABI3 version, though I am also wondering whether an environment variable would be a better idea... PyO3/pyo3#1263 PyO3/pyo3#1195

ABI compatibility is probably special enough to warrant its own feature flags. But there's a raft of low-level and niche features where I think it is better to define everything via an out-of-band machine readable file.

As another thought, I've been wondering about splitting PyO3's build.rs logic into a separate crate, which might allow downstream crates to use the config at their build time too. Not sure whether that'd be interesting for PyOxidizer?

Possibly? But a lot of that logic entails running a Python interpreter to derive interpreter features. PyOxidizer already knows these features because they are statically defined in the special Python distributions produced by https://github.com/indygreg/python-build-standalone. That's why I'd prefer a build mode for python3-sys/cpython/pyo3 that is literally use this explicit configuration that I'm telling you to use.

FWIW it is possible to override a build script (https://doc.rust-lang.org/cargo/reference/build-scripts.html#overriding-build-scripts). For reasons I can't recall, I didn't pursue this path with PyOxidizer. Possibly because I need something more dynamic and having to maintain per-target boilerplate doesn't scale well? I really wish Cargo would let me define an alternative build script for a given crate or the path to an output file containing cargo: lines. This would solve all my problems...

wkschwartz commented 3 years ago

Another reason to switch to PyO3: dgrunwald/rust-cpython#246

davidhewitt commented 3 years ago

Just as a follow-up: I still would very much like to help out with this, however I unexpectedly became extremely busy just before Christmas. I'm starved of time as it is just with PyO3 maintenance, so the reality is that I won't find time to actively carry out a downstream port myself any time soon.

If things free up in the future I'll do my best to come along and help out, though just wanted to let you know in case you were waiting for me to think about this ticket.

indygreg commented 3 years ago

FWIW I've been porting https://github.com/indygreg/python-zstandard to Rust using PyO3. (Not all work has been pushed yet.) Since putting in that work, I feel like I now have a good understanding of PyO3 and its differences from rust-cpython! I do find PyO3's interface to be much more pleasant!

I'm not sure when I'll get around to hacking on PyOxidizer or its PyO3 port. But it is definitely something I want to pursue after the next major release. (I am anticipating having to send PRs to PyO3 for missing functionality and I don't want to be blocked on releasing PyOxidizer due to waiting on a PyO3 release with the needed missing features.)

davidhewitt commented 3 years ago

πŸ‘

FWIW I intend to push a PR in the not too distant future to allow PyO3's build to be configured by a file (probably JSON) - I'll be sure to cc you when that happens to make sure that we capture everything that you might need.

indygreg commented 3 years ago

I just released PyOxidizer 0.11. That means PyOxidizer can track a fork of PyO3 if needed without having to worry about it pushing back the next release by much.

I've also started a local branch to begin the port to PyO3. Should I keep a running list of issues here or would you prefer I do it in PyO3's issue tracker? If the latter, would you prefer a single issue or separate issues per topic?

Some proposed changes I imagine will be a bit controversial and I'd prefer to get the blessing of a PyO3 maintainer before I code up any PRs.

The first major thing I found was various structs in initconfig.rs missing Default implementations (see https://github.com/dgrunwald/rust-cpython/blob/06e61097a798bf9203884d3bb5ffddcf3296a3b2/python3-sys/src/initconfig.rs#L166). A questionable feature. But it helps a lot with ergonomics, as you don't need to type out a few dozen parameter values when constructing instances.

wkschwartz commented 3 years ago

The first major thing I found was various structs in initconfig.rs missing Default implementations (see https://github.com/dgrunwald/rust-cpython/blob/06e61097a798bf9203884d3bb5ffddcf3296a3b2/python3-sys/src/initconfig.rs#L166). A questionable feature. But it helps a lot with ergonomics, as you don't need to type out a few dozen parameter values when constructing instances.

PyPreConfig and PyConfig would be the two that might be eligible from Default impls. Just initializing them to zero, as rust-cpython does, is not sufficient. In the case of PyConfig, it's also undefined behavior (it contains pointers; memsetting a pointer to the zero bit pattern is undefined behavior in C).

More importantly, there are two different defaults, the Python configuration and the isolated configuration. The in-development docs show defaults for all the fields (some of which are not zero!), but some of them, such as PyConfig.configure_c_stdio, depend on which configuration you're using. Hiding that choice behind a Default impl strikes me as being rather unidiomatic in Rust because you would no longer take advantage of the type system's ability to force users to confront the semantics (e.g., Path is based on OsStr not str even though str would be convenient and usually sufficient, but str isn't always sufficient, so Rust provides separate types and forces users to convert explicitly).

Instead you could achieve nearly the same convenience as a Default impl with two constructors. For example, you could add constructors for PyConfig with one that calls PyConfig_InitPythonConfig, and one that calls PyConfig_InitIsolatedConfig.

indygreg commented 3 years ago

The Default impl is definitely a bad practice. And I realized a few minutes after typing the last comment that I can simply use std::mem::zeroed() without support for Default.

The correct way to initialize these data structures is the PyConfig_Init*() functions, as you've noted. However, at the ffi layer you need a pointer to a struct. And Rust makes you initialize fields of a struct. So you need some kind of struct initialization in Rust land. Perhaps the proper thing is a higher-level API that does the std::mem::zeroed() initialization, calls PyConfig_Init*(), and returns the newly-created struct? Or is there a better pattern for this? (Again, I don't think it makes sense to set each struct field because that's the C API's job!)

wkschwartz commented 3 years ago

Perhaps the proper thing is a higher-level API that does the std::mem::zeroed() initialization, calls PyConfig_Init*(), and returns the newly-created struct? Or is there a better pattern for this?

As I said above, I'm concerned that using std::mem::zeroed() for PyConfig is undefined behavior. The better approach is the two constructors I described. For example:

use std::mem::MaybeUninit;  
impl PyPreConfig {
  /// Create a new instance initialized with the [Python configuration].
  ///
  /// [Python configuration]: https://docs.python.org/3/c-api/init_config.html#init-python-config
  fn python() -> Self {
    let mut ppc = MaybeUninit::uninit();
    unsafe {
      PyPreConfig_InitPythonConfig(ppc.as_mut_ptr());
      ppc.assume_init()
    }
  }

  /// Create a new instance initialized with the [isolated configuration].
  ///
  /// [isolated configuration]: https://docs.python.org/3/c-api/init_config.html#isolated-configuration
  fn isolated() -> Self {
    let mut ppc = MaybeUninit::uninit();
    unsafe {
      PyPreConfig_InitIsolatedConfig(ppc.as_mut_ptr());
      ppc.assume_init()
    }
  }
}
indygreg commented 3 years ago

So, I was hoping to be able to incrementally port the code, using microcommits that still compiled and passed the test suite. My initial work was promising albeit ugly (there were lots of marshaling of PyObject* between the 2 crates). I probably got too far into porting without running the tests because when I did things blew up fast.

The crux of the problem is that the cpython crate has a PyObject::drop() that acquires the GIL. And since it is using the RAII GILGuard, the GIL is released after drop() runs. And pyo3 maintains its own state of when it thinks the GIL is held (it also asserts its state in debug builds it appears). So what's happening is that as soon as we have a cpython::PyObject call drop(), the GIL gets released and all hell breaks loose.

I don't believe there is any way I can perform a working incremental port without hacking up the GIL behavior of cpython and/or pyo3. I think Python will let you do things without the GIL held (I don't believe CPython actually validates the GIL is held as much as you think it would). So I'm leaning towards running a fork of pyo3 to disable the GIL assertions. (I have to run a fork anyway until needed patches make it upstream.) That being said, the cpython crate's behavior of taking the GIL during PyObject::drop() feels objectively questionable to me. So maybe that's the best course of action.

@davidhewitt I assume you've seen this story play out before. Do you have any recommendations?

davidhewitt commented 3 years ago

Thanks for the discussion - sorry it's taken me a couple days to reply.


R.e. tracking the port here or in PyO3, I'd suggest tracking the list of changes to PyO3 which PyOxidizer needs in this issue (or a similar one in PyOxidizer) because it's relevant primarily to PyOxidizer rather than PyO3.

However that list of changes should probably link to various relevant issues on the PyO3 bug tracker. For example the discussion above about PyPreConfig is essentially a PyO3 design question. It would be helpful for future users of PyO3 if they can find the design discussions leading to any PRs on the PyO3 bug tracker.


... with the above in mind, I'm going to open an issue on the PyO3 bug tracker shortly linking back to this discussion and we can narrow down a desirable design for PyPreConfig etc. there. I think @wkschwartz is proposing the correct solution, though I have a couple of alternatives which we should weight up against briefly.


@indygreg regarding the cpython::PyObject::drop() issue - do you have a branch pushed which I can play with? I am suprised that there's such a big incompatibility here, because the PyGILState_{Ensure,Release} pair of APIs increment / decrement a "gil count" rather than unconditionally acquiring and releasing the GIL. I would need to check the cpython implementation in case it works differently to what I expect.

I've seen codebases before in the process of migration which didn't have any such compatibility issue. So I'd be very happy to help debug whatever is occurring here.

davidhewitt commented 3 years ago

... created https://github.com/PyO3/pyo3/issues/1474

indygreg commented 3 years ago

Created https://github.com/PyO3/pyo3/pull/1475 for ffi definitions for missing import functionality.

indygreg commented 3 years ago

I just pushed a pyo3 branch (https://github.com/indygreg/PyOxidizer/tree/pyo3) that ports the pyembed crate to PyO3. It compiles and runs most of the pyembed crate tests without issue!

Known issues:

I still have yet to remove the cpython crate dependency and get PyO3 building with a custom config file. Aspects of this may require some new upstream features in PyO3. I won't know until I try. But just getting pyembed to build and pass most tests with PyO3 is a huge milestone and hopefully indicates we're near the finish line!

FWIW the porting process was pretty straightforward, albeit a bit laborious. I did lose an hour or two to debugging a crash related to GILPool attempting to decref some pointers after the interpreter was finalized. I fixed this by ensuring our held GILGuard was dropped before calling Py_FinalizeEx(). I haven't performed any performance comparisons yet: I need to get the PyOxidizer build functionality working first.

davidhewitt commented 3 years ago

Awesome progress! I think you're exercising some less-commonly-used parts of PyO3 - which is great, however I must apologize for the teething problems 😬

I'd love to review the pyo3 branch when it's ready to be put forward as a full PR. I can offer anything I consider idiomatic potentially missed during the porting process. Similarly I can help figure out any config file changes needed.

R.E. performance - I'd hope we're similar to cpython, however I wouldn't be suprised if there's areas where we're a bit behind thanks to https://github.com/PyO3/pyo3/issues/1056. It's a continuous uphill progression; 0.14 is significantly faster than 0.13 in many areas. I think the next steps for PyO3 will be a round of feature work, however I will eventually solve the "GilPool" design in as backwards-compatible way as I can. I'd be very interested to see what your benchmarks are if you make any, and see if exposes quick wins we can make.

rupurt commented 9 months ago

It looks like the PyO3 issues originally linked have been fixed. What does the integration look like currently?