PyO3 / pyo3

Rust bindings for the Python interpreter
https://pyo3.rs
Apache License 2.0
12.12k stars 745 forks source link

FFI module cleanup #1289

Open davidhewitt opened 3 years ago

davidhewitt commented 3 years ago

The modules in src/ffi have become a bit out-of-date, and also quite disorganised so it's kinda hard to figure out what's missing.

Since Python 3.9 the cpython include files have been split so that (roughly) the limited api is in cpython/Include/ and the unlimited api is in cpython/Include/cpython/. I propose for simplicity in maintaining the pyo3::ffi module we should reorganise code to match that structure.

I suggest that we do the following things:

The idea with each // skipped foo being in alphabetical order is that it's then hopefully easier for us to compare against upstream in the future.

With the new cpython folder for the unlimited api, it might be reduce a lot of #[cfg(Py_LIMITED_API)] chatter as we'll only need that once on the mod cpython definition.

Also, it could be interesting to not have pub use self::cpython::* in ffi/mod.rs, so that users are forced to use unlimited api symbols as ffi::cpython::foo. But that's probably quite breaking, so I can go either way on it.

Perhaps in general the above guidelines of how to organise our ffi modules should go in CONTRIBUTING.md ?

nw0 commented 3 years ago

I've started implementing this, and ran into a couple of questions:

  1. Do we want to mark the Py_DEPRECATED functions in any way? I'm not sure whether this is useful or desired in PyO3, but this would be a convenient time to do it.
  2. There are a good number of ifndef Py_LIMITED_API blocks in the cpython/Include/*.h headers. I'm inclined to leave them in place for now: is this desired? The best thing to do might be to file PRs against cpython to move them where possible, I guess.
davidhewitt commented 3 years ago

@nw0 thanks so much for being willing to help us out! In response to your questions:

  1. With Py_DEPRECATED functions I think:

    • any we currently have ffi bindings for, we should add #[deprecated()] also.
    • any that we are missing, let's skip adding.
  2. With #ifndef Py_LIMITED_API inside the cpython/Include/*.h headers - that's quite suprising as I thought that those headers are only #include-d if it's not Py_LIMITED_API. I guess we'll have to match anything that cpython does with those definitions. If you want to file upstream PRs against cpython, the whole ecosystem will be extremely grateful for you helping us to keep things tidy!

P.S. I can imagine that this whole ffi cleanup will be quite a big PR. For ease of reviewing, and maybe also so that I and others can join in with the cleanup effort, I would suggest you open PRs with just a few files in it at a time; that way we can merge things as we go.

Again, thank you so much! ❤️

nw0 commented 3 years ago

A number of the comments read:

needs adjustment for Python 3.3 and 3.5

https://github.com/PyO3/pyo3/blob/3f093d9e59df0140f6c3a0abbb6acaa439687b7f/src/ffi/mod.rs#L143

As we don't support Python 3.5 any longer (let alone 3.3), shall I remove those? Asking as I'm not sure what it means.

kngwyu commented 3 years ago

I was thinking of removing those comments and appreciate your help. 👍🏼

jeet-parekh commented 3 years ago

I've been exploring the code under src/ffi, and also the CPython header files. I have a question.

I noticed that sometimes the #defines from the C header files are implemented in src/ffi, and sometimes they aren't. And sometimes if they are implemented, then they're not being used anywhere. The typedefs are mostly implemented. If not, they are commented as skipped.

My question is, what's the thought process behind what should and shouldn't be added to the FFI? I'm mostly confused by the #defines. The typedefs are pretty clear to me. I am a complete beginner to rust, so apologies if I may have missed something obvious.

davidhewitt commented 3 years ago

Great question. In general we need to include all the #defines from the C header files in src/ffi. As they're C macros they won't be exported in the libpython symbol table; instead we would to re-implement them.

The code in src/ffi is for the whole ecosystem, not just PyO3 - so even if the PyO3 crate doesn't use them, downstream crates might. (Once day I hope to have good test coverage of all of src/ffi, but I think that's a long way off.)

jeet-parekh commented 3 years ago

Thanks! I guess implementing the structs would be a good way to get started. And if they use any macros or other constant #defines, then I include that as well. The other macros that are not being used might be hard to test. So I'll skip those. Any thoughts?

davidhewitt commented 3 years ago

Sounds great! If you're looking for examples to refer back to, this series of PRs by @nw0 which has been some prior work in this area are excellent: https://github.com/PyO3/pyo3/pulls?q=is%3Apr+author%3Anw0+is%3Aclosed

jeet-parekh commented 3 years ago

Still exploring the FFI code slowly. I have a few more questions. Would appreciate some pointers.

  1. Some functions are implemented as unsafe. https://github.com/PyO3/pyo3/blob/b3659692cd9fb894ba2b79df252bdd9115179091/src/ffi/datetime.rs#L235-L237 And some are not unsafe. https://github.com/PyO3/pyo3/blob/b3659692cd9fb894ba2b79df252bdd9115179091/src/ffi/datetime.rs#L317-L324

    What's the thought process behind that? I have two guesses - (1) C macros are implemented as unsafe in rust, or (2) the functions which use pyobject are unsafe. Or is it something that I have completely missed?

  2. Why are these functions inside Option? Is it simply because they're function pointers? https://github.com/PyO3/pyo3/blob/b3659692cd9fb894ba2b79df252bdd9115179091/src/ffi/pymem.rs#L39-L47

davidhewitt commented 3 years ago

Some functions are implemented as unsafe. [...] And some are not unsafe.

So in general all of these ffi methods are unsafe to use, for two main reasons:

The functions you note in your second example are actually still unsafe, because they're inside an extern "C" block. The Rust compiler assumes all functions implemented in C are unsafe 😄

Why are these functions inside Option? Is it simply because they're function pointers?

Precisely - function pointers in C are nullable, however Rust's fn () is not. Note that this is quite important to get right - I've written bugs in the past where I forgot to wrap in Option. This allowed the optimizer to assume that the value could never be null, which led to some nasty UB!

deantvv commented 2 years ago

Here is something I wish I knew when I first started here and hope sharing here will help new contributor start more easily.

  1. For PyPy, you can check the link name with nm -D ${PYPY_INSTALL_BASE}/lib/libpypy3-c.so | grep 'Pyxxxxx' And add the link name corresponding to what you find in nm. It is usually something like PyPyxxxxx Example:
    nm -D $PYPY_INSTALL_BASE/lib/libpypy3-c.so | grep PyDict_New
    # 00000000018ef450 T PyPyDict_New
extern "C" {
    #[cfg_attr(PyPy, link_name = "PyPyDict_New")]
    pub fn PyDict_New() -> *mut PyObject;
}
  1. For macro in C, we usually rewrite this as a #[inline] function
  2. Only add #[cfg_attr(windows, link(name = "pythonXY"))] to pub static mut Pyxxxx_Type declaration
nw0 commented 2 years ago

I don't have time to make a PR right now, but wanted to note that last week some of the header files have moved in CPython again.

There are a few changes, but this one in particular makes some definitions "internal-only": https://github.com/python/cpython/pull/28957