PyO3 / pyo3

Rust bindings for the Python interpreter
https://pyo3.rs
Other
11.47k stars 694 forks source link

Added `ToPyObject` and `IntoPy<PyObject>` impls for `PyBackedStr`. #4205

Closed JRRudy1 closed 1 month ago

JRRudy1 commented 1 month ago

I have a PyBackedStr that I extracted from a PyObject, and would like to pass it back to Python through another method call. However, PyBackedStr currently does not implement the conversion traits necessary to do so without cloning. This PR simply adds these conversion methods, which is trivial since the PyBackedStr is already storing a pointer to the Python string as a PyObject.

codspeed-hq[bot] commented 1 month ago

CodSpeed Performance Report

Merging #4205 will not alter performance

Comparing JRRudy1:pybackedstr-intopy (43fb3f7) with main (cb34737)

Summary

✅ 67 untouched benchmarks

JRRudy1 commented 1 month ago

I actually just noticed that the conversion is not as trivial when cfg(not(any(Py_3_10, not(Py_LIMITED_API)))) since the backing PyString gets converted to a PyBytes when the PyBackedStr is created; this isn't necessarily a problem, but means the object passed back to Python would be bytes instead of str as expected. I just added attributes to limit these impls to the trivial case, but maybe this is not the best solution. I see three options:

  1. limit the impls as I just did
  2. allow the impls unconditionally, accepting that the Python object will be bytes in some cases
  3. flesh out the impls to handle the limited-API case by converting back to PyString; I would need some advice on the correct way to do this.

Regarding the CI, do you have any tips on why it would be failing and what I can do to fix that? The change seems too small to actually effect anything

davidhewitt commented 1 month ago

https://github.com/PyO3/pyo3/actions/runs/9238604511/job/25416822459?pr=4205#step:7:23

It looks like cargo fmt will rearrange the imports.

davidhewitt commented 1 month ago

3. flesh out the impls to handle the limited-API case by converting back to PyString; I would need some advice on the correct way to do this.

Something like PyString::new(py, self) probably works, because self as a PyBackedStr will deref to a str.

JRRudy1 commented 1 month ago

Something like PyString::new(py, self) probably works, because self as a PyBackedStr will deref to a str.

I guess I was overthinking it because I was expecting to need some ffi wizardry like PyUnicode_FromString or something haha. But is my understanding correct that the limited-API case will involve copying the whole string, while the other case simply copies a reference to the original string?

I updated the impls and will look into adding some tests.

JRRudy1 commented 1 month ago

I went ahead and added ToPyObject and IntoPy<PyObject> impls for PyBackedBytes as well, but it is a bit unclear whether the PyBackedBytesStorage::Rust variant should covert to PyBytes or PyByteArray. Mirroring the FromPyObject impl suggests PyByteArray so this is what I went with, but now I'm questioning whether it would be more intuitive to consistently return PyBytes in both cases. The data is getting copied either way so I don't see a technical reason to prefer one over the other.

What do you think @davidhewitt?

I also added a couple tests, but let me know if you have any suggestions to improve them.

JRRudy1 commented 1 month ago

I changed my mind and updated the PyBackedBytes conversion impls to return a Python bytes object regardless of the backing storage type. My logic is that the storage type is an implementation detail, and the user shouldn't have to know what it is in order to predict the resulting Python type. The Rust type is called "PyBackedBytes", so intuitively the Python type should be bytes as well. The only weird aspect is that the round-trip conversion would be PyByteArray -> PyBackedBytes -> PyBytes instead of returning to PyByteArray where it started, but I think that's a worthy tradeoff.

JRRudy1 commented 1 month ago

I think one of the runners crashed and the checks need to be restarted