earth-mover / icechunk

Open-source, cloud-native transactional tensor storage engine
https://icechunk.io
Apache License 2.0
291 stars 17 forks source link

Update to pyo3 0.22 #362

Closed kylebarron closed 3 weeks ago

kylebarron commented 3 weeks ago

For https://github.com/earth-mover/icechunk/issues/266.

Change list

mpiannucci commented 3 weeks ago

Thanks for this!!

but I'd suggest storing Rust error types in your Rust error enum, and then converting to PyErr as needed in the From impl.

Can you expand on how this is different from what we are doing now?

kylebarron commented 3 weeks ago

Can you expand on how this is different from what we are doing now?

In the case of KeyNotFound and PyValueError: https://github.com/earth-mover/icechunk/blob/5aa9abeae3bb22deffa99ab30aa085233678bc0c/icechunk-python/src/errors.rs#L18-L19 https://github.com/earth-mover/icechunk/blob/5aa9abeae3bb22deffa99ab30aa085233678bc0c/icechunk-python/src/errors.rs#L26-L27 the #[from] thiserror macro I believes expands to hold those error types within the variant.

This means that you're storing a Python object within your Rust enum. (Each of those two rust structs wrap Python objects, a PyAny).

I think it can make sense to have one catch-all PyError variant in your enum, so that if you're in a place where you think "I really want a Python IOError", you can create a PyIOError and store that in the PyIcechunkStoreError::PyError variant.

But in the case of KeyNotFound and PyValueError you're currently getting a sort of nested error description, where the error content is getting expanded into a string, but then all of your errors are exposed to Python as a ValueError. That seems to not be what you want because you create your own custom exception type but it doesn't look like Python code will ever see it.

mpiannucci commented 3 weeks ago

This makes sense.

I am going to merge this in and do a follow up PR with the python error handling updates.

Thank you for this!