Open MartinKolbAtWork opened 2 months ago
I'll take a look at what an API should look like to expose and register an external handler through Python.
@MartinKolbAtWork it seems that this might not be possible, or at least quite complex. I can't find any docs in PyO3 on how to achieve this.
Do you have the SAP BTP Object store published somewhere?
Hi @ion-elgreco, thanks for looking into this. The integration for SAP BTP is currently used internally at SAP and might be published later, however currently I cannot share the code. It’s actually using “SAP Data Lake Files” (https://help.sap.com/docs/hana-cloud-data-lake/user-guide-for-data-lake-files/understanding-data-lake-files) as object storage, which is accessible via SAP’s Business Technology Platform (BTP, https://www.sap.com/products/technology-platform.html).
I also investigated a possible solution. It’s especially challenging because the shared library packaged with the deltalake-python wheel would need to be binary compatible with the shared library packaged with the “add-on”. Ensuring binary compatibility between these libraries (e.g. with respect to the Rust version and the version of the deltalake crate they were built with) would be hard to achieve. An approach that “tunnels” all calls between the two Rust libraries over Python could mitigate the binary compatibility issues but would probably suffer from poor performance.
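To make the “tunneling” idea concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: `PythonObjectStore`, `InMemoryStore`, and `copy_object` are hypothetical names, not part of deltalake or any add-on. The core library would see nothing but a Python-level interface, so the two native libraries never need to be binary compatible, but every read and write crosses the Python boundary.

```python
from abc import ABC, abstractmethod


class PythonObjectStore(ABC):
    """Hypothetical interface an add-on would implement at the Python level."""

    @abstractmethod
    def get(self, path: str) -> bytes: ...

    @abstractmethod
    def put(self, path: str, data: bytes) -> None: ...


class InMemoryStore(PythonObjectStore):
    """Stand-in for an add-on store; a real one would call into its own native library."""

    def __init__(self) -> None:
        self._objects = {}
        self.calls = 0  # number of calls that crossed the Python boundary

    def get(self, path: str) -> bytes:
        self.calls += 1
        return self._objects[path]

    def put(self, path: str, data: bytes) -> None:
        self.calls += 1
        self._objects[path] = bytes(data)


def copy_object(store: PythonObjectStore, src: str, dst: str) -> None:
    """Plays the role of the core library: it only knows the Python interface."""
    store.put(dst, store.get(src))


store = InMemoryStore()
store.put("table/part-0.parquet", b"payload")
copy_object(store, "table/part-0.parquet", "backup/part-0.parquet")
```

The `calls` counter makes the cost visible: each object operation is one round trip through Python, which is where the expected performance penalty would come from.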
Hello, I'm from the OpenDAL community, which aims to provide storage access to various services in multiple languages. Perhaps we can build something extensible to allow us to integrate with more storage services easily.
Tools we have now:
object_store support via opendal.
@Xuanwo hey, I wasn't aware that OpenDAL has an object_store impl, that's useful!
Any help on this is much appreciated :)
Description
The Python binding has a hard-coded list of deltalake handlers that are registered: https://github.com/delta-io/delta-rs/blob/fcd62ab10be9545eab296f03ae0e4324fc4be6cb/python/src/lib.rs#L2035-L2039
We have a Rust crate available that adds support for another object store (SAP BTP), but we did not find a way to register its handlers with the existing Python binding. The shared library that comes with deltalake-python does not expose an entry point for adding new object stores. We ended up forking delta-rs and adding the registration call as another line in the list above. But we don’t think using a fork to add an additional object store is an appropriate approach.
Have we missed something here? Shouldn’t there be a way to register stores beyond the 5 existing ones?
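For illustration, the kind of entry point being asked for could look roughly like the sketch below. This is not an existing deltalake API: `register_handler`, `resolve_store`, `ExampleStore`, and the `examplestore://` scheme are all hypothetical, and the sketch only shows the registry pattern (a scheme-to-factory map that can be extended at runtime), not how it would be wired into the Rust side.

```python
from typing import Callable, Dict
from urllib.parse import urlparse

# Hypothetical registry: URL scheme -> factory that builds a store for a table URI.
_FACTORIES: Dict[str, Callable[[str], object]] = {}


def register_handler(scheme: str, factory: Callable[[str], object]) -> None:
    """Register a store factory for a URL scheme (hypothetical entry point)."""
    if scheme in _FACTORIES:
        raise ValueError(f"a handler for {scheme!r} is already registered")
    _FACTORIES[scheme] = factory


def resolve_store(table_uri: str) -> object:
    """Look up the factory for the URI's scheme and build the store."""
    scheme = urlparse(table_uri).scheme
    try:
        factory = _FACTORIES[scheme]
    except KeyError:
        raise ValueError(f"no handler registered for scheme {scheme!r}") from None
    return factory(table_uri)


class ExampleStore:
    """Placeholder for an externally provided object store."""

    def __init__(self, uri: str) -> None:
        self.uri = uri


# An add-on package would make this call once at import time.
register_handler("examplestore", ExampleStore)
store = resolve_store("examplestore://container/path/to/table")
```

With something along these lines, adding a store would be a registration call in the add-on rather than an extra line in a fork.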