Closed vishwamartur closed 3 days ago
Thanks for the PR. We will be reviewing it shortly.
Hi @vishwamartur,
Thanks for the PR. I am checking with our engineering team to see who will be the best person to look into the implementation details. What I am expecting
I will arrange some blog/video around the ADBC/Arrow support when the PR is merged.
Hope it makes sense, and feel free to let us know your thoughts.
Hi @jovezhong,
Thank you for the detailed feedback and suggestions!
To start, we’d like to focus on fully implementing and stabilizing the ADBC driver support in C++. Once the C++ implementation is complete and meets the required performance and functionality benchmarks (e.g., large result set handling, streaming SQL), we can then plan to extend support to other languages like Go, Java, Python, and R.
This phased approach will allow us to ensure a solid foundation before expanding to other ecosystems. Let me know if this sounds good, or if you have any immediate priorities that require parallel development in other languages.
Thanks! @vishwamartur
Sounds good. Let's make the C++ driver the first feature-complete ADBC driver, then expand to more languages. In priority order: C++ > Java > Python > Go. You don't need to work on an R adapter. Ideally we contribute the ADBC driver for Timeplus, similar to https://arrow.apache.org/adbc/current/driver/postgresql.html
Looking at this, it doesn't appear to have much to do with ADBC in anything but name. Does Timeplus already support Arrow FlightSQL? If so, then nothing needs to be done, as all of the ADBC bindings would be able to use the FlightSQL driver to connect and query data from any one of multiple languages (Go, C++, C, Python, R, Rust, Java, etc.)
If Timeplus doesn't already support FlightSQL, then you need to implement the ADBC C interface to create a driver, ideally as a shared object library that can be separately distributed as a client rather than built into Timeplus directly. I can help with that if needed.
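To make the "implement the ADBC C interface" suggestion a bit more concrete, here is a toy sketch of the general pattern: a driver exposes one init entry point that fills a table of functions, and a driver manager calls the driver only through that table. It is written in Python for brevity and is not the real C ABI. `ToyDriverTable`, `toy_timeplus_driver_init`, `ToyDriverManager`, and the `toy://localhost` URI are all hypothetical names made up for illustration; the actual interface is the one declared in ADBC's `adbc.h`.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


# Hypothetical stand-in for a driver's table of entry points.
# The real ADBC C interface uses a C struct of function pointers instead.
@dataclass
class ToyDriverTable:
    connect: Optional[Callable[[str], dict]] = None
    execute: Optional[Callable[[dict, str], List[dict]]] = None


def toy_timeplus_driver_init(table: ToyDriverTable) -> int:
    """Hypothetical driver entry point: fill the table, return 0 on success."""

    def connect(uri: str) -> dict:
        # A real driver would open a network connection here.
        return {"uri": uri, "open": True}

    def execute(conn: dict, query: str) -> List[dict]:
        # A real driver would send the query and hand back Arrow data.
        if not conn.get("open"):
            raise RuntimeError("connection is closed")
        return [{"query": query, "rows": 0}]

    table.connect = connect
    table.execute = execute
    return 0


class ToyDriverManager:
    """Loads drivers by name and calls them only through their table."""

    def __init__(self) -> None:
        self._drivers: Dict[str, ToyDriverTable] = {}

    def load(self, name: str, init: Callable[[ToyDriverTable], int]) -> None:
        table = ToyDriverTable()
        status = init(table)
        if status != 0 or table.connect is None or table.execute is None:
            raise RuntimeError(f"driver {name!r} failed to initialize")
        self._drivers[name] = table

    def query(self, name: str, uri: str, sql: str) -> List[dict]:
        table = self._drivers[name]
        conn = table.connect(uri)
        return table.execute(conn, sql)


manager = ToyDriverManager()
manager.load("toy_timeplus", toy_timeplus_driver_init)
result = manager.query("toy_timeplus", "toy://localhost", "SELECT 1")
```

The point of the indirection is that the manager never needs to know which database sits behind the table, which is what lets a driver ship as a separately distributed client library.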
Thanks for the comment, Matt. Today the Timeplus Proton server doesn't have FlightSQL built in. I'll leave further discussion to you and @vishwamartur.
To be clear, we want ADBC support more than FlightSQL.
I just want to clarify: @vishwamartur is the goal here to have an ADBC driver to connect to Timeplus with? Or for Timeplus to connect to other sources via ADBC? That will affect what is expected to be implemented here.
@jovezhong just to be clear, if Timeplus exposes a Flight SQL server for connectivity, you would get ADBC support for free via the Flight SQL ADBC (and ODBC/JDBC) driver that already exists.
That said, I believe you are already built on ClickHouse, so it shouldn't be too difficult to create an ADBC driver which can use the ClickHouse protocol for connecting and retrieving Arrow-formatted data, right?
Hi @zeroshade,
Thanks for the clarification! The goal is to create an ADBC driver for clients to connect to Timeplus. Leveraging the ClickHouse protocol to retrieve Arrow-formatted data makes sense, given our architecture.
If you have any specific suggestions for implementing the ADBC C interface or designing the driver as a shared library, I’d greatly appreciate it.
Looking forward to your thoughts!
Best,
Vishwa
@vishwamartur I might have missed something, but looking at the PR, I don't see how this lets someone create an ADBC driver to connect to Timeplus Proton. Could you help me understand how this works, please?
@zeroshade yes, Proton also has Arrow format support, as ClickHouse does, but there are gaps because the implementation is not up to date with the ClickHouse repo at the moment. This might or might not affect implementing an ADBC driver (I don't know much about implementing an ADBC driver). I also don't know whether the ADBC interface already supports streaming. Since Proton is a streaming data engine, this is one thing to pay attention to when implementing a database driver for it.
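On the streaming question, ADBC returns query results as an Arrow C stream (`ArrowArrayStream`), which the client pulls one record batch at a time through `get_next`, so a result does not have to be materialized up front. Whether that maps cleanly onto an unbounded Proton streaming query is not settled in this thread. The sketch below only illustrates the pull-based shape, using a Python generator as a stand-in; `toy_streaming_query` and `take_batches` are hypothetical names, and no real ADBC or Arrow call is used.

```python
import itertools
from typing import Dict, Iterator, List

# Column name -> values; a stand-in for an Arrow record batch.
Batch = Dict[str, List[int]]


def toy_streaming_query(batch_size: int) -> Iterator[Batch]:
    """Hypothetical unbounded query: yields one columnar batch per pull."""
    next_id = 0
    while True:  # never ends on its own, like a streaming SQL query
        ids = list(range(next_id, next_id + batch_size))
        yield {"id": ids, "value": [i * 10 for i in ids]}
        next_id += batch_size


def take_batches(stream: Iterator[Batch], limit: int) -> List[Batch]:
    """Pull at most `limit` batches; the producer only runs when asked."""
    return list(itertools.islice(stream, limit))


batches = take_batches(toy_streaming_query(batch_size=2), limit=3)
```

Because the consumer drives the loop, the client decides when to stop reading, which is the property a driver for a streaming engine would need to preserve.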
@zeroshade, could you please suggest any changes?
@zliang-min, if I’m mistaken, I would appreciate your guidance and suggestions for improvements. I’ll do my best to implement them.
@vishwamartur to achieve the goal of being able to connect to Timeplus Proton via an ADBC driver, there are two options:
The second option offers the widest availability and makes it easier to integrate with the existing ecosystem. The first option is probably easier, but it has big limitations (it limits which languages can be used, and it makes it hard to reuse what is already there in the ecosystem).
Hopefully this makes sense.
It actually doesn't limit the languages as much as you'd expect. For example, the current ADBC FlightSQL driver is implemented in Go and distributed as a C shared object that can be loaded by ADBC driver managers. If you implement the Go ADBC interface, then it's straightforward to use the existing SDK to create a distributable driver that can be loaded by any ADBC driver manager.
I would argue that ADBC Driver belongs in the same box as SDK, JDBC/ODBC and Data/BI Connectors. An ADBC driver is just another driver, similar in concept to a JDBC or ODBC driver (but columnar and Arrow-native instead of row-oriented).
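To illustrate the "columnar instead of row-oriented" distinction, here is a small sketch that holds the same data both ways. The rows and column names are made-up sample data, and plain Python lists stand in for Arrow's typed column buffers; `rows_to_columns` is a hypothetical helper, not part of any driver API.

```python
from typing import Dict, List, Sequence, Tuple

# Row-oriented: one tuple per row, the shape a JDBC/ODBC-style cursor hands back.
rows: List[Tuple[int, str, float]] = [
    (1, "sensor_a", 20.0),
    (2, "sensor_b", 21.0),
    (3, "sensor_a", 19.0),
]


def rows_to_columns(rows: Sequence[tuple], names: Sequence[str]) -> Dict[str, list]:
    """Transpose row tuples into one list per column."""
    columns: Dict[str, list] = {name: [] for name in names}
    for row in rows:
        for name, value in zip(names, row):
            columns[name].append(value)
    return columns


# Columnar: one sequence per column, the shape an Arrow-native driver hands back.
columns = rows_to_columns(rows, ["id", "device", "temperature"])

# An aggregate over one column touches only that column's values.
temperatures = columns["temperature"]
mean_temperature = sum(temperatures) / len(temperatures)
```

With a row-oriented driver, a client that wants columnar data has to do a transpose like this itself; an Arrow-native driver avoids that step when the server can already produce Arrow data.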
I’ve made the changes in this pull request. Could you please review them and share your suggestions? I’m happy to make any necessary updates.
Related to #276
Add support for ADBC (Arrow Database Connectivity) driver for Arrow Flight SQL.
ADBC Driver Implementation
src/Processors/Formats/Impl/ADBCDriver.cpp to implement the ADBC driver for Arrow Flight SQL.
src/Processors/Formats/Impl/ADBCDriver.h to declare the ADBC driver class and its methods.
Configuration
src/configure_config.cmake to include ADBC driver support and necessary libraries.
Documentation
README.md to include information about ADBC driver support, examples, and usage instructions.
Testing
tests/ADBCDriverTest.cpp to implement tests for the ADBC driver, including connection and query execution tests.
/claim #276