Open aljazerzen opened 8 months ago
This is also needed because right now it doesn't make sense to make any performance optimizations, since I have no way of verifying that they are actually faster.
There is something similar here: https://github.com/pola-rs/tpch
Ideally, I would be able to benchmark "getting X amount of data from a data store Y into memory", for a few different amounts of data X and for all supported data stores Y.
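As a rough illustration of that shape, here is a minimal sketch of a harness that loops over amounts of data X and over loaders, timing each combination. It assumes an in-memory SQLite database as a stand-in for a real data store Y. All names here (`make_store`, `load_fetchall`, `load_batched`, `bench`) are hypothetical and not part of this project; a real benchmark would swap in loaders for this connector and the libraries it is compared against.

```python
import sqlite3
import time
from typing import Callable, Dict, List, Tuple

# Hypothetical stand-in for "data store Y": an in-memory SQLite table.
def make_store(rows: int) -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (id INTEGER, value REAL, label TEXT)")
    conn.executemany(
        "INSERT INTO t VALUES (?, ?, ?)",
        ((i, i * 0.5, f"row{i}") for i in range(rows)),
    )
    conn.commit()
    return conn

# Two hypothetical loaders; each returns the number of rows it loaded.
def load_fetchall(conn: sqlite3.Connection) -> int:
    return len(conn.execute("SELECT * FROM t").fetchall())

def load_batched(conn: sqlite3.Connection, batch_size: int = 1000) -> int:
    cur = conn.execute("SELECT * FROM t")
    total = 0
    while True:
        batch = cur.fetchmany(batch_size)
        if not batch:
            break
        total += len(batch)
    return total

Loader = Callable[[sqlite3.Connection], int]

def bench(
    loaders: Dict[str, Loader], row_counts: List[int], repeats: int = 3
) -> List[Tuple[str, int, float]]:
    """Return (loader name, row count, best wall-clock seconds) per combination."""
    results = []
    for rows in row_counts:
        conn = make_store(rows)
        for name, loader in loaders.items():
            best = float("inf")
            for _ in range(repeats):
                start = time.perf_counter()
                loaded = loader(conn)
                best = min(best, time.perf_counter() - start)
                # Sanity check: a loader that drops rows is not a fair comparison.
                assert loaded == rows
            results.append((name, rows, best))
        conn.close()
    return results

if __name__ == "__main__":
    loaders = {"fetchall": load_fetchall, "batched": load_batched}
    for name, rows, seconds in bench(loaders, [1_000, 10_000, 100_000]):
        print(f"{name:>10} {rows:>8} rows  {seconds * 1000:8.2f} ms")
```

Taking the best of several repeats is one common way to reduce noise; a more careful setup would probably also report variance and measure peak memory.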
ATM I have no idea how performant this connector is. Surely it is slower than using plain connections to the database and not converting to Arrow at all.
But how does it compare to `pandas.read_sql`, `polars.read_database`, connectorx, ADBC, dplyr?

Ideally, I would reuse the benchmarks from the connector-x project, but I'm not sure how portable they are.