aljazerzen / connector_arrow

Apache Arrow database client for many databases.

https://docs.rs/connector_arrow

MIT License

39 stars, 2 forks

perf: benchmarking #10

Open aljazerzen opened 8 months ago

aljazerzen commented 8 months ago

ATM I have no idea how performant this connector is. Surely it is slower than using plain connections to the database and not converting to Arrow at all.

But how does it compare to:

pandas.read_sql,
polars.read_database,
connectorx,
ADBC,
dplyr?

Ideally, I would reuse the benchmarks from the connector-x project, but I'm not sure how portable they are.

aljazerzen commented 8 months ago

This is also needed because right now it doesn't make sense to make any performance optimizations, since I have no way of verifying that they actually make anything faster.

aljazerzen commented 7 months ago

There is something similar here: https://github.com/pola-rs/tpch

Ideally, I would be able to benchmark "getting X amount of data from a data store Y into memory", for a few different amounts of data X and for all supported data stores Y.
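
As a rough illustration of that "X amount of data from data store Y" matrix, here is a minimal sketch in Python. It is not code from this repository or from the connector-x benchmarks. It assumes an in-memory SQLite database as a stand-in for the data store and uses only the plain `sqlite3` driver as the baseline loader. The names `make_store`, `load_plain`, `bench` and `run_matrix` are hypothetical. Other loaders (for example `pandas.read_sql` or this connector) could presumably be added to the `loaders` dict as callables that take a connection.

```python
import sqlite3
import time


def make_store(n_rows):
    """Create an in-memory SQLite table with n_rows rows (stand-in for data store Y)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (id INTEGER, value REAL, label TEXT)")
    conn.executemany(
        "INSERT INTO t VALUES (?, ?, ?)",
        ((i, i * 0.5, f"row{i}") for i in range(n_rows)),
    )
    conn.commit()
    return conn


def load_plain(conn):
    """Baseline loader: fetch all rows with the plain driver, no Arrow conversion."""
    return conn.execute("SELECT id, value, label FROM t").fetchall()


def bench(loader, conn, repeats=3):
    """Return (best wall-clock seconds, number of rows loaded) over `repeats` runs."""
    best = float("inf")
    loaded = 0
    for _ in range(repeats):
        start = time.perf_counter()
        rows = loader(conn)
        best = min(best, time.perf_counter() - start)
        loaded = len(rows)
    return best, loaded


def run_matrix(sizes, loaders):
    """Time every loader against every data size; returns a list of result tuples."""
    results = []
    for n_rows in sizes:
        conn = make_store(n_rows)
        for name, loader in loaders.items():
            seconds, loaded = bench(loader, conn)
            results.append((name, n_rows, loaded, seconds))
        conn.close()
    return results


if __name__ == "__main__":
    matrix = run_matrix([1_000, 10_000, 100_000], {"sqlite3-plain": load_plain})
    for name, n_rows, loaded, seconds in matrix:
        print(f"{name:>15} rows={n_rows:>7} loaded={loaded:>7} best={seconds:.4f}s")
```

Taking the best of a few repeats is one simple way to reduce noise. A real benchmark would likely also want warm-up runs, memory measurements, and data stores that involve actual network I/O.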