apache / kyuubi

Apache Kyuubi is a distributed and multi-tenant gateway to provide serverless SQL on data warehouses and lakehouses.
https://kyuubi.apache.org/
Apache License 2.0

Publish benchmarks #607

Open MrPowers opened 3 years ago

MrPowers commented 3 years ago

1. Describe the feature

Create benchmarks to quantify how Kyuubi can speed up queries (these might be really hard to set up, and I understand this might not be realistic).

2. Motivation

If we can show that Kyuubi speeds up workflows, then it'll be easier to get users!

3. Describe the solution

Set up a "typical environment" of the kind organizations have. We'll want this environment to be realistic and to demonstrate the challenges it can't handle (permissions, tuning, load balancing, etc.).

Set up a similar environment with Kyuubi to demonstrate how this framework solves core business problems facing the organization. The benchmarks should demonstrate that speed improvements are one of the benefits.

4. Additional context

This request might not be realistic because spinning up entire environments is costly and takes a lot of time.

That said, I think there would be a big benefit for user adoption in having these set up. If I can create videos and show people how Kyuubi solves a lot of their organizational challenges, then they'll have a compelling reason to start using the framework.

yaooqinn commented 3 years ago

This is a nice suggestion. At NetEase, we are in the middle of helping one of our internal customers port their legacy Hive queries (5k to 7k) directly to Kyuubi. The current numbers show that they save 50% of compute resources and 70% of total time cost at the same time. If converted to performance gains, that would be about a 6 to 8 times improvement. I hope it can be written up as a success story ASAP.
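One possible reading of the "6 to 8 times" figure, assuming the two savings multiply into a single resource-time cost (the comment does not say how the number was derived, so this is only a sketch), looks like this in Python. The function name `combined_gain` is hypothetical and used purely for illustration.

```python
def combined_gain(resource_saving: float, time_saving: float) -> float:
    """Improvement factor under the ASSUMPTION that cost = resources * time.

    Both arguments are fractions saved, e.g. 0.50 for 50%.
    """
    remaining_resources = 1.0 - resource_saving  # share of resources still used
    remaining_time = 1.0 - time_saving           # share of time still spent
    return 1.0 / (remaining_resources * remaining_time)


# Figures quoted in the comment: 50% of compute resources, 70% of total time.
gain = combined_gain(0.50, 0.70)
print(f"{gain:.2f}x")
```

Under that assumption the quoted savings work out to roughly 6.7 times, which falls inside the stated 6 to 8 range.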

It would also be nice to run a benchmark comparing Kyuubi, STS (Spark Thrift Server), and HiveServer2.

yaooqinn commented 3 years ago

also cc @turboFei @pan3793 @zwangsheng @ulysses-you

MrPowers commented 3 years ago

@yaooqinn - those are some juicy performance gains!

Send me a link to that case study after you write it up and I'll take a look.