fivetran / benchmark

Benchmark data warehouses under Fivetran-like conditions

Examine performance under concurrency #2

Open russellpierce opened 6 years ago

russellpierce commented 6 years ago

Realistic scenarios frequently involve multiple analysts using the same database at the same time. Examine performance with 6+ queries running at once. Under Redshift's default options this will likely cause issues (http://docs.aws.amazon.com/redshift/latest/dg/cm-c-defining-query-queues.html) that BigQuery shouldn't experience and that Snowflake should experience to a lesser extent.

georgewfraser commented 6 years ago

I expect Redshift's performance to be pretty linear under concurrency. That's what I've seen in other warehouses, and it makes sense given how MPP query planners work. But we should quantitatively evaluate the linearity of each warehouse and report it.
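
One way to put a number on "linearity" is sketched below. It is a minimal illustration, not part of this repo: `run_query` is a hypothetical zero-argument callable that executes one benchmark query against whichever warehouse is under test and blocks until it finishes, and the concurrency levels are arbitrary. For each level n it reports T(n) / (n * T(1)), where T(n) is the wall-clock time to finish n copies of the query started together.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable


def time_concurrent(run_query: Callable[[], None], n: int) -> float:
    """Return wall-clock seconds to finish n copies of run_query started together."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=n) as pool:
        futures = [pool.submit(run_query) for _ in range(n)]
        for future in futures:
            future.result()  # re-raise any error from the query
    return time.perf_counter() - start


def linearity_report(
    run_query: Callable[[], None],
    levels: Iterable[int] = (1, 2, 4, 6, 8),
) -> Dict[int, float]:
    """Map each concurrency level n to T(n) / (n * T(1)).

    run_query is a hypothetical stand-in for a real warehouse call.
    """
    baseline = time_concurrent(run_query, 1)
    return {n: time_concurrent(run_query, n) / (n * baseline) for n in levels}
```

A ratio near 1.0 means n concurrent queries take about n times as long as one (linear). A ratio well below 1.0 means the warehouse absorbs the extra load, and a ratio above 1.0 means it degrades worse than linearly, for example because queries are queueing. A single run is noisy, so in practice each level would need several repetitions.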

russellpierce commented 6 years ago

I think you'll be horrifyingly surprised when it comes to Redshift. Maybe it's the dedicated storage resources per shard; it does seem particularly worse on dense storage.

georgewfraser commented 4 years ago

In addition to concurrent readers, it would be good to have concurrent writers, representing an ETL process that is appending, deleting, and updating while you are querying.
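
A rough sketch of such a mixed workload follows. It assumes two hypothetical zero-argument callables, `read_op` (one analyst query) and `write_op` (one ETL append, delete, or update), each blocking until the warehouse responds. Neither exists in this repo, and the thread counts and duration are placeholders.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Tuple


def timed_loop(op: Callable[[], None], stop: threading.Event) -> List[float]:
    """Call op repeatedly until stop is set; return the latency of each call."""
    latencies = []
    while not stop.is_set():
        start = time.perf_counter()
        op()
        latencies.append(time.perf_counter() - start)
    return latencies


def run_mixed(
    read_op: Callable[[], None],
    write_op: Callable[[], None],
    readers: int,
    writers: int,
    seconds: float,
) -> Tuple[List[float], List[float]]:
    """Run reader and writer loops side by side for a fixed duration.

    read_op and write_op are hypothetical stand-ins for real warehouse calls.
    Returns (read latencies, write latencies) in seconds.
    """
    stop = threading.Event()
    with ThreadPoolExecutor(max_workers=readers + writers) as pool:
        read_futures = [pool.submit(timed_loop, read_op, stop) for _ in range(readers)]
        write_futures = [pool.submit(timed_loop, write_op, stop) for _ in range(writers)]
        time.sleep(seconds)
        stop.set()  # each loop finishes its in-flight call, then exits
        reads = [lat for f in read_futures for lat in f.result()]
        writes = [lat for f in write_futures for lat in f.result()]
    return reads, writes
```

Comparing read latencies from a run with writers set to 0 against a run with writers greater than 0 would show how much the ETL load slows down queries.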