Open belmeopmenieuwesim opened 12 months ago
Hello, I am Blathers. I am here to help you get the issue triaged.
Hoot - a bug! Though bugs are the bane of my existence, rest assured the wretched thing will get the best of care here.
I was unable to automatically find someone to ping.
If we have not gotten back to your issue within a few business days, you can try the following:
:owl: Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.
Hi @belmeopmenieuwesim, sorry for the delay in getting back to you on this issue. If this still reproduces, could you please share statement bundles of the query with and without the WHERE
clause using EXPLAIN ANALYZE (DEBUG)
? If you don't feel comfortable attaching it to this issue, you can share it with me securely via https://upload.cockroachlabs.com/u/harding.
I see from what you've shared that most of the time in the query with the WHERE
clause is spent in KV during the scan, but the statement bundle would have a trace that could help us figure out where the time is being spent.
Also, as a side note, to take advantage of locality optimized search, it's best if the query contains a LIMIT
clause. This isn't very relevant to the issue you're seeing, since the table is empty and even with a limit specified this query would have to access every region to ensure that there were no rows that matched the search.
Describe the problem
My team is evaluatating CockroachDB and while benchmarking CockroachDB
v23.1.8
in a multi region deployment (eu-west, us-west and asia) I noticed that certainSELECT
queries against aREGIONAL BY ROW
table are not always running in parallel when the query spans multiple regions.Adding a
WHERE
clause to a select statement (.e.g.SELECT * FROM users WHERE password = 'test';
) causes a 2.5x performance degradation latency wise. So while the max latency in my database cluster is around186ms
, a SELECT statement with a WHERE clause will take around485ms
to complete.Meanwhile a simple
SELECT * FROM users;
statement completes properly around186ms
.According to the docs, it should also run in parallel for REGIONAL BY ROW tables: https://www.cockroachlabs.com/docs/stable/cost-based-optimizer#locality-optimized-search-in-multi-region-clusters
To Reproduce
Given the following table:
And assuming the following latency table:
When executing the
SELECT * FROM users;
query against any node, it always finishes within183ms
. Looks good, everything happened nicely in parallel:But, when you execute
SELECT * FROM users WHERE password = 'test';
(note the added WHERE clause), the query will always take around485ms
to finish. Clearly not running in parallel in all regions:Adding such
WHERE
clause causes a performance degradation of almost 2.5x! It seems to me this is a bug. It should just fetch all data in parallel matching the WHERE clause from all regions then join them together on the gateway node. If it did that, it should have finished around183ms
as well.The bug does not seem to appear when solely using an unique index in the WHERE clause. If I would for example query
SELECT * FROM users WHERE username = 'user';
then it takes around180ms
as well. Proving the parallelization works fine in that case:Expected behavior I would have expected that a simple
SELECT
statement that has aWHERE
clause on it, would also execute in parallel on all remote regions, so that it finishes around the time that equals one network round trip to the furthest remote node (~186ms
).Extra Information
Environment:
cockroachdb/cockroach
Docker Imagecockroach sql
Add any other context about the problem here.
Jira issue: CRDB-31312