Open steven-hubbard opened 10 months ago
[triage meeting] Drew looked in the linked bundle and noticed that time was spent in kv. Still sending to SQL Foundations since they own view creation.
The statement bundle is in this comment from the slack convo above: https://cockroachlabs.slack.com/archives/C01RX2G8LT1/p1706024952464799?thread_ts=1705432897.726909&cid=C01RX2G8LT1
Here is a screenshot of the slow part of the trace:
In this thread, the theory was that this is from Raft replication, which unfortunately is not well traced.
I will move this issue to the KV team, since to make progress, I think we need more tracing in that area.
The raft delay part is only 90ms in this trace, which doesn't seem too high without additional context. It involves at least 1 RTT and fsync, so can easily take that much.
The issue reports 2s latency, there seems to be no confirmation that most of this time is spent in replication. If it's 90ms out of 2s, then the slowness is somewhere else.
Sorry for the confusion. There are two different executions being discussed in this ticket.
For tracing we have #104035.
Can we get a more representative trace for a higher latency request? I am not convinced by this trace that the 2s problem would be in raft, unless there are some connection or disk issues in this cluster. I'll check debug.zip to see if raft is healthy.
@rafiss I think there are still unclear moments which SQL could help clarify.
Statement bundle of this CREATE VIEW
is consistently showing 200ms latency (we've just tried it another time with @steven-hubbard). However, the SQL command line consistently prints up to 2s for this statement. It's unclear where the other 1.8s are spent. How closely does the Time: 1.778s total
on the screenshot correspond to time measurements in the bundle? Is there any large portion of computation that is missing from the bundle?
Here's the line with verbose_timings enabled:
Time: 1.577294273s total (parse 50µs / plan 15.281ms / exec 27.173ms / schema change 1.431111s / other 103.474ms / network 205.273µs)
It looks like we wait for the schema change to finish (make the descriptor public?), and that isn't recorded in the server-side statistics.
The following thread has more details as well as a tsdump, statement bundle, and debug.zip: Slack Channel Link
When creating a view on the primary key of a table, the view takes almost 2s to complete. The table only has 7 rows and it seems like it should be significantly faster with a where clause specifying the two leading values of the primary key in the view creation. The files above were generated in a three node single region environment run locally on a macbook. The timing is similar in a multi-region cluster.
Jira issue: CRDB-35636