I'm going to include/address the following issues as a part of this:
@j82w Great questions --
> Why not make the default to load the in-memory stats first so the page loads fastest, and then pull persisted stats in the background? That way it's always the same workflow.
In a lot of our default time picker cases we'll already be fetching in-memory stats along with the flushed data. I think it's best not to always incur an additional request if the SQL stats system is performing alright. This feature is more about offering a reliable alternative while we stabilize SQL stats.
> Won't the exact time cause confusion, since different nodes will flush to the persisted table at different intervals, and some nodes might even get restarted? Could the date-time picker have an in-memory option?
I think labeling it just 'in memory' might be more confusing, since users won't know what the date range is. The flush interval is a cluster setting shared by all nodes, so my thinking was to surface that interval alongside the in-memory option (e.g. 'Last 10 Minutes').
WDYT?
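(For reference, the flush interval here is a single cluster setting; assuming the standard `sql.stats.flush.interval` setting, which may vary by version, it can be checked with:)

```sql
-- show how often each node flushes in-memory SQL stats to the system tables
show cluster setting sql.stats.flush.interval;
```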
> In a lot of our default time picker cases we'll already be fetching in-memory stats along with the flushed data. I think it's best not to always incur an additional request if the SQL stats system is performing alright. This feature is more about offering a reliable alternative while we stabilize SQL stats.
I was hoping this feature would also reduce the initial page load times for everyone, not just the customers with large datasets. If we want to focus on just getting the one scenario to work, that's fine.
> I think labeling it just 'in memory' might be more confusing, since users won't know what the date range is. The flush interval is a cluster setting shared by all nodes, so my thinking was to surface that interval alongside the in-memory option (e.g. 'Last 10 Minutes').
I'm trying to think of a way to convey that it's not all data from the last 10 minutes, because different nodes will flush the data at different times. If it just says 'Last 10 Minutes' we will likely get customer issues asking why they don't see a query that ran 2 minutes ago because the flush occurred 1 minute ago. @kevin-v-ngo and/or @dongniwang, do you have any suggestions?
Data point for system.statement_statistics, a table with 118k rows:
```sql
select * from system.statement_statistics
order by statistics->'execution_statistics'->>'cnt' limit 40;
```
```sql
(select * from system.statement_statistics
 order by statistics->'execution_statistics'->>'cnt' limit 40)
union
(select * from system.statement_statistics
 order by statistics->'execution_statistics'->'contentionTime'->>'mean' limit 40)
union
(select * from system.statement_statistics
 order by statistics->'execution_statistics'->'cpuSQLNanos'->>'mean' limit 40)
union
(select * from system.statement_statistics
 order by statistics->'execution_statistics'->'rowsRead'->>'mean' limit 40)
union
(select * from system.statement_statistics
 order by statistics->'execution_statistics'->'runLat'->>'mean' limit 40)
union
(select * from system.statement_statistics
 order by statistics->'execution_statistics'->'svcLat'->>'mean' limit 40);
```
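One caveat worth flagging on the queries above: `->>` returns a string, so the `order by` sorts lexicographically rather than numerically ('9' sorts after '10'). If numeric ordering was the intent, a cast fixes it; a minimal sketch:

```sql
-- cast the JSON string value to a float so the sort is numeric
select * from system.statement_statistics
order by (statistics->'execution_statistics'->>'cnt')::float desc
limit 40;
```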
Data point for crdb_internal.cluster_statement_statistics, a table with 100k rows:
Taking the above numbers into consideration, would a better solution be to:
This solution would allow users to select time frames and avoid the overhead of sending all the data back to the UI to do the computation. The trade-off is that each sort/filter will take longer.
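To make the idea concrete, a minimal sketch of a persisted-only query that sorts and limits in the database (the JSON path, the 1-hour window, and the 500-row limit are illustrative assumptions, not a final design):

```sql
-- return only the top N statements for the selected window,
-- sorted server-side instead of in the UI
select fingerprint_id, app_name, metadata, statistics
from system.statement_statistics
where aggregated_ts >= now() - interval '1 hour'
order by (statistics->'execution_statistics'->'svcLat'->>'mean')::float desc
limit 500;
```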
WDYT?
Adding on to the above -- posting from Slack for tracking.
I spoke with @j82w on Friday, who initially had some concerns about the in-memory fallback. After doing some testing of my own with lots of in-memory data (~40k unique fingerprints), I also now feel that the in-memory option may not be the reliable solution we want. In fact, I think we should do the opposite -- default to not reading from in-memory stats and query the system table only (with the new limit and order-by features to get interesting statements). When we have a lot of in-memory data (which I assume customers running into the loading issue often do, since the default time range is the past 1 hour), it seems to cause all sorts of problems --
Given that the virtual table generation and merging with in-memory stats seems to be the bottleneck in the problematic workloads (ones with high unique fingerprint counts), how does product feel about not including in-memory stats by default? Our flush interval is already set to 10 minutes, which is pretty frequent. One would then need to wait up to 10 minutes to see any data appear (after resetting). If it's important to surface data right after resetting / starting the cluster, one idea is: if we don't return anything from the system table, query from in-memory in the interim (we'd still need to improve our in-memory response time to service that reasonably). With these changes I think we could reliably show data as long as some workload data has been flushed to disk, which is easier to guarantee.
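A minimal sketch of the interim fallback check described above (the 1-hour window is illustrative; the real implementation would use the page's selected time range):

```sql
-- if the persisted table has nothing for the requested window,
-- the UI could fall back to the in-memory virtual table
select exists (
  select 1 from system.statement_statistics
  where aggregated_ts >= now() - interval '1 hour'
) as has_persisted_stats;
```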
The in-memory results have the following issues.
Here is a rough diagram explaining the process. Let me know if you have any suggestions or notice any errors.
```mermaid
sequenceDiagram
    title crdb_internal fanout logic
    participant user
    participant sql
    participant crdb_internal
    participant SQLStatusServer
    participant Local
    user->>sql: Request crdb_internal.cluster_statement_statistics
    sql->>crdb_internal: Generate the view
    crdb_internal->>SQLStatusServer: Requests all stats. No filters or limits are passed in.
    SQLStatusServer->>Local: No filters
    activate Local
    Local-->>SQLStatusServer: Return up to 100k
    deactivate Local
    loop All nodes in cluster. Up to 100 parallel
        SQLStatusServer->>Node2: No filters
        activate Node2
        SQLStatusServer->>Node3: No filters
        activate Node3
        SQLStatusServer->>Node4: No filters
        activate Node4
        Node2-->>SQLStatusServer: Return up to 100k
        deactivate Node2
        Node4-->>SQLStatusServer: Return up to 100k
        deactivate Node4
        Node3-->>SQLStatusServer: Return up to 100k
        deactivate Node3
    end
    SQLStatusServer-->>crdb_internal: Return up to 100k
    crdb_internal->>crdb_internal: Generate the view with 100k rows
    crdb_internal->>sql: Return 100k row view
    sql->>sql: Filter/sort rows
    sql->>user: Results
```
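For anyone wanting to reproduce the fanout cost locally, a rough sketch: a plain aggregate forces full generation of the virtual table, so the query time reflects the whole fanout above (treat the number as a rough signal only):

```sql
-- forces the entire cluster fanout and merge before any filtering
select count(*) from crdb_internal.cluster_statement_statistics;
```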
On our stmt and txn fingerprint pages, we have seen increasingly long loading times when requesting persisted SQL stats for clusters with large amounts of data. This can make the page unusable, as the minimum time range we allow to be requested is the past 1 hour.
We need to provide a fallback so that users can see at least some of their live workload.
Proposed experience:
cc @kevin-v-ngo , @dongniwang
EDIT - NEW APPROACH
We will read only from the system table (see the discussion above), as the in-memory stats are actually the source of many of our response time issues. Limit and sort options will be added as part of this plan.
Jira issue: CRDB-25071