There are latency issues with the Databases page that can cause timeouts and leave stats 'unavailable' for some time. This occurs on larger clusters with many nodes and ranges.
Today, customers work around these console scale issues by scraping our system tables into their downstream logging tools. We should improve the console so that it supports scaled workloads and customers no longer have to build this workaround.
This issue tracks potential improvements to address slow requests on the Databases pages. Note that this is not an exhaustive list, and an investigation is ongoing into which potential improvements should be prioritized.
[ ] When a request hits its timeout, we lose the entire response. Currently we batch many queries for database info into a single SQL-over-HTTP request. We should move each slower query into its own request so that the fast-returning database details are still displayed quickly.
[ ] Requesting span stats can be expensive when there are many ranges, and the response containing a database's span stats is underutilized: the only fields used for the overview page are `approximate_disk_bytes` and `range_count`. The range count can be obtained with the alternative query `SELECT count(*) FROM [SHOW RANGES FOR DATABASE 'db']`.
[ ] In addition to the above, `approximate_disk_bytes` is not part of the MVCC stats calculation. When requesting span stats, we can provide a flag to skip that calculation, so a request that only needs `approximate_disk_bytes` for the overview page avoids computing MVCC stats. This improvement requires changing the builtin in use so that it accepts the flag and plumbs it down to the server requests.
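
As a rough illustration of the first item, the sketch below issues each query as its own request with its own timeout, so a slow query is reported as unavailable without discarding the results that returned quickly. All names here (`withTimeout`, `loadIndependently`, the `Settled` type) are hypothetical and are not taken from the DB Console code; the actual request layer may look quite different.

```typescript
// Hypothetical sketch: none of these names come from the DB Console codebase.

type Settled<T> =
  | { status: "ok"; value: T }
  | { status: "unavailable"; reason: string };

// Rejects if `work` does not settle within `ms` milliseconds.
function withTimeout<T>(work: Promise<T>, ms: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(
      () => reject(new Error(`timed out after ${ms}ms`)),
      ms,
    );
    work.then(
      (value) => {
        clearTimeout(timer);
        resolve(value);
      },
      (err) => {
        clearTimeout(timer);
        reject(err);
      },
    );
  });
}

// Runs every named query as a separate request. A timeout or failure in one
// query marks only that entry as unavailable; the others keep their results.
async function loadIndependently<T>(
  queries: Record<string, () => Promise<T>>,
  timeoutMs: number,
): Promise<Record<string, Settled<T>>> {
  const names = Object.keys(queries);
  const results = await Promise.allSettled(
    names.map((name) => withTimeout(queries[name](), timeoutMs)),
  );
  const out: Record<string, Settled<T>> = {};
  results.forEach((result, i) => {
    out[names[i]] =
      result.status === "fulfilled"
        ? { status: "ok", value: result.value }
        : { status: "unavailable", reason: String(result.reason) };
  });
  return out;
}
```

With this shape, the page could render the fast database details as soon as they arrive and show 'unavailable' only for the stats whose request timed out. Note that the timeout here only stops waiting on the client side; it does not cancel the underlying request.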
Jira issue: CRDB-36457
Epic CRDB-37558