
sql: prevent full table scans #49617

Closed awoods187 closed 4 years ago

awoods187 commented 4 years ago

Some customers have asked for a way to limit the "blast radius" of accidentally running expensive queries.

One proposal is to add these queries to the slow query log. Another is a setting to disable unconstrained index scans. Note that these expensive queries are not necessarily super slow.

Recently, we observed a customer frustrated by a query that finished in 2s yet was bringing the cluster to its knees. Sometimes the problem is caused by a missing index; sometimes it is caused by mis-optimization.
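To make the shape of the problem concrete, here is a minimal sketch (the table and timings are hypothetical; the spans: ALL annotation matches the plan output convention mentioned later in this thread):

```sql
CREATE TABLE t (id INT PRIMARY KEY, x INT, payload STRING);  -- note: no index on x

-- Can finish in ~2s on a mid-sized table (so it may never trip a latency
-- threshold), yet it reads and filters every row in the table:
SELECT * FROM t WHERE x = 42;

EXPLAIN SELECT * FROM t WHERE x = 42;  -- the scan node reports spans: ALL
```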

arulajmani commented 4 years ago

Looking for more clarification on the scope of this issue. Are we looking to implement anything in addition to logging such queries to the slow query log?

@andreimatei pointed me to this slack thread that talks about the nuances of what is meant by a full table scan. In my implementation for #45616, I don't discriminate between a full index scan and a scan on the PK. I guess that is okay when we're simply logging queries to the slow query log, but what happens if we add a setting to completely disable it?

If we do want to add the setting and call out the nuances above, I'd like to get some more information about the setting. Do we want a cluster wide setting or a session specific setting? Is it opt in or opt out?

arulajmani commented 4 years ago

Currently working on logging full table/index scans to the slow query log. I have some open questions:

cc @piyush-singh @solongordon

piyush-singh commented 4 years ago

Do we want to log both full table scans (primary index scans) and secondary index scans?

I think including both would be a good idea. Drawing on MySQL for inspiration - they explicitly call out that they log any queries that return all rows regardless of whether a secondary index was scanned or not:

If you enable this variable with the slow query log enabled, queries that are expected to retrieve all rows are logged. See Section 5.4.5, “The Slow Query Log”. This option does not necessarily mean that no index is used. For example, a query that uses a full index scan uses an index but would be logged because the index would not limit the number of rows.

Should these be logged under different labels or does it not matter?

Logging these under different labels makes sense to me - it may help a user understand that they have an index that was used by the query but is incorrectly designed.
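A sketch of the two shapes under discussion (the schema is hypothetical, and the labels are placeholders rather than shipped log output):

```sql
CREATE TABLE orders (
    id INT PRIMARY KEY,
    customer INT,
    total DECIMAL,
    INDEX customer_idx (customer)
);

SELECT * FROM orders;         -- FULL_TABLE_SCAN: reads all of the primary index
SELECT customer FROM orders;  -- FULL_INDEX_SCAN: likely planned as a scan of all
                              -- of customer_idx; an index is used, but it does
                              -- not limit the number of rows
```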

Would it make sense to replace this column with a "slow query log reason"?

I like this approach - the three reasons make it extremely clear why something was logged. As far as replacing the log trigger - I think @jordanlewis added the slow query log, he might have insight into why the log trigger column was included.

awoods187 commented 4 years ago

I'm a +1 on what Piyush said above about the slow query log. We certainly want the column to have the reasons you mentioned above but we should check if the existing column has some other purpose. If it does, we can append this column to the end of the current log design.

I probably should have split this into two issues, but we also want to make a cluster setting and session variable that prevents the cluster from running a query that scans all rows (full table scan + secondary index scan).

If we do want to add the setting and call out the nuances above, I'd like to get some more information about the setting. Do we want a cluster wide setting or a session specific setting? Is it opt in or opt out?

We want both a cluster-wide setting and a session-specific setting. It is opt-in. We only want people who are expressly opting into this behavior to receive it because full table scans are not always bad. We need to have guidance around this.
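A sketch of what the two opt-in knobs could look like; the names below are placeholders for whatever we end up shipping, not existing syntax:

```sql
-- Hypothetical cluster-wide default:
SET CLUSTER SETTING sql.defaults.disallow_full_table_scans.enabled = true;

-- Hypothetical per-session override:
SET disallow_full_table_scans = true;

-- With the guard enabled, a query that would scan all rows is rejected
-- rather than executed:
SELECT * FROM t;  -- error: query would perform a full table scan
```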

andreimatei commented 4 years ago

Are you guys really sure that the slow query log is what you want here? Cause these queries are not necessarily slow, and they have nothing to do with sql.log.slow_query.latency_threshold. If I have a table with one row that I keep reading, do I need to see that in this log over and over again? Or think of the queues people implement with a table of "pending" items to process. These cases can easily drown out the useful info in the slow queries log - the actually slow queries. Planning a full table scan is a mostly-static property of the query. Do I really need to see the same query over and over again in the log, or would I be better served by some sort of a fingerprint+count/recency?

More narrowly, I don't see a discussion here about non-NULL constraints on scans. Does such a scan qualify as full? According to #51441, I guess it doesn't. But shouldn't it?

arulajmani commented 4 years ago

If I have a table with one row that I keep reading, do I need to see that in this log over and over again?

It's worth noting this is an opt-in setting that also logs the service time of the query -- having a log entry doesn't imply the query is necessarily bad; it just provides observability into queries that could be bad. Additionally, there's a new "slow query log reason" which identifies why a query was logged in the slow query log. This allows users to filter for queries that are over the latency threshold and contain an unconstrained scan if they want.

Planning a full table scan is a mostly-static property of the query. Do I really need to see the same query over and over again in the log or would I be better served by some sort of a fingerprint+count/recency?

Ack.

Does such a scan qualify as full? According to #51441, I guess it doesn't. But shouldn't it?

Just to make sure I understand this correctly, are you saying we should decouple the notion of a full table scan that DistSQL uses to recommend whether the plan should be distributed from what qualifies as a full table scan for the purposes of logging? We could check if no index constraint is supplied or if the index constraint results in an unconstrained scan. If I understand this correctly, I'm on board.

piyush-singh commented 4 years ago

If I have a table with one row that I keep reading, do I need to see that in this log over and over again? Or think of the queues people implement with a table of "pending" items to process. These cases can easily drown out the useful info in the slow queries log - the actually slow queries.

From #45616:

A related setting to set a floor on the number of rows scanned would help with avoiding tables with very few records where indexes are not necessary.

MySQL has a similar configuration option for their slow query log. See the log_queries_not_using_indexes parameter here. They also offer an option to rate limit the collection of queries without indexes to avoid runaway slow query log file growth.

I think it would be worth considering both of those as part of adding this capability.
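For reference, the MySQL knobs alluded to above; these are documented MySQL system variables (the values shown are illustrative):

```sql
SET GLOBAL slow_query_log = 'ON';
SET GLOBAL log_queries_not_using_indexes = 'ON';

-- Rate-limit index-less queries to at most N log entries per minute:
SET GLOBAL log_throttle_queries_not_using_indexes = 10;

-- Floor on rows examined before a statement is eligible for the slow log,
-- which keeps tiny tables from generating noise:
SET GLOBAL min_examined_row_limit = 1000;
```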

Do I really need to see the same query over and over again in the log or would I be better served by some sort of a fingerprint+count/recency?

I think the common way to set this up is to have some query digest tool that will summarize these statistics while allowing users to dive into specific executions if they need to. See MySQL's offering and this discussion on slack.

andreimatei commented 4 years ago

It's worth noting this is an opt-in setting that also logs the service time of the query -- having a log entry doesn't imply the query is necessarily bad; it just provides observability into queries that could be bad. Additionally, there's a new "slow query log reason" which identifies why a query was logged in the slow query log. This allows users to filter for queries that are over the latency threshold and contain an unconstrained scan if they want.

Having it be opt-in is definitely better than opt-out. But I just feel like that log is not the answer here. When you say that "users... filter for queries", you mean grep or what? This is a text file, so talking about "filtering" is being generous.

Just to make sure I understand this correctly, are you saying we should decouple the notion of a full table scan that dist sql uses to recommend if the plan should be distributed from what qualifies as a full table scan for the purposes of logging? We could check if no index constraint is supplied or if the index constraint results in an unconstrained scan. If I understand this correctly I'm on board.

I'm saying that even when the scan is technically constrained (e.g. skipping over the nulls), we probably still want to consider some queries as "full table scans" for the purposes of this feature. But now that you draw attention to it - yeah, I definitely don't think that the DistSQL decision to distribute should be tangled with considering something to be a "full scan" or not.

I think the common way to set this up is to have some query digest tool that will summarize these statistics while allowing users to dive into specific executions if they need to. See MySQL's offering and this discussion on slack.

Well, are we building that tool? Without it, is dumping random stuff into a log file that currently has a well-defined role in catching dynamic badness (as opposed to static badness) the right thing to do?

I would think about the desire to highlight "full table scans" in the same realm as, for example, the desire to suggest index creation to users. We wouldn't suggest index creation by printing a log line 1000 times per second, would we? I'd think about exposing this information starting from query fingerprints and execution statistics, not starting from a query log. AFAICS from #51441, I don't think I'd even be able to tell which table was fully scanned by the queries.

arulajmani commented 4 years ago

"users... filter for queries", you mean grep or what? This is a text file, so talking about "filtering" is being generous.

Users can grep for something like: {LATENCY_THRESHOLD FULL_TABLE_SCAN}

AFAICS from #51441, I don't think I'd even be able to tell which table was fully scanned by the queries.

I guess this is a question of how we want users to use this log. Is it sufficient to simply log the query and expect users to use EXPLAIN (possibly with opt verbose) to troubleshoot? Or do we want to log the table/secondary index which was fully scanned? What if a query scans multiple tables and secondary indexes? I guess this is part of a bigger question of how log entries should be formatted in the slow query log keeping in mind there are other options we may want to add here.
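For the first workflow, the manual follow-up on a logged query would look something like this (the query is illustrative):

```sql
-- Plain EXPLAIN shows the chosen plan and which index each scan uses:
EXPLAIN SELECT * FROM t WHERE t.x > 2;

-- EXPLAIN (OPT, VERBOSE) additionally prints the optimizer expression tree
-- with costs and statistics, which helps show why a scan was unconstrained:
EXPLAIN (OPT, VERBOSE) SELECT * FROM t WHERE t.x > 2;
```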

we probably still want to consider some queries as "full table scans" for the purposes of this feature.

I think this is worth brainstorming -- I'm thinking we should define "full table scans" as "unconstrained scans". For example, if you're doing SELECT * FROM t WHERE t.x > 2, this should be logged as well because we're doing an unconstrained scan (IIUC this encapsulates Andrei's point about NULLs). To that end, would it be a good idea to call this logging for unconstrained scans instead of full table scans? That makes the intention clearer.

andreimatei commented 4 years ago

For example, if you're doing SELECT * FROM t WHERE t.x > 2, this should be logged as well because we're doing an unconstrained scan

If this is unconstrained, what is constrained scan? I would have expected this to be "constrained". What about SELECT * FROM t WHERE t.x > 2 AND t.x < 100 ?

arulajmani commented 4 years ago

Correct me if I'm wrong here (I hadn't really looked at this stuff before yesterday), but I was hoping to use: https://github.com/cockroachdb/cockroach/blob/7e858f87301240563a78a44727a920ec66a6a5d7/pkg/sql/opt/constraint/constraint.go#L74 so t.x > 2 would qualify as "unconstrained" even though there is a constraint present. On the other hand, t.x > 2 AND t.x < 100 would qualify as constrained.

arulajmani commented 4 years ago

Edit: I'm wrong about that. Unconstrained means both the start and end of the span are unset (not that just one of start/end is set). But for the purposes of this issue, should we be logging queries that scan constraint spans with either a start or end key set, but not both?
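Concretely, assuming an index on t.x, the three cases look like this (the span notation is schematic):

```sql
SELECT * FROM t;                          -- span [ - ]        both ends unset: unconstrained
SELECT * FROM t WHERE x > 2;              -- span (/2 - ]      only a start key: half-open
SELECT * FROM t WHERE x > 2 AND x < 100;  -- span (/2 - /100)  both keys set: constrained
```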

andreimatei commented 4 years ago

I think it depends... SELECT ... WHERE x IS NOT NULL is different from SELECT ... WHERE scheduled_action_time < now(). I think we should be logging the first but not the second. Modulo the fact that I think we shouldn't "log" anything. The reason why I think of the non-NULL filters in particular is that they're inserted automatically a lot in the case of inner joins - so you don't want to give these queries a pass just because they happened to get a meaningless constraint.
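A sketch of the inner-join case (the schema is hypothetical): the join condition lets the optimizer derive an IS NOT NULL filter on the join column, so the scan picks up a "constraint" that excludes almost nothing:

```sql
CREATE TABLE t (id INT PRIMARY KEY, x INT, INDEX (x));
CREATE TABLE u (y INT PRIMARY KEY);

-- The join implies t.x IS NOT NULL, which constrains the scan of the index
-- on x to non-NULL keys only: technically constrained, effectively full.
SELECT * FROM t INNER JOIN u ON t.x = u.y;
```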

With the risk of derailing this change (which is not the worst outcome :P) I think this deserves an RFC.

solongordon commented 4 years ago

I think Andrei brings up a good point about log spam. Two thoughts:

piyush-singh commented 4 years ago

I like Solon's suggestion - it does not change what is already logged, it surfaces potential causes for poor performance, and it is still exposing information based on the user's threshold for "slow".

We could also indicate in crdb_internal.node_statement_statistics and the Web UI's statements page whether the last execution of the statement used a full table scan.

This would be handy - we are planning to expose some basic notifications to users in the Admin UI for 20.2 to direct their attention to possible problems. Being able to link to a page that summarizes all queries resulting in full table scans would be useful. It's worth noting that we are already flagging full table scans in sampled plans in the statement details page. We sample plans per fingerprint every 5 mins by default and flag any scan nodes with SPANS = ALL. If we were only flagging whether the most recent query resulted in a full table scan, would we expect flapping of that notification?
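If this information is surfaced through statement statistics rather than the log, the consumer side could be a query like the one below; note that the full_scan column is an assumed schema addition, not something that exists today:

```sql
-- Hypothetical: list fingerprints whose last sampled plan contained a full scan.
SELECT application_name, key AS fingerprint, count
FROM crdb_internal.node_statement_statistics
WHERE full_scan = true  -- assumed new column
ORDER BY count DESC;
```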

arulajmani commented 4 years ago

Commenting here to capture the discussion we had offline about this. The plan is to demo how the slow query log looks in my current implementation (from #51441) and get feedback on what workflows various customers want:

andreimatei commented 4 years ago

MySQL can give you the rows returned / rows processed ratio, which captures the amount of "wasted" work. Consider that too. Not sure how it deals with aggregations.
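One way MySQL exposes those numbers is its statement digest table, where the ratio per fingerprint is a simple query (column names are from performance_schema's documented schema):

```sql
SELECT DIGEST_TEXT,
       SUM_ROWS_EXAMINED / GREATEST(SUM_ROWS_SENT, 1) AS examined_per_sent
FROM performance_schema.events_statements_summary_by_digest
ORDER BY examined_per_sent DESC
LIMIT 10;
```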

bdarnell commented 4 years ago

Prior issues on this subject: #41340 #45616 #41341 #20060 #13969 (several of them filed by @awoods187). Can we consolidate?