cloudspannerecosystem / spanner-cli

Interactive command line tool for Cloud Spanner
Apache License 2.0
227 stars 29 forks source link

Ability to set deadlines on long running operations #133

Open nielm opened 2 years ago

nielm commented 2 years ago

A useful feature would be to be able to set shorter (or longer!) deadlines on potentially long running operations, specifically SELECT but also INSERT WHERE SELECT, UPDATE and COMMIT.

For example, the default SELECT deadline (ExecuteStreamingSQL) is 3600s or 1hr. This means that if a user executes a SELECT with a full table scan on a large table, it could continue running for up to an hour without the database being aware that, the user has attempted to cancel it by killing the Spanner-CLI process. One can imagine a scenario where a user gets impatient, kills the client process and tries again several times this loading the DB without realising.

(This is unlike other DB engines which use a TCP connection to the CLI client, so that the server can detect when a client has been killed, and terminate any queries),

Suggestion:

yfuruyama commented 2 years ago

Thanks @nielm for raising this issue.

I think the ideal solution for the raised issue would be handling SIGINT in spanner-cli properly and cancel in-flight request if user enters Ctrl-C during running the query. Currently spanner-cli doesn't handle the SIGINT, so it suddenly stops the spanner-cli session when user enters Ctrl-C.

Does this solution make sense for you?

spanner-cli is an interactive tool and not supposed to be used for long-running query. If user wants to run the query that exceeds the default timeout (3,600s), I would suggest them to use gcloud spanner databases execute-sql with --timeout option.

nielm commented 2 years ago

There is a separate issue with long running queries from gcloud which a customer recently raised and I need to confirm further.

Agree that handling sigint to cancel in progress operations is also a good feature request, but imho it is orthogonal to this one.

Spanner-cli does have a non-interactive batch mode which does imply that it can be scripted or used in offline processes.

yfuruyama commented 2 years ago

Yeah! I agree that handling SIGINT and providing timeout option are orthogonal.

Maybe it would be better to provide command-level option for timeout and it applies to every statement in the interactive and non-interactive mode.

I think the statement-level timeout for the interactive mode is difficult to design due to the limited syntax, and even if it's properly designed, there would be little benefit for users.

nielm commented 2 years ago

Agree for command line specification for timeouts/deadlines - there is already an option for priority!