Open martin-sucha opened 3 years ago
Very good point @martin-sucha
The query is not available at the point this is printed.
We have two options:
The second has the benefit of avoiding log spam in case the application goes crazy, and also directly informing the application. The downside is that the application may choose to ignore the warning. Maybe we should do both (and allow disabling 1).
@haaawk does the Java driver (as an example) do anything with warnings returned from the database?
/cc @denesb @psarna
/cc @denesb @psarna
Scylla Rust Driver returns warnings along with other auxiliary information like the paging state and tracing uuid, but it's up to the user to explicitly check if any warnings were present. The warnings are represented as a list of strings, but it's the user's decision whether they want to add code that prints or reports them anywhere, so I assume they usually don't.
Maybe we should have a mode parallel to slow query tracing, where we trace those queries that have any warnings attached to them.
@haaawk does the Java driver (as an example) do anything with warnings returned from the database?
The driver logs the warning with WARN logging level unless it's disabled: https://github.com/scylladb/java-driver/blob/scylla-3.x/driver-core/src/main/java/com/datastax/driver/core/RequestHandler.java#L262
We have two options:
- return it via rpc to the coordinator, and print it there
- return it via rpc to the coordinator, attach a warning the the query result, and pass it to the driver
The second has the benefit of avoiding log spam in case the application goes crazy, and also directly informing the application. The downside is that the application may choose to ignore the warning. Maybe we should do both (and allow disabling 1).
There should definitely be a way to find out which user is causing the error messages on the server side (so the database administrator can do it). With multiple database users (that use different drivers) if I don't know which client is causing the issue then I don't know which team should I ask to check their application.
does the Java driver (as an example) do anything with warnings returned from the database?
For example, gocql allows to get the warning strings, but does not print them by default and the C++ driver seems to do the same.
Maybe we should have a mode parallel to slow query tracing, where we trace those queries that have any warnings attached to them.
This looks like it would be useful. While adding the logs as trace events could work even with probabilistic tracing (although in my case the log message is logged every 5-10 minutes, so I'm not sure what is the chance that it would be randomly traced), specifically tracing those queries would increase the chance that something useful will be stored in the traces.
/cc @denesb @psarna
Scylla Rust Driver returns warnings along with other auxiliary information like the paging state and tracing uuid, but it's up to the user to explicitly check if any warnings were present. The warnings are represented as a list of strings, but it's the user's decision whether they want to add code that prints or reports them anywhere, so I assume they usually don't.
@piodul corrected me, we do log warnings, but currently unconditionally: https://github.com/scylladb/scylla-rust-driver/blob/efb170182f5d1e0b0a467077c627513a7ee47d40/scylla/src/transport/connection.rs#L534-L536
Maybe we should have a mode parallel to slow query tracing, where we trace those queries that have any warnings attached to them.
By the time the warnings are attached, it's too late. The trace session is created before any warnings can be created.
We have two options:
- return it via rpc to the coordinator, and print it there
- return it via rpc to the coordinator, attach a warning the the query result, and pass it to the driver
The second has the benefit of avoiding log spam in case the application goes crazy, and also directly informing the application. The downside is that the application may choose to ignore the warning. Maybe we should do both (and allow disabling 1).
There should definitely be a way to find out which user is causing the error messages on the server side (so the database administrator can do it). With multiple database users (that use different drivers) if I don't know which client is causing the issue then I don't know which team should I ask to check their application.
does the Java driver (as an example) do anything with warnings returned from the database?
For example, gocql allows to get the warning strings, but does not print them by default and the C++ driver seems to do the same.
I think these defaults should be changed to "log warnings (with rate-limit), unless an API method was called to read the warnings". Otherwise, this feature will only be used in cqlsh.
I think these defaults should be changed to "log warnings (with rate-limit), unless an API method was called to read the warnings". Otherwise, this feature will only be used in cqlsh.
+1 We need to propagate more info to allow users to pin point an issue to a query. In addition to just username, the connection info (external IP, port), and a CQL statement will be useful To mitigate the extra info we can propagate the session "id" and correlate the logs elsewhere.
Those defaults are in the client driver, not Scylla. If the driver prints out the warnings, you immediately know what the session is. The driver also has access to the statement.
This is Scylla's bug tracker, to be used for reporting bugs only. If you have a question about Scylla, and not a bug, please ask it in our mailing-list at scylladb-dev@googlegroups.com or in our slack channel.
Installation details Scylla version (or git commit hash): 2021.1.2-0.20210620.7bb9428ae
Log messages should provide a way to link the logged message to the user / query that is related to the issue.
For example, I see the following message in the logs now:
but I don't know which query exceeds the limit. I have tried probabilistic tracing and slow query log, but can't positively say whether any of the queries there is linked to this log message as the log message does not appear in the trace events.
I propose to add username and trace session id / span id (if available) to the log messages. The query string would be nice to have as well (but I'm not sure if it won't cause excessive volume of logs).
There are also other possible alternatives to my proposal, for example all log messages could be logged into the trace session as events.