Open bbeaudreault opened 7 years ago
It may make sense to specifically mark errors from mysql. Even if the last exception above is clearer, people still would come to us not knowing if it's a mysql error or a vitess error.
@harshit-gangal our error handling has improved significantly since this issue was created. Is there something left for us to do before closing it?
We still do not format the way it is shown above. We can keep this open.
The below turned into a lot of words and blocks of code. What i'm looking for in this issue is feedback on the potential changes listed near the bottom. Also, the code blocks below are not line-wrapped due to the formatting of markdown but typically I see them line-wrapped into a big block of text. So keep that in mind in assessing the readability.
For the past couple months I've been intermittently chipping away at removing invalid or
UNKNOWN
error codes returned by vitess, which are less than helpful for users and alerting. At this point that happens very rarely, and when it does I'll fix on a one-off basis.Another problem to fix, however, is the error messages themselves. At Hubspot we have 150+ engineers running apps that interact with vitess. When an error happens, they get exceptions like this, which are very hard to parse and filled with vitess "junk" (in their mind):
or
or
These errors leave an engineer wondering -- why is my code hitting a legacy error? what's the actual issue here? what's the difference between legacy code and code?
I'd like to look at making these errors more user friendly. I dug into the JDBC driver, and it just calls
TextFormat.printToString
on the RPCError (which is a protobuf). This is what determines the format and order of the legacy_code, message, and code fields. I can make things a little more friendly by writing a better formatter which moves these around. I.E.:Just ditching the confusing legacy code makes things slightly better, but it's still not great. There's still a redundant
rpc error: code = Aborted
, the useless (to them) vtgate address, a few cryptic things (to an end-user) inexecInsertUnsharded
andqa_iad-741647200
, the lack of newlines, and non-ideal ordering of fields.Unfortunately the rest of the problem comes from the
message
field, which is constructed somewhere between vttablet and vtgate and would be a pain to re-format in the JDBC due to its unstructured nature.Based on the errors i've seen I can imagine a few potential changes to vterrors and RPCError:
message
field untouched for mysql errors, so that end-users can see what they really care about. Potentially add amysqlError
field if necessary to ensure we don't touch it.trace
field to vterror, and a function for adding a trace to a given vterror. This could be used in places where we add something likeexecInsertUnsharded
or other strings that facilitate debugging more than identify an error. In the JDBC we could have the option to silence these unless a loglevel is chosen.vttablet
field to vterror for when an error comes from a vttablet, which would contain target and usedTablet.callerID
field to vterror.status.FromError(err).GetMessage()
instead offmt.Sprintf("%s", err)
. This would remove the redundantrpc error: code = Aborted
part of the message.vtgate
field to RPCError, rather than prefixing the message in the final milelegacy_code
(if possible) from RPCError. If not possible, I can hide in the JDBC.All of these changes together could result in a message like this:
We could also add a
query
field if we wanted rather than using theduring query: ...
format.Thoughts on these changes, or any ideas on other ways of making user-friendly error messages?