sql-machine-learning / sqlflow

Brings SQL and AI together.
https://sqlflow.org
Apache License 2.0

Summarize SQLFlow error message and what we want #2169

Open Yancey1989 opened 4 years ago

Yancey1989 commented 4 years ago

This issue groups the SQLFlow error examples from #2165 into three types.

Error Messages from Go Codebase

Some cases:

ERROR: runSQLProgram error: GetAlisaTask got a bad result, response={"returnCode":"11020295002","requestId":"0ab411db15880473516257764d0bda","returnMessage":"Failed to call Alisa: java.net.SocketException: Connection reset","returnErrorSolution":""}
runSQLProgram error: unsupported attribute model.no_exits

What we want:

Make these error messages more meaningful and give users suggestions, as in the following examples:

database error: connection reset, please retry or create an issue at: http://github.com/sql-machine-learning/sqlflow/issues
attribute check error: unsupported attribute "model.no_exits",
for the allowed attributes of DNNClassifier please go to http://sqlflow.org/models/dnnclassifier
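
A minimal sketch of how a raw error could be rewritten into messages like the ones above. It is written in Python only for brevity (the codebase in question is Go), and the names `RULES` and `friendly_message` are hypothetical, not part of SQLFlow:

```python
import re

ISSUES_URL = "http://github.com/sql-machine-learning/sqlflow/issues"

# Hypothetical rewrite rules: (pattern matched against the raw error, message template).
# Named groups in the pattern are substituted into the template.
RULES = [
    (
        re.compile(r"Connection reset"),
        "database error: connection reset, please retry or create an issue at: " + ISSUES_URL,
    ),
    (
        re.compile(r"unsupported attribute (?P<attr>\S+)"),
        'attribute check error: unsupported attribute "{attr}"',
    ),
]


def friendly_message(raw):
    """Return the first matching friendly message, or the raw error unchanged."""
    for pattern, template in RULES:
        match = pattern.search(raw)
        if match:
            return template.format(**match.groupdict())
    return raw
```

Falling back to the raw error when no rule matches keeps unknown failures visible instead of hiding them behind a generic message.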

Error Messages from Generated Code

As in the case at https://github.com/sql-machine-learning/sqlflow/issues/2165#issuecomment-620582099

SQLFlow returns the generated code and the Python stack trace to users, which is not meaningful to them.

What we want:

  1. Do as much static checking as possible to avoid runtime errors in the submitter program.
  2. Translate the error into a friendly message and return it to users.
  3. If neither 1 nor 2 helps, record the backtrace on sqlflowserver and notify the SQLFlow developers by email or by filing an issue.
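
The three points above could look roughly like the following sketch. The allow-list contents, the logger name, and the functions `check_attributes` and `run_submitter` are assumptions made for illustration, not existing SQLFlow code:

```python
import logging
import traceback

logger = logging.getLogger("sqlflowserver")  # hypothetical logger name

# Hypothetical allow-list; a real one would be derived from the model definitions.
ALLOWED_ATTRIBUTES = {
    "DNNClassifier": {"model.hidden_units", "model.n_classes"},
}


def check_attributes(estimator, attributes):
    """Point 1: static check before any code is generated or run."""
    allowed = ALLOWED_ATTRIBUTES.get(estimator, set())
    return [
        f'unsupported attribute "{name}"'
        for name in attributes
        if name not in allowed
    ]


def run_submitter(program):
    """Points 2 and 3: return a short message to the user and keep the
    full backtrace in the server log for the developers."""
    try:
        program()
        return None
    except Exception as exc:
        logger.error("submitter program failed:\n%s", traceback.format_exc())
        return f"runtime error: {exc}; the full backtrace was recorded on the server"
```

The user only ever sees the one-line message, while the backtrace stays on the server where it can be attached to an email or an issue.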

Error Messages From Dependent Services

Some generated code submits a distributed job to a cluster, e.g. PAI or Kubernetes.

We should track the job status and return a meaningful error message to users.
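
As a rough sketch of that idea, the submitter could poll the job until it reaches a terminal state and map that state to a message. The status names, the `get_status` callback, and `wait_for_job` are assumptions for illustration; a real implementation would call the PAI or Kubernetes API instead:

```python
import time

# Hypothetical terminal states and the message returned for each
# (None means the job succeeded and there is nothing to report).
TERMINAL_MESSAGES = {
    "Succeeded": None,
    "Failed": "job error: the distributed job failed on the cluster, please check the job logs",
}


def wait_for_job(get_status, interval=0.0, max_polls=10):
    """Poll get_status() until a terminal state is seen or max_polls is reached."""
    for _ in range(max_polls):
        status = get_status()
        if status in TERMINAL_MESSAGES:
            return TERMINAL_MESSAGES[status]
        time.sleep(interval)
    return "job error: timed out while waiting for the job status"
```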

typhoonzero commented 4 years ago

So we don't need to define a list of error types? It seems that if we return simple and clear error messages to users, the error response document is no longer needed.

wangkuiyi commented 4 years ago

@typhoonzero I agree that error types and codes don't seem to be the key to fixing the error messaging problem.