fluree / db

Fluree database library
https://fluree.github.io/db/
Other
337 stars 21 forks source link

Format error messages for queries that fail validation #312

Closed zonotope closed 9 months ago

zonotope commented 1 year ago

The query engine defined in #304 returns error responses containing the raw parse validation failures under the reasons key:

{:status 400
 :error :db/invalid-query
 :reasons #:clojure.spec.alpha{:problems ({:path [:where :map], :pred clojure.core/map?, :val "this is wrong too", :via [:fluree.db.query.fql.syntax/analytical-query :fluree.db.query.fql.syntax/where :fluree.db.query.fql.syntax/where-map], :in [:where 0]} {:path [:where :tuple :triple], :pred (clojure.core/fn [%] (clojure.core/or (clojure.core/nil? %) (clojure.core/sequential? %))), :val "this is wrong too", :via [:fluree.db.query.fql.syntax/analytical-query :fluree.db.query.fql.syntax/where :fluree.db.query.fql.syntax/where-tuple :fluree.db.query.fql.syntax/triple], :in [:where 0]} {:path [:where :tuple :binding], :pred clojure.core/coll?, :val "this is wrong too", :via [:fluree.db.query.fql.syntax/analytical-query :fluree.db.query.fql.syntax/where :fluree.db.query.fql.syntax/where-tuple], :in [:where 0]} {:path [:where :tuple :remote], :pred clojure.core/coll?, :val "this is wrong too", :via [:fluree.db.query.fql.syntax/analytical-query :fluree.db.query.fql.syntax/where :fluree.db.query.fql.syntax/where-tuple], :in [:where 0]})
                               :spec :fluree.db.query.fql.syntax/analytical-query
                               :value {:context {:ex "http://example.org/ns/"}, :select "invalid query", :where ["this is wrong too"], :opts {:sources {}, :max-fuel 1000000, :fuel nil}}}}

We should format these error messages to make them more clear, actionable, and user friendly.

~Blocked by #304~

zonotope commented 1 year ago

Since switching our validation implementation to malli from clojure.spec, we have more options for implementing this. A good low effort first step that probably gets us 90% of what we want could be enabling humanized error messages.

If humanized error messages prove not to be good enough, we can then implement custom error formatters that offer more flexibility.

aaj3f commented 1 year ago

Moving to 3.0.0-beta1 for improved DX / beta adoption (per discussion in internal slack)

cap10morgan commented 1 year ago

@zonotope & @aaj3f I don't see any remaining examples of passing through malli query validation error data that aren't already being humanized (work was done on this w/o referencing this card). Are there specific instances you all are aware of that still need to be handled?

mpoffald commented 1 year ago

@cap10morgan I think the example that caused us to resurrect this issue was providing an invalid query key. Like I think @aaj3f tried using whereFoobar instead of where and got:

{
    "value": {
        "query": {
            "select": [
                "?name"
            ],
            "whereFoobar": [
                [
                    "?s",
                    "schema:name",
                    "?name"
                ]
            ]
        },
        "ledger": "cookbook/base"
    },
    "type": "reitit.coercion/request-coercion",
    "coercion": "malli",
    "in": [
        "request",
        "body-params"
    ],
    "humanized": {
        "query": {
            "where": [
                "missing required key"
            ]
        }
    }
}
mpoffald commented 1 year ago

that looks like it is humanized though? So if it needs to be better than this, then we may need some more guidance on what that should look like

aaj3f commented 1 year ago

@mpoffald @cap10morgan if nothing else, I think we should return consistent error formats on 400s. It seems wrong to me that some errors (captured by the query resolution paths) always have two keys, error and message and then others (captured by malli) have a totally different error message shape, e.g.

Error #1

{
    "error": "db/invalid-query",
    "message": "Invalid predicate: schema:foobar"
}

Error #2

{
    "value": {
        "query": {
            "select": [
                "?name"
            ],
            "whereFoobar": [
                [
                    "?s",
                    "schema:name",
                    "?name"
                ]
            ]
        },
        "ledger": "cookbook/base"
    },
    "type": "reitit.coercion/request-coercion",
    "coercion": "malli",
    "in": [
        "request",
        "body-params"
    ],
    "humanized": {
        "query": {
            "where": [
                "missing required key"
            ]
        }
    }
}
aaj3f commented 1 year ago

Not only for readability (I think v3 users might scratch their heads at coercion: "malli") but for consistent error handling in client applications.

cap10morgan commented 1 year ago

@aaj3f On latest main HEAD a query w/ :whereFoobar produces the following error:

#error {
│  :cause "Invalid Query"
│  :data {:status 400, :error :db/invalid-query, :reasons {:where ["missing required key"], :select ["invalid type" "invalid type"], :whereFoobar ["invalid type" "invalid type"]}} ...}

I can add a :message key to that if we want, but it will probably just contain a repeat of the "Invalid Query" that we have under the :cause key (unless you or anyone else has a better idea of what to put there?).

zonotope commented 1 year ago

@aaj3f On latest main HEAD a query w/ :whereFoobar produces the following error:

#error {
│  :cause "Invalid Query"
│  :data {:status 400, :error :db/invalid-query, :reasons {:where ["missing required key"], :select ["invalid type" "invalid type"], :whereFoobar ["invalid type" "invalid type"]}} ...}

I can add a :message key to that if we want, but it will probably just contain a repeat of the "Invalid Query" that we have under the :cause key (unless you or anyone else has a better idea of what to put there?).

The malli "humanized" error messages probably aren't enough for a person using Fluree who didn't take part in implementing it. "Invalid query" for every possible error is also extremely frustrating for a user.

I think we should write code that takes malli error maps, filters out the irrelevant info (of which there's a lot), and turns the remaining relevant info into English.

So the message for this specific error should be something like "Supplied query is missing required where clause. Unrecognized query parameter 'whereFoobar'".

cap10morgan commented 1 year ago

There is a ton more work that could conceptually fall under this issue but would be a huge scope explosion for what this started out as. We should design, write up, and prioritize that additional work in future issues.

cap10morgan commented 1 year ago

This is going to be an uber-issue to cover the overall effort to improve these validation error messages. We will hopefully link to separate issues that cover various categories of errors (to better scope and parallelize the work) and when all of those are done, this can be closed.

cap10morgan commented 1 year ago

Sub-issues that fall under the scope of this (please edit this comment to add to the list and strikethrough completed sub-issues rather than adding additional comments):

mpoffald commented 12 months ago

I reviewed error messages from v2 vs v3, here's what I found regarding regressions that seemed to be introduced by the move to malli.

We will probably never be done improving error messages, so this can't be a truly exhaustive list, but I think by addressing these we can improve our overall approach to reworking/presenting malli's error messages.

Inscrutable "invalid type" errors are certainly a common theme here.

Some caveats:

Consistent error formats

In many cases, our error messages were fine, but the keys in the map aren't consistent with other errors. We should use error and message keys consistently.

Example:

{
    "coercion": "malli",
    "humanized": {
        "history": [
            "should be a string",
            "should be a keyword",
            "invalid type",
            "invalid type",
            "invalid type"
        ],
        "malli/error": [
            "Must supply either a :history or :commit-details key."
        ],
        "t": [
            "missing required key"
        ]
    },
    "in": [
        "request",
        "body-params"
    ],
    "type": "reitit.coercion/request-coercion",
    "value": {
        "from": "fluree/387028092977569",
        "history": null
    }
}

This issue is common to a lot of our error messages and we do this wrapping in one place, so I'll elide pointing this out in the rest of the list.

History queries

  1. Missing subject

Slightly different formatting issue, the useful explanation is buried in [:data :message :malli/error] and [:data :message :t].

@(fluree/history ledger {:history nil})

;;=>
#error
{:cause "History query not properly formatted. Provided {:history nil}"
 :data {:status 400, :message {:history ["should be a string" "should be a keyword" "invalid type" "invalid type" "invalid type"], :t ["missing required key"], :malli/error ["Must supply either a :history or :commit-details key."]}, :error :db/invalid-query}}

V2: https://github.com/fluree/db/blob/maintenance/v2/src/fluree/db/api.clj#L809

(throw (ex-info (str "Please specify an subject for which to search history. Provided: " history)
        {:status 400
         :error  :db/invalid-query}))
  1. t values out of range

This doesn't clearly explain what went wrong:

@(fluree/history ledger {:commit-details true :t {:from -1 :to 0}})

;;=>
#error
{:cause "History query not properly formatted. Provided {:commit-details true, :t {:from -1, :to 0}}"
                                         ;;This message needs to be improved, and the whole thing reformatted
 :data {:status 400, :message {:t {:from ["should be :latest" "should be at least 0" "should match regex"]}}, :error :db/invalid-query}
 :via
 [{:type clojure.lang.ExceptionInfo
   :message "History query not properly formatted. Provided {:commit-details true, :t {:from -1, :to 0}}"
   :data {:status 400, :message {:t {:from ["should be :latest" "should be at least 0" "should match regex"]}}, :error :db/invalid-query}
   :at [fluree.db.api.query$history$fn__60507$state_machine__7787__auto____60512$fn__60515$fn__60524 invoke "query.cljc" 113]}]}

AFAIK this is equivalent to block queries in V2: https://github.com/fluree/db/blob/maintenance/v2/src/fluree/db/api/query.cljc#L139

(when (> block-start db-block)
  (throw (ex-info (str "Start block is out of range for this ledger. Start block provided: " (pr-str block-start) ". Database block: " (pr-str db-block)) {:status 400 :error :db/invalid-query})))

Select clause

  1. Type error:

This doesn't tell you much about what went wrong, or how to do it correctly:

{:select [+]
 :where  '[[?s ?p ?o ]]}

 ;;=>
#error
{:cause "Invalid query: {:select [#function[clojure.core/+]], :where [[?s ?p ?o]], :opts {:issuer nil}} - select: unknown error should be a string should be a keyword unknown error unknown error invalid type"
 :data {:status 400, :error :db/invalid-query}
 :via
 [{:type clojure.lang.ExceptionInfo
   :message "Invalid query: {:select [#function[clojure.core/+]], :where [[?s ?p ?o]], :opts {:issuer nil}} - select: unknown error should be a string should be a keyword unknown error unknown error invalid type"
   :data {:status 400, :error :db/invalid-query}
   :at [fluree.db.query.fql.syntax$coerce_query invokeStatic "syntax.cljc" 212]}]}

V2: https://github.com/fluree/db/blob/maintenance/v2/src/fluree/db/query/analytical_parse.cljc#L282

(or (every? #(or (string? %) (map? %)) select-smt)
    (throw (ex-info (str "Invalid select statement. Every selection must be a string or map. Provided: " select-smt) {:status 400 :error :db/invalid-query})))

Where clause

Doesn't say which part of the query should have at most 1 element, extraneous "invalid type"s.

  1. Multiple entries in map clause
{:select ['?name '?email]
 :where  [['?s :type :ex/User]
      ['?s :schema/age '?age]
      ['?s :schema/name '?name]
      {:union [[['?s :ex/email '?email]]
           [['?s :schema/email '?email]]]
       :filter ["(> ?age 30)"]}]}
       ;;=>
#error
{:cause "Invalid query: {:select [?name ?email], :where [[?s :type :ex/User] [?s :schema/age ?age] [?s :schema/name ?name] {:union [[[?s :ex/email ?email]] [[?s :schema/email ?email]]], :filter [\"(> ?age 30)\"]}], :opts {:issuer nil}} - where:    [\"should have at most 1 elements\" \"invalid type\" \"invalid type\"]"
 :data {:status 400, :error :db/invalid-query} ,,,}

V2: https://github.com/fluree/db/blob/maintenance/v2/src/fluree/db/query/analytical_parse.cljc#L464

(throw (ex-info (str "Where clause maps can only have one key/val, provided: " map-clause)
            {:status 400 :error :db/invalid-query}))
  1. Unrecognized op in where clause map

The most useful part, [\"should be either :filter, :optional, :union or :bind\"], is hard to find.

Also "Invalid dispatch value" is not user-friendly information

{:select ['?name '?age]
 :where  [['?s :type :ex/User]
      ['?s :schema/age '?age]
      ['?s :schema/name '?name]
      {:foo "(> ?age 45)"}]}
      ;;=>
#error
{:cause "Invalid query: {:select [?name ?age], :where [[?s :type :ex/User] [?s :schema/age ?age] [?s :schema/name ?name] {:foo \"(> ?age 45)\"}], :opts {:issuer nil}} - where:    {:foo [\"should be either :filter, :optional, :union or :bind\"], :malli/error [\"invalid dispatch value\" \"invalid type\" \"invalid type\"]}"
 :data {:status 400, :error :db/invalid-query}
 :via
 [{:type clojure.lang.ExceptionInfo
   :message "Invalid query: {:select [?name ?age], :where [[?s :type :ex/User] [?s :schema/age ?age] [?s :schema/name ?name] {:foo \"(> ?age 45)\"}], :opts {:issuer nil}} - where:    {:foo [\"should be either :filter, :optional, :union or :bind\"], :malli/error [\"invalid dispatch value\" \"invalid type\" \"invalid type\"]}"
   :data {:status 400, :error :db/invalid-query}
   :at [fluree.db.query.fql.syntax$coerce_query invokeStatic "syntax.cljc" 212]}]
 ,,,}

V2: https://github.com/fluree/db/blob/maintenance/v2/src/fluree/db/query/analytical_parse.cljc#L501

(throw (ex-info (str "Invalid where clause, unsupported where clause operation: " clause-type)
        {:status 400 :error :db/invalid-query}))
  1. Where clause is not sequential:

Doesn't tell you what exactly is the wrong type, or what the type of that thing should be:

{:select '[?s ?o]
 :where  's}
 ;;=>
#error
{:cause "Invalid query: {:select [?s ?o], :where s, :opts {:issuer nil}} - where: invalid type"
 :data {:status 400, :error :db/invalid-query}
 :via
 [{:type clojure.lang.ExceptionInfo
   :message "Invalid query: {:select [?s ?o], :where s, :opts {:issuer nil}} - where: invalid type"
   :data {:status 400, :error :db/invalid-query}
   :at [fluree.db.query.fql.syntax$coerce_query invokeStatic "syntax.cljc" 212]}]
 ,,,}

V2: https://github.com/fluree/db/blob/maintenance/v2/src/fluree/db/query/analytical_parse.cljc#L685

(when-not (sequential? where)
    (throw (ex-info (str "Invalid where clause, must be a vector of tuples and/or maps: " where)
                    {:status 400 :error :db/invalid-query})))
  1. Invalid group-by clause

Doesn't tell you what a valid group-by looks like:

'{:select  [?name (count ?favNums)]
  :where   [[?s :schema/name ?name]
        [?s :ex/favNums ?favNums]]
  :group-by {}}
;;=>
#error {
       :cause "Invalid query: {:select [?name (count ?favNums)], :where [[?s :schema/name ?name] [?s :ex/favNums ?favNums]], :group-by {}, :opts {:issuer nil}} - group-by: unknown error invalid type"
       :data {:status 400, :error :db/invalid-query}
       :via
       [{:type clojure.lang.ExceptionInfo
         :message "Invalid query: {:select [?name (count ?favNums)], :where [[?s :schema/name ?name] [?s :ex/favNums ?favNums]], :group-by {}, :opts {:issuer nil}} - group-by: unknown error invalid type"
         :data {:status 400, :error :db/invalid-query}
         :at [fluree.db.query.fql.syntax$coerce_query invokeStatic "syntax.cljc" 212]}]
    ,,,
       }

V2: https://github.com/fluree/db/blob/maintenance/v2/src/fluree/db/query/fql.cljc#L332

(cond (vector? groupBy) [true (map symbol groupBy)]
      (string? groupBy) [false [(symbol groupBy)]]
      :else (throw (ex-info
             (str "Invalid groupBy clause, must be a string or vector. Provided: " groupBy)
  1. Invalid operation in order-by

The "should be a list" is misleading, and it doesn't tell you what the valid operations are:

{:select  ['?name '?favNums]
:where   [['?s :schema/name '?name]
      ['?s :schema/age '?age]
      ['?s :ex/favNums '?favNums]]
:orderBy ['(foo  ?favNums)]}
;;=>
#error {
       :cause "Invalid query: {:context {:schema \"http://schema.org/\", :ex \"http://example.org/ns/\"}, :select [?name ?favNums], :where [[?s :schema/name ?name] [?s :schema/age ?age] [?s :ex/favNums ?favNums]], :orderBy [(foo ?favNums)], :opts {:issuer nil}} - orderBy: unknown error should be a list"
       :data {:status 400, :error :db/invalid-query}
       :via
       [{:type clojure.lang.ExceptionInfo
         :message "Invalid query: {:context {:schema \"http://schema.org/\", :ex \"http://example.org/ns/\"}, :select [?name ?favNums], :where [[?s :schema/name ?name] [?s :schema/age ?age] [?s :ex/favNums ?favNums]], :orderBy [(foo ?favNums)], :opts {:issuer nil}} - orderBy: unknown error should be a list"
         :data {:status 400, :error :db/invalid-query}
         :at [fluree.db.query.fql.syntax$coerce_query invokeStatic "syntax.cljc" 212]}]
       :trace
       ,,,
       }

V2: https://github.com/fluree/db/blob/maintenance/v2/src/fluree/db/query/analytical_parse.cljc#L310

(when-let [orderBy (or (:orderBy opts) orderBy)]
  (if (or (string? orderBy)
      (and (vector? orderBy)
           (#{"DESC" "ASC"} (first orderBy))))
    (if (vector? orderBy) orderBy ["ASC" orderBy])
    (throw (ex-info (str "Invalid orderBy clause, must be variable or two-tuple formatted ['ASC' or 'DESC', var]. Provided: " orderBy)
            {:status 400
             :error  :db/invalid-query}))))
  1. Improperly formatted bind:

Needs more explanation of "invalid type":

{:select   [?firstLetterOfName ?name ?canVote]
 :where    [[?s :schema/age ?age]
            [?s :schema/name ?name]
            {:bind [?canVote           (>= ?age 18)]}]}
            ;;=>
#error {:cause "Invalid query: {:select [?firstLetterOfName ?name ?canVote], :where [[?s :schema/age ?age] [?s :schema/name ?name] {:bind [?canVote (>= ?age 18)]}], :opts {:issuer nil}} - where:   {:bind [\"invalid type\"], :malli/error [\"invalid type\" \"invalid type\"]}"
        :data {:status 400, :error :db/invalid-query}
        ,,,}

V2: https://github.com/fluree/db/blob/maintenance/v2/src/fluree/db/query/analytical_parse.cljc#L492

(throw (ex-info (str "Invalid where clause, 'bind' must be a map with binding vars as keys "
                     "and binding scalars, or aggregates, as values.")
                {:status 400 :error :db/invalid-query}))
  1. Filter clause type errors:

More "invalid type"s that don't give any useful information:

;;1.
{:select ['?name '?age]
 :where  [['?s :type :ex/User]
      ['?s :schema/age '?age]
      ['?s :schema/name '?name]
      {:filter "(> ?age 45)"}]}
      ;;=>
;;#error
{:cause "Invalid query: {:select [?name ?age], :where [[?s :type :ex/User] [?s :schema/age ?age] [?s :schema/name ?name] {:filter \"(> ?age 45)\"}], :opts {:issuer nil}} - where:    {:filter [\"invalid type\"], :malli/error [\"invalid type\" \"invalid type\"]}"
 :data {:status 400, :error :db/invalid-query}
 :via
 [{:type clojure.lang.ExceptionInfo
   :message "Invalid query: {:select [?name ?age], :where [[?s :type :ex/User] [?s :schema/age ?age] [?s :schema/name ?name] {:filter \"(> ?age 45)\"}], :opts {:issuer nil}} - where:    {:filter [\"invalid type\"], :malli/error [\"invalid type\" \"invalid type\"]}"
   :data {:status 400, :error :db/invalid-query}
   :at [fluree.db.query.fql.syntax$coerce_query invokeStatic "syntax.cljc" 212]}] ,,,}

;;2
{:select ['?name '?age]
 :where  [['?s :type :ex/User]
      ['?s :schema/age '?age]
      ['?s :schema/name '?name]
      {:filter :foo}]}
      ;;=>
#error
{:cause "Invalid query: {:select [?name ?age], :where [[?s :type :ex/User] [?s :schema/age ?age] [?s :schema/name ?name] {:filter :foo}], :opts {:issuer nil}} - where:    {:filter [\"invalid type\"], :malli/error [\"invalid type\" \"invalid type\"]}"
 :data {:status 400, :error :db/invalid-query}}

V2 examples:

https://github.com/fluree/db/blob/maintenance/v2/src/fluree/db/query/analytical.cljc#L762

(throw (ex-info (str "Filter must be enclosed in square brackets. Provided: " filters)
        {:status 400
         :error  :db/invalid-query}))

https://github.com/fluree/db/blob/maintenance/v2/src/fluree/db/query/analytical_parse.cljc#L426

(if-not (sequential? filter)
  (throw (ex-info (str "Filter clause must be a vector/array, provided: " filter)
          {:status 400 :error :db/invalid-query}))