simonw / datasette

An open source multi-tool for exploring and publishing data
https://datasette.io
Apache License 2.0
9.61k stars 690 forks source link

Figure out design for JSON errors (consider RFC 7807) #1875

Open simonw opened 2 years ago

simonw commented 2 years ago

https://datatracker.ietf.org/doc/draft-ietf-httpapi-rfc7807bis/ is a brand new standard.

Since I need a neat, predictable format for my JSON errors, maybe I should use this one?

simonw commented 2 years ago

I started a conversation about JSON error standards on Mastodon here: https://fedi.simonwillison.net/web/@simon/109338725610487457

Quite a few people pointed to this RFC independently.

simonw commented 2 years ago

Here's the most relevant example from the RFC spec:

   POST /details HTTP/1.1
   Host: account.example.com
   Accept: application/json
   {
     "age": 42.3,
     "profile": {
       "color": "yellow"
     }
   }
   HTTP/1.1 400 Bad Request
   Content-Type: application/problem+json
   Content-Language: en
{
    "type": "https://example.net/validation-error",
    "title": "Your request is not valid.",
    "errors": [
        {
            "detail": "must be a positive integer",
            "pointer": "#/age"
        },
        {
            "detail": "must be 'green', 'red' or 'blue'",
            "pointer": "#/profile/color"
        }
    ]
}
simonw commented 2 years ago

That's using JSON Pointer: https://www.rfc-editor.org/rfc/rfc6901

There's a Python library for that here https://github.com/stefankoegl/python-json-pointer/blob/master/jsonpointer.py - which looks simple and clean and well maintained and documented, but it only handles the "what is at this pointer within this JSON object" case - I need to generate the correct JSON pointer to explain where my error is.

So I think I'll end up hand-rolling this.

simonw commented 2 years ago

Spec looks pretty simple:

A JSON Pointer is a Unicode string (see [RFC4627], Section 3) containing a sequence of zero or more reference tokens, each prefixed by a / (%x2F) character.

Because the characters ~ (%x7E) and / (%x2F) have special meanings in JSON Pointer, ~ needs to be encoded as ~0 and / needs to be encoded as ~1 when these characters appear in a reference token.

simonw commented 2 years ago

TIL: https://til.simonwillison.net/json/json-pointer

simonw commented 2 years ago

Worth noting this bit in RFC 7807:

The fictional problem type here defines the "errors" extension, an array that describes the details of each validation error. Each member is an object containing "detail" to describe the issue, and "pointer" to locate the problem within the request's content using a JSON Pointer [JSON-POINTER].

So the list of "errors" with JSON Pointer isn't technically part of the spec, it's an imaginary extension.

It fits what I need to do though, so I'm inclined to stick with it anyway.

simonw commented 2 years ago

Rough initial prototype:

diff --git a/datasette/views/table.py b/datasette/views/table.py
index 8b987221..518ac578 100644
--- a/datasette/views/table.py
+++ b/datasette/views/table.py
@@ -1103,19 +1103,30 @@ class TableInsertView(BaseView):
         except json.JSONDecodeError as e:
             return _errors(["Invalid JSON: {}".format(e)])
         if not isinstance(data, dict):
-            return _errors(["JSON must be a dictionary"])
+            return _errors([{"detail": "JSON must be a dictionary", "pointer": "#/"}])
         keys = data.keys()

         # keys must contain "row" or "rows"
         if "row" not in keys and "rows" not in keys:
             return _errors(['JSON must have one or other of "row" or "rows"'])
         rows = []
+        was_single_row = False
         if "row" in keys:
             if "rows" in keys:
-                return _errors(['Cannot use "row" and "rows" at the same time'])
+                return _errors(
+                    [
+                        {
+                            "detail": 'Cannot use "row" and "rows" at the same time',
+                            "pointer": "#/row",
+                        }
+                    ]
+                )
+            was_single_row = True
             row = data["row"]
             if not isinstance(row, dict):
-                return _errors(['"row" must be a dictionary'])
+                return _errors(
+                    [{"detail": '"row" must be a dictionary', "pointer": "#/row"}]
+                )
             rows = [row]
             data["return"] = True
         else:
@@ -1152,9 +1163,12 @@ class TableInsertView(BaseView):
             invalid_columns = set(row.keys()) - columns
             if invalid_columns:
                 errors.append(
-                    "Row {} has invalid columns: {}".format(
-                        i, ", ".join(sorted(invalid_columns))
-                    )
+                    {
+                        "detail": "Invalid columns: {}".format(
+                            ", ".join(sorted(invalid_columns))
+                        ),
+                        "pointer": "#/blah/",
+                    }
                 )
         if errors:
             return _errors(errors)