Open EricZinda opened 3 years ago
Coming back to this and summarizing: The proposed JSON format is:
Special keys:
": string
_: variable
{: dict
[: bad list
': unrepresentable
Some things I'd like to fix if possible:
Designing something like this is IMO the way to go. JSON seems great at exchanging data that satisfies a nice and well-defined object schema. Trying to represent another dynamic data representation isn't great, as one typically ends up with a {"type": "...", ...} which is indeed a little verbose. As is though, it gets hard to identify, for example, a compound. That is an object with a single key that is not a reserved key (I think). Also, some stuff gets ambiguous, which is why we get an unrepresentable category. What about this:
and atom keys <-> object
It only uses JSON objects where needed. There are two corner cases that we can handle specially: cyclic terms, which can be e.g. {"t":"ct", "v":Skeleton, "u": Unifications}, and terms with attributed variables. For the latter we can use copy_term/3 and represent this as {"t":"av", "v":Term, "c":ListOfConstraints}
P.s. I consider json(...) old school.
This can (if I didn't make a mistake) represent any Prolog term and is almost non-ambiguous. It allows for representing normal application data quite nicely (numbers, strings and json{...} dicts). Only the stuff that doesn't fit JSON naturally gets an object. Some problems remain:
The true/false/null issue can be avoided by mapping Prolog strings to JSON strings, the reserved atoms to themselves, and other atoms to {"t":"a", "v":String}. This comes with its own problems because SWI-Prolog introduced strings when atoms were already commonly used for what strings were intended to do. ECLiPSe introduced them early, and that community has less of a problem.
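A minimal sketch of how a Python client might decode scalars under that mapping. The `Atom` class and the `decode_scalar` name are hypothetical, introduced only for illustration; the only thing taken from the comment above is the rule itself (strings stay strings, true/false/null stand for the reserved atoms, other atoms arrive as {"t":"a", "v":String}).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Atom:
    """Hypothetical stand-in for a Prolog atom on the Python side."""
    name: str


def decode_scalar(value):
    """Decode one JSON scalar under the mapping described above (assumed)."""
    # JSON true/false/null stand for the reserved atoms true, false and null.
    if value is True:
        return Atom("true")
    if value is False:
        return Atom("false")
    if value is None:
        return Atom("null")
    # Any other atom arrives wrapped as {"t": "a", "v": <name>}.
    if isinstance(value, dict) and value.get("t") == "a":
        return Atom(value["v"])
    # A bare JSON string is a Prolog string; numbers pass through unchanged.
    return value
```

With this rule a bare "true" is unambiguously a Prolog string, while JSON true is the atom.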
I haven't exhaustively reviewed your list of cases, but I agree with the ones you wrote down and that we should go with that general approach.
A couple of questions here @JanWielemaker:
[Side Note: Pengines needs support for outputting arbitrary JSON since it is designed to support integration into existing/arbitrary systems and needs that flexibility. I'd argue that MQI can make do with one good canonical representation, since it is designed to be an implementation detail of an existing application (in that sense it is like an Application Binary Interface (ABI)). I.e., we don't need to worry too much about the user being able to control the format (although Pengines does!), just about having a single good one. That said, I could imagine, as we've discussed, having support in MQI for different wire formats over time, but for different reasons than Pengines: things like performance. Maybe it amounts to the same thing, but coming at it from different perspectives.]
It seems like we have some choices for the JSON object format, and I think any could work but some are more readable. I'll illustrate for the compound case of foo(arg1, arg2)
and a string "a string":
1. Updated Original Proposal: Agree we need a way to quickly identify a compound. I think we could say that all the "reserved keys" must start with "$" to make that easy. To represent a compound whose name just happens to be a reserved key, the caller could surround it with quotes (i.e. use "'$s'"). We could use your updated list of keys as the "reserved keys":
compound: {"foo": ["arg1", "arg2"]}
compound with name == reserved key "$s": {"'$s'": ["arg1", "arg2"]}
string: {"$s": "a string"}
Python check first argument: if myjson["foo"][0] == "arg1":
Python check for compound: if next(iter(myjson))[0] != "$":
2. Your proposal above:
compound: {"t": "t", "f": "foo", "a":["arg1", "arg2"]}
string: {"t":"s", "v":"a string"}
Python check first argument: if myjson["f"] == "foo" and myjson["a"][0] == "arg1":
Python check for compound: if myjson["t"] == "t":
3. (just an idea, maybe a little more readable?) Modified version of your proposal that uses position in a list instead of introducing keys:
compound: {"t": ["foo", ["arg1", "arg2"]]}
string: {"s":"a string"}
Python check first argument: if myjson["t"][0] == "foo" and myjson["t"][1][0] == "arg1":
4. (another idea) Modified version of your proposal that just uses a list (no keys at all) and position in the list:
compound: ["t", "foo", ["arg1", "arg2"]]
string: ["s", "a string"]
Python check first argument: if myjson[0] == "t" and myjson[1] == "foo" and myjson[2][0] == "arg1":
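To compare the four shapes side by side, here is a rough sketch of one extractor per option. The function names are hypothetical, and each returns (functor, args) for a compound or None otherwise; the option 1 version does not strip the quotes from a name such as "'$s'", and the option 4 version cannot tell a compound from an ordinary three-element list that starts with "t", since the thread does not say how plain lists would be encoded there.

```python
def compound_opt1(j):
    """Option 1: {"foo": [...]}; reserved keys start with "$"."""
    if isinstance(j, dict) and len(j) == 1:
        (name, args), = j.items()
        if not name.startswith("$"):
            return name, args
    return None


def compound_opt2(j):
    """Option 2: {"t": "t", "f": "foo", "a": [...]}."""
    if isinstance(j, dict) and j.get("t") == "t":
        return j["f"], j["a"]
    return None


def compound_opt3(j):
    """Option 3: {"t": ["foo", [...]]}."""
    if isinstance(j, dict) and isinstance(j.get("t"), list):
        return j["t"][0], j["t"][1]
    return None


def compound_opt4(j):
    """Option 4: ["t", "foo", [...]] (ambiguous with plain lists)."""
    if isinstance(j, list) and len(j) == 3 and j[0] == "t":
        return j[1], j[2]
    return None
```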
My vote is for #1:
Some notes after a discussion about the above topics:
We agreed that the best approach would be a merger of option 1 and 2 above, preserving the best of both:
The full, round-trippable JSON format would (perhaps optionally) include, in each JSON dict, a type key named "$t" that indicates the type of the Prolog term. The names of and number of the rest of the keys would depend on the type. Including the "$t" argument allows for round tripping. For example:
[Note that whether atoms or strings get to be the default JSON string can be switched as an option]
atom <-> string
"a string": {"$t": "s", "v": "a string"}
integer <-> integer
float <-> float
list <-> list
foo(arg1, arg2): {"$t": "t", "foo": ["arg1", "arg2"]}
This approach allows the non-Prolog client to use a really nice interface to access terms if their structure is known ahead of time (which is often the case) like this, for example, in Python:
term = {"$t": "t", "foo": ["arg1", "arg2"]}
print(term["foo"][0])  # prints: arg1
As mentioned above, this does mean that term_to_json/json_to_term define a particular canonical JSON "schema" that must be conformed to. As discussed in previous posts above, this is OK since these predicates are intended to be used as an interface or an ABI, not as a general-purpose generator of arbitrary JSON documents. There are other predicates that allow building arbitrary JSON in SWI-Prolog.
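When the structure is not known ahead of time, a client has to recover the functor from the merged format. A minimal sketch, assuming the compound shape {"$t": "t", "<functor>": [args]} described above; `split_compound` is a hypothetical helper name, and a functor literally named "$t" is not handled.

```python
def split_compound(term):
    """Return (functor, args) for {"$t": "t", "<functor>": [args...]}."""
    if not isinstance(term, dict) or term.get("$t") != "t":
        raise ValueError("not a compound in the assumed '$t' format")
    # Everything except the type tag should be the single functor key.
    others = [key for key in term if key != "$t"]
    if len(others) != 1:
        raise ValueError("expected exactly one functor key")
    functor = others[0]
    return functor, term[functor]


functor, args = split_compound({"$t": "t", "foo": ["arg1", "arg2"]})
print(functor, args)
```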
Phase 1:
Phase 2:
This issue has been mentioned on SWI-Prolog. There might be relevant details there:
https://swi-prolog.discourse.group/t/wiki-discussion-swi-prolog-in-the-browser-using-wasm/5651/75
The context in which this is written. Came here because of (ref)
I have been discussing a new JSON format with @ericzinda at "Consider adding an option to use a different JSON Format · Issue #4 · SWI-Prolog/packages-mqi · GitHub", so my mindset is that this will/could become a/the new JSON package for SWI-Prolog, and since I often use JSON I have a vested interest.
As a suggestion it might help to think of this along the lines of syntax and semantics.
I see JSON as a syntax specification like INI files. (ref)
Then for a specific need you add semantics. A case in point is the INI files used with ODBC. (ref)
If you want to validate JSON semantics there is JsonSchema but I don't see that being widely used, but perhaps it should be.
When I read the above I don't get the feeling that a clear separation is present. It seems the word JSON is used when perhaps JSON instance should be used and this is discussing the schema of that instance. As this is also talking about a specific instance it needs a name to make it easier to identify.
In looking at this as a schema, it should include a version number for when things change and break.
Was expecting to see references (think JSON-LD)
For single-letter types, a name should also be allowed (think command line arguments: - with a single letter and -- with a name).
These are just my thoughts, feel free to ignore and/or disregard.
EDIT
Gavin noted a better reference JSON Schema.
This issue has been mentioned on SWI-Prolog. There might be relevant details there:
https://swi-prolog.discourse.group/t/swi-prolog-in-the-browser-using-wasm/5650/1
I worked on typing the SWI-Prolog wasm interface API in TypeScript and I found compound to be problematic:
foo(12,34) as {"foo":[12,34]}
This makes the corresponding object shape impossible to define statically. The interface currently adds the tag $t: 't' anyway, along with an arguments() method that makes the object a lot more usable: https://github.com/SWI-Prolog/swipl-devel/blob/7a546d6e9e3df6d15343894a71405d5ff1bd712d/src/wasm/prolog.js#L85
I would just use foo(12,34) as {"$t": "t", "a":[12,34]} with a predictable property "a" standing for the arguments. This shape is easily definable in static type systems. It also makes the corresponding JSON schema much easier to write.
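The comment is about TypeScript, but the same point can be sketched with Python type hints, used here only to stay consistent with the Python snippets earlier in the thread. The names `FixedCompound` and `arguments` are hypothetical. With fixed keys the shape can be declared once; with {"foo": [12, 34]} the key is the functor, so the best a checker can do is a generic mapping from string to list.

```python
from typing import Any, List, Literal, TypedDict

# Functional syntax is needed because "$t" is not a valid Python identifier.
FixedCompound = TypedDict("FixedCompound", {"$t": Literal["t"], "a": List[Any]})


def arguments(term: FixedCompound) -> List[Any]:
    """Return the argument list; the key "a" is known statically."""
    return term["a"]


fixed: FixedCompound = {"$t": "t", "a": [12, 34]}
print(arguments(fixed))
```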
I tend to agree. That is why I wrapped the thing into a JavaScript class for the WASM version. The story may look different from Python?
I worked on typing the SWI-Prolog wasm interface API in TypeScript
The thumbs up is for making progress on using types with an interface. Glad to see someone is doing work on this.
From the swi-prolog boards
Note that the system is using the builtin term_to_json/2 since that is what is used by the http package already.
[1] If it fits in a pointer-size signed integer (or in a double). Otherwise translate to an “unrepresentable” functor, because many JSON libraries (including, of course, Javascript itself) have limitations about the size of numbers they’ll read.
[2] Between atoms and SWI strings, atoms get a lot more use, so they get to use the bare JSON string representation.
[3] Any compound whose functor can be represented as an unquoted atom will be represented this way. Otherwise wrap with T0 =.. L, T1 =.. ['\'',compound|L].
[4] This can be used for any unrepresentable type, or any value outside the representable domain of one of the other above types.
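Note [1] can be read as a range check on the encoding side. The sketch below assumes a 64-bit pointer size, and the {"'": "<digits>"} wrapper is purely hypothetical: the notes name an "unrepresentable" functor but do not spell out its payload.

```python
INT64_MIN, INT64_MAX = -(2 ** 63), 2 ** 63 - 1


def fits_double(n):
    """True if the integer survives a round trip through a float unchanged."""
    try:
        return int(float(n)) == n
    except OverflowError:
        # Too large for any finite double.
        return False


def encode_integer(n):
    """Emit a bare JSON number when safe, else a hypothetical wrapper."""
    if INT64_MIN <= n <= INT64_MAX or fits_double(n):
        return n
    # Assumed shape for the "unrepresentable" case; not taken from the source.
    return {"'": str(n)}
```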
This keeps the representation compact (not actually a small consideration, if you’re transferring large data sets), makes it more readable (functor names before arguments), and gives every value (modulo dicts, whose keys can be reordered, and floats which are weird) a single canonical, invariant representation - which means I don’t even need a JSON library, if my results are predictable. Thus:
true([[threads(language_server1_conn2_comm, language_server1_conn2_goal)]])
% Original:
{"args": [[[{"args": ["language_server1_conn2_comm", "language_server1_conn2_goal"], "functor":"threads"}]]], "functor":"true"}
% New:
{"true": [[[{"threads": ["language_server1_conn2_comm", "language_server1_conn2_goal"]}]]]}
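The relationship between the two encodings above can be sketched as a small recursive rewrite in Python. `compact` is a hypothetical helper, not part of any existing package; it treats any dict with exactly the keys "functor" and "args" as a compound in the original format.

```python
def compact(term):
    """Rewrite {"functor": F, "args": [...]} nodes as {F: [...]}, recursively."""
    if isinstance(term, list):
        return [compact(item) for item in term]
    if isinstance(term, dict) and set(term) == {"functor", "args"}:
        return {term["functor"]: [compact(arg) for arg in term["args"]]}
    # Strings, numbers and anything else pass through unchanged.
    return term


original = {
    "args": [[[{"args": ["language_server1_conn2_comm",
                         "language_server1_conn2_goal"],
                "functor": "threads"}]]],
    "functor": "true",
}
print(compact(original))
```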