Closed xdrcft8000 closed 7 months ago
Suggested first step:
Write (or find) a hurl test, that creates a release with one revision in it, and another release with two revisions in it as a reproducible example of the problem.
I've created a Hurl test for reproducing this issue here: https://github.com/Swirrl/datahost-prototypes/blob/fd800f2da939d22ac59d664e83754c124ba573aa/datahost-ld-openapi/hurl-scripts/issue-348.hurl
As expected the test currently fails, because when dh:hasRevision
only contains a single item it comes through as a string rather than an array of length 1, so it can't be counted:
error: Filter Error
--> issue-348.hurl:96:46
|
96 | jsonpath "$.contents.[1].['dh:hasRevision']" count == 1
| ^^^^^ invalid filter input: string
|
The handler function for the .../releases
endpoint is here:
https://github.com/Swirrl/datahost-prototypes/blob/dluhc-integration/datahost-ld-openapi/src/tpximpact/datahost/ldapi/handlers.clj#L223-L234
The triples->ld-resource-collection function uses matcha/build
to turn the triples returned from db/get-releases into Clojure maps.
The difference between releases containing single and multiple revisions appears at this point. When there are multiple revisions matcha/build
rolls them up into a set:
whereas when there is only a single revision it comes through by itself, as a value.
The sequence of matcha maps are then passed through to ->json-document (via json-ld/compact), which converts the maps into a string, using jsonista/write-value-as-string.
write-value-as-string
converts the set
of revisions into an array, and leaves the single value as a string.
The stringified JSON document is then passed (along with the context) to the JsonDocument/of function of the Titanium JSON-LD library to convert it into a JSON-LD document (the JsonLd.compact function from that library is also used later on).
Force matcha/build
to always put the dh:hasRevision
stuff into a set?
I've not used matcha before, but from a quick look this doesn't seem to be possible.
Do something with jsonista
It might be possible to use a custom mapper or encoder to make sure that the stringified JSON has the correct shape. Or it might be that we could replace write-value-as-string
with our own map->json-string
function.
Use JSON-LD Framing Titanium JSON-LD appears to support framing. So it may be possible to re-shape the JSON-LD documents once they have been created.
Do it ourselves
Presumably it's possible to go from the Clojure document and context maps straight to a JSON-LD document, instead of doing map -> string -> JSON-LD
. But I don't know enough about the JSON-LD spec to know how difficult that would be.
Option 1 doesn't seem possible. Option 2 seems like too much of a hack. Option 3 seems like more of a 'proper' way of doing it, providing we can come up with the correct frame. Option 4 might be preferable long-term, but feels like it'd be a lot more work than option 3.
Thanks @kjoshi,
I think option 3 is what I was thinking, particularly as we are using the JSON/LD libraries to build those responses.
My suspicion is based on looking at the code and some quick research in the JSON/LD playground is that we might be able to tweak this bit of the context:
To be:
{:contents {"@id" "dh:collection-contents",
"@container" "@set"}
"dh:hasChange" {"@id" "dh:hasChange",
"@container" "@set"}}
But also I think that context file should be represented as JSON/LD as a file on disk, on the resource path, and we should just merge anything dynamic (regarding base URI) into that. There's a place holder for that file here which we should use. i.e. I'd port the statements above into that context, and merge the dynamic stuff into that data with code
My guess is, if we do that, the existing compaction/framing machinery will follow those directives and ensure that if there is ever a single value for this property it is always put into an array.
Our JSON syntax should be specific about the expected one or manyness of certain properties.
For
dh:hasChange
if a release has one revision you will get a single value, and if it has many you will get an array of them. If there are any then this should ideally always be an array (to reduce variance for consumers):i.e. this:
should be this: