alamers closed 4 years ago
I did a quick investigation of REST hyperlinking schemes. For self-links, the many variants of HATEOAS really come down to two:
a. JSON-LD (what DataLinker uses), which uses "@id" : "https:/....."
b. Other HATEOAS methods, which use "_link" : { "rel": "self", "href": "https://...." }
Both are pretty easy to implement. The issue is that choosing one or the other would seem to commit us to supporting that linking method going forward for other, more complex links.
The third option is to use neither, but "roll our own" - for instance "self-url" : "https://...". Simpler but way less standard.
I'm going to add comments from email conversations to this thread to capture them here for reference by other contributors.
On 25 June 2019, Arjan wrote:
Was indeed good to meet everyone in ‘real life’ :)
Wrt your 2nd and 3rd points; I’ll leave that to the experts.
Wrt the hyperlinking, just a few thoughts here:
What are your thoughts on compulsory vs recommended in this area? The way I see it, the JSON Schemas are the core / compulsory part of the standard (with possibility of extension). The URL scheme will be a recommendation. Do we make the hyperlinking a recommendation / optional part of the schemas?
So, no strong opinion on my side other than a slight preference for either JSON-LD or Hyperschema.
On 27 June 2019, Andrew wrote:
Thanks for the feedback Arjan,
Regarding the hyperlinking: compulsory vs optional. I see that specifying linking URLs is part of the data returned from the API, so it should be specified in the schema (ie, part of the standard), but obviously the URL fields themselves should be optional, as with many other data members. The reason for specifying this as part of the standard is so that if it is used clients will know how to interpret it. There are very few standard link relations that are relevant to our domain (unlike RSS feeds for example), so I think we need to specify the small set that we need.
I also prefer either JSON-LD or JSON Hyperschema over HAL. We had gone with JSON-LD in DataLinker because:
a. The involvement of Google, Microsoft, and others in JSON-LD gave us some confidence in the level of community support;
b. Object references (URLs) could be named components of the data schema, rather than having to be discovered in an array of link description objects (this could still be done in Hyperschema by placing links in appropriate objects); and
c. There was already a body of JSON-LD objects at schema.org which we could reference rather than re-define (for instance, Person and Organization).
However, we can make our own decision here on what to use.
It may be that we can define schemas in such a way that there is no great difference between JSON-Hyperschema and JSON-LD. For instance, let’s imagine that an animal had a “sire” property which contained the ID of the sire and a link to the sire’s animal resource.
JSON-LD:
"sire" : { "id": { "scheme": "org.icar.official", "id": "xxxxxx" }, "@id" : "https://....", "@context" : "<schema url>", "@type": "icarAnimalCore" }
JSON-Hyperschema:
"sire" : { "id": { "scheme": "org.icar.official", "id": "xxxxxx" }, "links": [{ "href" : "https://....", "rel" : "self", "targetSchema": "icarAnimalCore" }] }
(Assuming rel:self is ok because the link is inside the reference to the sire)
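To illustrate how small the difference is for a client, here is a minimal Python sketch that reads the self URL from either shape of the sire reference. The helper name self_url, the example.org URL and the identifier value are made up for illustration; nothing here is part of the schema.

```python
from typing import Optional


def self_url(ref: dict) -> Optional[str]:
    """Return the self URL of a reference object, or None if it carries none."""
    # JSON-LD style: the URL is a named member of the object.
    if "@id" in ref:
        return ref["@id"]
    # Link-description style: search the "links" array for rel == "self".
    for link in ref.get("links", []):
        if link.get("rel") == "self":
            return link.get("href")
    return None


# Hypothetical sire references; the URL and identifier are placeholders.
sire_json_ld = {
    "id": {"scheme": "org.icar.official", "id": "xxxxxx"},
    "@id": "https://example.org/animals/1001",
    "@type": "icarAnimalCore",
}
sire_hyperschema = {
    "id": {"scheme": "org.icar.official", "id": "xxxxxx"},
    "links": [
        {
            "href": "https://example.org/animals/1001",
            "rel": "self",
            "targetSchema": "icarAnimalCore",
        }
    ],
}

print(self_url(sire_json_ld))
print(self_url(sire_hyperschema))
```

In both cases the client ends up with the same URL; the JSON-LD form is a direct lookup, while the links form needs a scan of the array.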
On 27 June 2019, Craig Vigors wrote:
I wasn’t at ICAR, so I may be a little out of the loop. My thoughts on the points below.
1: I have no real preference on which standard is used, as long as a standard is used. I wouldn’t have worked with Hypermedia or LD in the past so I wouldn’t be making an informed decision. We should have hyperlinking included though, using some standard. I have always had a slight curiosity about how far you go with the hyperlinking when it comes to API performance. For example, if you have an API that provides a list of animals currently in the herd, would you provide a link to each animal’s insemination / calvings / movements etc.? If the animal doesn’t have an insemination, or access to that data is restricted for that client, does the link get provided anyway? In order to prevent the link being included, you would need to check if there is data available. That comes at a cost of performance.
2: “Arrival/Departure: Origin, Destination, transporter/Haulier, transport reference number, vehicle registration, date/time loaded, date/time unloaded, farm assurance reference” This seems a little over the top to me, and we wouldn’t be collecting it. What is it used for? GDPR would potentially impact a lot of those fields for us.
“Death: Reason, disposal method, disposal reference/receipt number” Sometimes reason for death has come up as an interest to certain parties, but not everyone. I wonder how often it is known and to what accuracy? “Registration: All the animal fields that are necessary for first registration”. I don’t understand here, are you saying that this data isn’t included? Is there another service that provides it?
3: The array of parents would scale well, and there would be a preference for 3 generations for pedigree animals.
On 27 June 2019, Andrew wrote:
Really good thoughts and questions, thank you.
Hypermedia
Additional data in events
Death Reasons – good question. We’ve talked about Reason just being a text field (seems to be widely used), but it would be interesting to get feedback from others about what is recorded.
Registration – All events refer to the animal by ID. When recording a registration for an animal, you need to provide many of the other animal fields (although the list varies by organisation of course). My intent is that a Registration event would embed an icarAnimalCore object so those fields can be provided. It is the one case where embedding the animal object into the event makes sense.
Thanks for the feedback on the array of parentage, that’s helpful.
At the meeting on 28 June 2019 it was felt that either JSON Hyper Schema or JSON-LD would be reasonable, and I undertook to investigate JSON Hyper Schema further.
An issue with JSON Hyper Schema is that it is designed principally to be declarative at the time of schema/API definition, and doesn't completely lend itself to cases where the implementation and format of URLs is not known or there may be a number of implementations. This is because links are defined as URI Templates in terms of RFC 6570. The client is responsible for resolving the URL template to a URL as follows:
This is great, but it does lend itself to URIs being specified in only one way, using one protocol, which is not what we are trying to achieve. It is hard to override this for specific implementations, as the URI Template is specified in the schema. However, I believe JSON Hyper Schema could be used if care is taken in how we specify the URI Templates.
For instance:
"links": [
{ "rel": "self", "href": "{@id}" }
]
Would specify that there was a property in the object called @id, which contains a URI to the object itself (useful if you want to GET or PATCH a single object). I've used @id here as an example that some of you might recognise from JSON-LD.
In contrast, if we decided that URI paths were to be the same for all possible implementations, we could use:
"links": [
{ "rel": "self", "href": "animals/{id}" }
]
Which would say that there was a property called "id", and all animals could be found at the relative path starting with "animals/".
As many of you know, I prefer the former approach (not specifying the exact URL paths in the schema which applies to every implementation).
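As a rough illustration of what a client would do with the two templates above, here is a minimal Python sketch. It only handles plain {name} placeholders and inserts the property value verbatim, so it is not a full RFC 6570 implementation (strict simple expansion would percent-encode reserved characters such as ":" and "/"). The function name resolve_link, the base URL and the sample objects are all made up for illustration.

```python
import re
from urllib.parse import urljoin


def resolve_link(href_template: str, instance: dict, base_url: str) -> str:
    """Fill {name} placeholders from the instance, then resolve against base_url."""

    def substitute(match):
        # Assumption: the value is inserted as-is, without percent-encoding.
        return str(instance[match.group(1)])

    href = re.sub(r"\{([^{}]+)\}", substitute, href_template)
    # urljoin leaves absolute URLs untouched and resolves relative paths.
    return urljoin(base_url, href)


BASE_URL = "https://example.org/api/"  # hypothetical API root

# Former approach: the object carries its own URL in "@id".
animal_with_url = {"@id": "https://farm.example.com/cattle/42"}
print(resolve_link("{@id}", animal_with_url, BASE_URL))

# Latter approach: the path is fixed by the schema, only "id" varies.
animal_with_id = {"id": "42"}
print(resolve_link("animals/{id}", animal_with_id, BASE_URL))
```

With the first template the server decides the whole URL, so different implementations can use different paths or hosts. With the second, every implementation has to serve animals under the same relative path.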
In terms of link relations (the "rel" part of a link), the following from the IANA registry are likely to be useful to us:
"self": Specify a link to an object
"collection": URL to the array of these objects (so I can get many of them, or POST a new one)
"edit": Assuming you use PUT/PATCH to edit, this allows specifying a schema for edits
Then inside collections:
"first": Return the first page of a paginated collection
"next": The next page of a paginated collection
"prev": The previous page of a paginated collection
"last": The last page of a paginated collection
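To show how a client might use the pagination relations without building URLs itself, here is a minimal Python sketch that walks a collection by following "next" links. The page layout (an "items" array next to a "links" array), the URLs, and the in-memory PAGES dictionary standing in for HTTP requests are all assumptions for illustration, not the agreed collection format.

```python
from typing import Callable, Iterator, Optional


def find_link(page: dict, rel: str) -> Optional[str]:
    """Return the href of the first link with the given relation, if any."""
    for link in page.get("links", []):
        if link.get("rel") == rel:
            return link.get("href")
    return None


def iter_items(first_url: str, fetch: Callable[[str], dict]) -> Iterator[dict]:
    """Yield every item in a paginated collection by following rel="next"."""
    url: Optional[str] = first_url
    seen = set()  # guards against a server that links pages in a loop
    while url is not None and url not in seen:
        seen.add(url)
        page = fetch(url)
        yield from page.get("items", [])
        url = find_link(page, "next")


# In-memory stand-in for a server; a real client would do an HTTP GET here.
PAGES = {
    "https://example.org/animals?page=1": {
        "items": [{"id": "1"}, {"id": "2"}],
        "links": [
            {"rel": "self", "href": "https://example.org/animals?page=1"},
            {"rel": "next", "href": "https://example.org/animals?page=2"},
        ],
    },
    "https://example.org/animals?page=2": {
        "items": [{"id": "3"}],
        "links": [
            {"rel": "self", "href": "https://example.org/animals?page=2"},
            {"rel": "prev", "href": "https://example.org/animals?page=1"},
        ],
    },
}

ids = [item["id"] for item in iter_items("https://example.org/animals?page=1", PAGES.__getitem__)]
print(ids)
```

The client only needs the first URL and the meaning of "next"; how the server encodes page numbers stays an implementation detail.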
We may need to define our own link relations to describe links to:
I have defined:
The collection changes referenced above have been addressed by #73.
We implemented Link Description Objects ("links" array) in icarResourceReference and icarResourceCollection, but because we have multiple files included with $ref, and hence can't include the "schema" keyword, these are not recognised and fail validation.
We have removed the "links" array as the links themselves are still clearly documented in the schema without this. See commit #74.
There is a need for automated discovery of URLs. Parties that connect to multiple servers should not rely on synthesizing URLs on the client side. Instead, the server should provide links to related resources in its messages.
The workgroup should choose a standard for this (e.g. JSON-LD, HATEOAS or something else).
Until we have committed to one, we need temporary workarounds, such as example URL schemes, to document the standard.