Closed jgaehring closed 3 years ago
Just want to save this here for future reference: https://github.com/farmOS/farmOS/blob/2.x/modules/core/entity/farm_entity.base_fields.inc
Did a little looking around for JSON Schema data generators, but there are surprisingly few options.
The official list of implementations only lists a Python library: https://json-schema.org/implementations.html#data-from-schemas
AJV seems like a tried and tested validator, but only seems to do that, not generation: https://ajv.js.org/
There's a library called dreamjs that does generation but it's not been updated in 4 years and has a big bundle size: https://github.com/adleroliveira/dreamjs
That all taken into consideration with the particulars of our use case means we'll probably be better off with a custom solution.
I'm noticing some peculiarities about the JSON Schema documents, particularly for logs. (For reference, I'm looking at http://localhost/api/log/activity/resource/schema on my local server).
First and foremost, it doesn't include the log's id
. Here's the schema, not including the definitions
for the $refs
:
{
"$schema": "https://json-schema.org/draft/2019-09/hyper-schema",
"$id": "http://localhost/api/log/activity/resource/schema",
"title": "Activity log",
"allOf": [
{
"type": "object",
"properties": {
"type": {
"$ref": "#definitions/type"
},
"attributes": {
"$ref": "#/definitions/attributes"
},
"relationships": {
"$ref": "#/definitions/relationships"
}
}
},
{
"$ref": "https://jsonapi.org/schema#/definitions/resource"
}
]
}
You'll notice there are only 3 top level properties: type
, attribute
and relationships
. Shouldn't id
be there as well? I also don't see links
, but I'm perfectly fine with that being excluded, because I can't think why I'd need that anyway.
Also, the required
properties for attributes
don't fully make sense to me:
{
"required": [
"drupal_internal__id",
"drupal_internal__revision_id",
"status",
"revision_translation_affected"
]
}
What are they required for, exactly? They're not required in POST
requests, I know that. I interpret required
to mean required by JSON:API, otherwise, what relevance is it to the JSON Schema? For relationships
, meanwhile, there are no required
properties, even for a seeding
log, which if I recall, requires a planting asset, no?
Anyways, I'll report more issues as I encounter them. @mstenta, I'd love your input on this, but it's by no means urgent.
First and foremost, it doesn't include the log's
id
Huh - yea I would expect ID to be there too, for the sake of completeness. Can't think why it wouldn't be included.
Might be worth opening an issue upstream to inquire about it: https://www.drupal.org/project/issues/jsonapi_schema
Practically speaking WE know the id
will be a string (UUID) - so we can hard-code that knowledge if we have to.
Also, the required properties for
attributes
don't fully make sense to me: What are they required for, exactly? meanwhile, there are no required properties, even for a seeding log, which if I recall, requires a planting asset, no?
So actually: seeding logs do NOT require a planting asset anymore! This is something we had been considering changing in 1.x. So that's correct.
The status
attribute IS required for assets and logs, and should be required to POST
. This replaces 1.x's done
boolean for Logs (now values should be pending
or done
strings), and it replaces archived
for Assets (now values should be active
or archived
, and we have an additional archived
field for storing the timestamp that it was archived, much like created
and changed
timestamps).
As for drupal_internal__id
, drupal_internal__revision_id
, revision_translation_affected
- they are required to be set in the farmOS database - but not required for POST
operations, because they will be automatically set by Drupal (I assume - have you confirmed this?).
This does raise an interesting question about "required" status as it relates to the API and POST
requests. I wonder if there is some way to distinguish POST
required fields.
Maybe this too would be worth opening an issue for upstream - if only to ask whether or not there's a reason for the Drupal internal ones to be included. My suspicion is that it is simply taking that status directly from the lower-level Drupal field API, where those fields ARE technically required in the database.
I filed an issue upstream to inquire about these two details specifically: https://www.drupal.org/project/jsonapi_schema/issues/3190497
How urgent is this? Will it block us from moving forward? (edit: nevermind just saw your "it's by no means urgent.") :-)
Thanks for opening the issue. It does seem odd, right?
Practically speaking WE know the id will be a string (UUID) - so we can hard-code that knowledge if we have to.
Totally. Hardcoding is not ideal, but at least this is a very solid assumption to make across the board. Shouldn't be too tricky.
So actually: seeding logs do NOT require a planting asset anymore!
Oh! Great to hear! I wonder, would it be worth reviewing some of the more common log types that do still have required fields and just checking to see if they're being picked up by JSON Schema?
The
status
attribute IS required for assets and logs, and should be required toPOST
.
Interesting. I'll have to try a test w/o the status to see what happens. Will let you know.
As for
drupal_internal__id
,drupal_internal__revision_id
,revision_translation_affected
- they are required to be set in the farmOS database - but not required forPOST
operations, because they will be automatically set by Drupal (I assume - have you confirmed this?).
I have not set them in my POST
requests and have not run into any failed requests. So yea, seems weird that the JSON Schema would be communicating requirements which do not pertain to the REST API.
Interesting. I'll have to try a test w/o the status to see what happens. Will let you know.
Oh you know what: I bet this is not "hard" required because it has a "default value" of "pending" for new logs (so it gets auto-filled). I wonder if JSON Schema shows the default values anywhere?
Huh - just looked at /api/asset/animal/resource/schema
and I see that in addition to the fields you mentioned above, name
and geometry
are also listed as required.
name
makes sense - that is required for assets, and unlike Logs it will not be automatically generated (which is why it's not technically required for Logs).
I'm not sure why geometry
is showing as required though. That is one of the two computed fields (geometry
and location
). I'll see if I can figure out why that's happening...
I'll have to try a test w/o the status to see what happens. Will let you know.
I actually checked and realized my existing test already sends a log w/o the status field:
So I guess that requirement is not being enforced in POST
requests. :thinking:
I wonder if JSON Schema shows the default values anywhere?
It does, but not for the status
field. Here's an interesting snippet from the log--activity
schema:
{
"name": {
"type": "string",
"title": "Name",
"maxLength": 255,
"description": "The name of the log. Leave this blank to automatically generate a name.",
"default": ""
},
"timestamp": {
"type": "number",
"title": "Timestamp",
"format": "utc-millisec",
"description": "Timestamp of the event being logged."
},
"status": {
"type": "string",
"title": "Status",
"maxLength": 255,
"description": "Indicates the status of the log."
}
}
Definitely some weird stuff going on. I'd be really curious to hear from the module authors exactly how they interpret the required
keyword. Checking the spec, it just says it's used to determine if a JSON document is "valid" or not. There doesn't seem to be any standard interpretation for what kind of behavior a REST API should take with valid or invalid documents. So again, yea, I'd love to hear the authors take on how the JSON Schema is meant to represent the API's internal validation procedures, generally.
I opened a similar issue in jsonapi_schema
a while back pertaining to the schema enum
for entity types: https://www.drupal.org/project/jsonapi_schema/issues/3174991
That issue points out that an enum
is provided for the log.equipment
asset reference field, to specify that only the asset--equipment
bundle is allowed. But when all bundles are allowed on an entity reference field (such as log.asset
), the enum
is not populated.
For the status
field, there could be an enum
of pending
and done
. If additional states were added by contrib modules, then these would become visible in the schema & known to 3rd parties (very useful!)
That's just to say, this all seems connected.. for our status
field (which is a state
machine field type), it seems that JSONAPI schema doesn't use the allowed_values
or default_value
properties defined by that field type. Seems like this should be possible?
Hmm also one consideration re: the drupal_internal__id
field. For config entities this field's value is a string, but for content entities it is an integer. In the context of 3rd party API integrations it seems pretty important to include this field to allow the ability to query config entities by their non-UUID ID. Not sure if this is an issue @jgaehring (maybe in a different issue, but I think you mentioned throwing out the drupal specific fields) ?
If a FM wanted to create a log with the monitor
flag, it has to query those resources by drupal_internal__id
, not label, in order to find the correct UUID.
Example config entities in farmOS 2.x are log types, asset types, and flags.
An example flag:
{
"type":"flag--flag",
"id":"8cb92703-d054-4c62-bfcc-ee3bad060bdc",
"links":{
"self":{
"href":"http://localhost/api/flag/flag/8cb92703-d054-4c62-bfcc-ee3bad060bdc"
}
},
"attributes":{
"langcode":"en",
"status":true,
"dependencies":{
"enforced":{
"module":[
"farm_flag"
]
}
},
"drupal_internal__id":"monitor",
"label":"Monitor"
}
},
For the
status
field, there could be anenum
ofpending
anddone
. If additional states were added by contrib modules, then these would become visible in the schema & known to 3rd parties (very useful!)
Oh yea! I hadn't noticed that, but you're totally right!
That's just to say, this all seems connected.. for our
status
field (which is astate
machine field type), it seems that JSONAPI schema doesn't use theallowed_values
ordefault_value
properties defined by that field type. Seems like this should be possible?
Agreed, definitely seems connected. I'll leave it to you and Mike to determine how possible any of this is. Do you think the module authors are accepting contributions?
If a FM wanted to create a log with the
monitor
flag, it has to query those resources bydrupal_internal__id
, not label, in order to find the correct UUID.
Ohhhhhh, good to know!
(maybe in a different issue, but I think you mentioned throwing out the drupal specific fields) ?
I had. Although on further consideration, I'm more leaning towards moving those fields into meta-data, so the information isn't lost, but hidden from routine API users. This definitely strengthens the argument for keeping those fields.
Totally bikeshedding here, but I think I'm going to rename this farmModel
, or farm.model
. Only slightly less vague, compared to farmData
, but more to the point, I think.
In an effort to break out of some analysis paralysis I've been having, I want to list out the various decisions I've been struggling with, and try to answer them one-by-one in subsequent comments:
attributes
and relationships
objects?meta
field?attributes
and relationships
objects?meta
field?Serialized format
Link references to other entities via Proxies (as mentioned in #401)?
no
Use attributes and relationships objects?
yes
Colocate metadata with primary data?
no
Move metadata to meta field?
yes
In-memory format
Use custom getters and setters?
yes
Use JS Symbols?
no
Multiple helpers for accessing metadata?
TBD
Use attributes and relationships objects?
no
Co-locate metadata with primary data?
no
Move metadata to meta field?
yes
Generating default objects from JSON Schema
Use Cambria?
no
Use json-schema-tools?
no
Use json-schema-defaults?
no
Roll my own?
roll my own defaults stub
Reminder to myself to address #418 before closing this.
Closing this now, because I think all the main features are implemented for all entities and just need to be integrated with farm.connect
and Field Kit.
The second step in providing support for farmOS 2.x will be in extending the existing
farmLog
library for farmOS 1.x, currently a part of field kit itself, so that it's capable of creating, processing and formatting all entities, not just logs, in farmOS 2.x. The other big difference here is that we will finally be able to use JSON Schema, in place of the ad hocdata_schema
we were using before, which should provide a lot more powerful and safer forms of validation, but may be more complex to implement. I should probably look into using some kind of JSON Schema validation library, if I can, rather than writing my own logic for generating defaults and validating data.