I just wanted to document some of the challenges I have been facing with the codegen based on the JSON schema for Google Datastore, and with making the generated items usable. At the moment the codegen deserializes at a protocol level to an intermediate state, bounded by the datatypes that Google Datastore supports, but does not actually provide "usable"/"logical" objects in JavaScript.
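For example, a fundamental type is Value, which the codegen turns into something like this:
Which is a real pain to deal with in a logical, type-safe way. I hand-crafted a more usable type in this fashion: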
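As a rough sketch of the hand-crafted idea (not the actual types from this project), each variant can be its own interface with exactly one required field, combined into a union. The interface names and the `describe` helper below are assumptions for illustration; the guard names follow the ones used in the deserialisation function, and only a few variants are shown.

```typescript
// Hypothetical hand-crafted variants: each one requires exactly one field.
interface ValueString { stringValue: string }
interface ValueInteger { integerValue: string } // int64 travels as a string
interface ValueBoolean { booleanValue: boolean }
interface ValueArray { arrayValue: { values: Value[] } }

// The union replaces a single interface where every field is optional.
type Value = ValueString | ValueInteger | ValueBoolean | ValueArray;

// Type guards narrow the union by checking which field is present.
function isValueString(value: Value): value is ValueString {
  return "stringValue" in value;
}
function isValueInteger(value: Value): value is ValueInteger {
  return "integerValue" in value;
}
function isValueBoolean(value: Value): value is ValueBoolean {
  return "booleanValue" in value;
}
function isValueArray(value: Value): value is ValueArray {
  return "arrayValue" in value;
}

// Illustrative helper: after each guard the compiler knows the exact variant.
function describe(value: Value): string {
  if (isValueString(value)) return `string(${value.stringValue})`;
  if (isValueInteger(value)) return `integer(${value.integerValue})`;
  if (isValueBoolean(value)) return `boolean(${value.booleanValue})`;
  return `array(${value.arrayValue.values.map(describe).join(", ")})`;
}
```

Because every variant has a required discriminating field, editors can offer completion for exactly the field that exists after a guard has run.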
This provides a much more user-friendly set of types that makes it really easy to write type guards and gives a much better code completion experience. It also makes it a lot easier to write a single-pass deserialisation from the higher-order Datastore type to a JavaScript type (because the deserialized value isn't really usable/friendly in JavaScript), so I have this deserialisation function:
// Converts a Datastore Value into a plain JavaScript value in a single pass.
function datastoreValueToValue(value: types.Value): unknown {
  if (isValueArray(value)) {
    // Arrays are converted recursively, element by element.
    return value.arrayValue.values.map(datastoreValueToValue);
  }
  if (isValueBlob(value)) {
    // Blobs arrive base64-encoded on the wire.
    return base64.decode(value.blobValue);
  }
  if (isValueBoolean(value)) {
    return value.booleanValue;
  }
  if (isValueDouble(value)) {
    return value.doubleValue;
  }
  if (isValueEntity(value)) {
    return entityToObject(value.entityValue);
  }
  if (isValueGeoPoint(value)) {
    return value.geoPointValue;
  }
  if (isValueInteger(value)) {
    // Integers arrive as strings on the wire.
    return stringAsInteger(value.integerValue);
  }
  if (isValueKey(value)) {
    return value.keyValue;
  }
  if (isValueNull(value)) {
    return null;
  }
  if (isValueString(value)) {
    return value.stringValue;
  }
  if (isValueTimestamp(value)) {
    return new Date(value.timestampValue);
  }
  // Fail loudly rather than silently returning undefined for an unknown variant.
  throw new TypeError("Unsupported Datastore value");
}
Also, when trying to serialise JavaScript objects to Datastore values, there is all sorts of validation logic that is not expressible in the schema. For example, for a stringValue:
When exclude_from_indexes is false (it is indexed), may have at most 1500 bytes. Otherwise, may be set to at most 1,000,000 bytes.
These are all things that make sense to handle in the abstraction before sending the value over the wire and getting a rejection from the API.
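A minimal sketch of such a pre-flight check for the stringValue limits quoted above. The function and constant names are assumptions for illustration, not part of any generated or official API; the size is measured in UTF-8 bytes using the standard `TextEncoder`.

```typescript
// Limits taken from the stringValue documentation quoted above.
const MAX_INDEXED_STRING_BYTES = 1500;
const MAX_UNINDEXED_STRING_BYTES = 1_000_000;

// Hypothetical helper: throws before the request is sent if the string is too large.
function validateStringValue(value: string, excludeFromIndexes = false): void {
  // The limits are in bytes, so measure the UTF-8 encoding, not value.length.
  const byteLength = new TextEncoder().encode(value).length;
  const limit = excludeFromIndexes
    ? MAX_UNINDEXED_STRING_BYTES
    : MAX_INDEXED_STRING_BYTES;
  if (byteLength > limit) {
    throw new RangeError(
      `stringValue is ${byteLength} bytes, which exceeds the limit of ${limit} bytes`
    );
  }
}
```

Counting bytes rather than characters matters here: a string of 751 two-byte characters is within 1500 characters but over 1500 bytes.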
In addition, the codegen generates a Datastore class whose APIs are not the most usable. For example, the "runQuery" API:
declare class Datastore {
  // Generated signature: the project ID has to be passed on every call.
  projectsRunQuery(projectId: string, req: RunQueryRequest): Promise<RunQueryResponse>;
}
The project_id is part of the service account JSON and is tied to the instance of the Datastore, none of which can be expressed in the schema. Also, the RunQueryRequest is a composite object that doesn't really make sense from a usage perspective, so this is what the hand-crafted version looks like:
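As an illustration of the general shape only (not the hand-crafted code from this project), a thin wrapper can bind the project ID once at construction and build the composite request internally. Every name here (`ProjectDatastore`, `GeneratedDatastore`, the simplified request and response shapes) is an assumption made for the sketch.

```typescript
// Simplified stand-ins for the generated types (assumed shapes).
interface Query { kind: { name: string }[]; limit?: number }
interface RunQueryRequest { query?: Query }
interface RunQueryResponse { batch?: { entityResults?: unknown[] } }

// Stand-in for the generated client shown above.
interface GeneratedDatastore {
  projectsRunQuery(projectId: string, req: RunQueryRequest): Promise<RunQueryResponse>;
}

// Hypothetical wrapper: the project ID is fixed per instance.
class ProjectDatastore {
  private readonly client: GeneratedDatastore;
  private readonly projectId: string;

  constructor(client: GeneratedDatastore, projectId: string) {
    this.client = client;
    this.projectId = projectId;
  }

  // Callers pass only what they care about; the request object is built here.
  async runQuery(kind: string, limit?: number): Promise<unknown[]> {
    const query: Query = { kind: [{ name: kind }] };
    if (limit !== undefined) {
      query.limit = limit;
    }
    const res = await this.client.projectsRunQuery(this.projectId, { query });
    return res.batch?.entityResults ?? [];
  }
}
```

In practice the project ID would come from the service account JSON when the instance is created, so call sites never have to repeat it.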