Open dennwc opened 6 years ago
More context: #824
How do you like CBOR as serialisation format? It has libraries for JavaScript, Go and Python, and its data model is a superset of JSON's.
Another option is to use Gremlin's Bytecode format. Here is the implementation of the Gremlin Bytecode in JavaScript: https://github.com/apache/tinkerpop/blob/75b190665c0689e95847b0f9def145da172e1f9d/gremlin-javascript/src/main/javascript/gremlin-javascript/lib/process/graph-traversal.js Basically, it accumulates the steps into a structure and then uses GraphSON to submit it. GraphSON is very similar to JSON-LD, so we can do something similar.
Workplan:
- Move Shape to a different module (shapes.go?) so it is clear which shapes we have.
- Add a Name() string method to the Shape interface to explicitly get the name of the shape.
- Add a parse(json_ld) Shape function that takes a JSON-LD structure and returns a Shape object.
- Make the /query API accept Content-Type: application/ld+json.
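A rough sketch of how the Name() and parse pieces of the workplan could fit together, assuming a simplified `Shape` interface and a hypothetical `"type"` field in the payload. The real interface has more methods, and `AllNodes`, `Limit`, `registry` and `ParseShape` here are illustrative stand-ins rather than existing code.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Shape is a stand-in for the real interface; only the proposed Name method is shown.
type Shape interface {
	Name() string
}

// AllNodes and Limit are simplified example shapes.
type AllNodes struct{}

func (AllNodes) Name() string { return "AllNodes" }

type Limit struct {
	Limit int64 `json:"limit"`
}

func (Limit) Name() string { return "Limit" }

// registry maps a shape name to a function returning an empty shape to decode into.
var registry = map[string]func() Shape{
	"AllNodes": func() Shape { return &AllNodes{} },
	"Limit":    func() Shape { return &Limit{} },
}

// ParseShape reads the "type" field, looks up the shape by name,
// and decodes the remaining fields into it.
func ParseShape(data []byte) (Shape, error) {
	var head struct {
		Type string `json:"type"`
	}
	if err := json.Unmarshal(data, &head); err != nil {
		return nil, err
	}
	newShape, ok := registry[head.Type]
	if !ok {
		return nil, fmt.Errorf("unknown shape type %q", head.Type)
	}
	s := newShape()
	if err := json.Unmarshal(data, s); err != nil {
		return nil, err
	}
	return s, nil
}

func main() {
	s, err := ParseShape([]byte(`{"type":"Limit","limit":10}`))
	if err != nil {
		panic(err)
	}
	fmt.Println(s.Name(), s.(*Limit).Limit)
}
```

Keeping the name-to-constructor mapping in one place also gives a single list of the shapes that can cross the wire.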
I know how to do 2, 3, 4, 5 and 6. 1 is a little more tricky without previous knowledge.
How do you like CBOR as serialisation format?
Yeah, I was thinking about using either CBOR or Protobuf. Both have upsides and downsides.
Another option is to use Gremlin's Bytecode format.
I think this will limit us in the long run. We can support that separately if we want to be compatible with Gremlin clients, though.
Workplan:
Is already done, see the shape package. This was my goal for the last few cycles.
That's an easy part :) However the implementation you describe is targeting CBOR/JSON-LD specifically. Protobufs, on the other hand, may be more efficient when there is a limited set of allowed messages, which is the case with Shapes. MarshalProto/UnmarshalProto may be added to the interface to support it.
This will work for simple cases, but not for advanced ones. Imagine a shape holding a pointer to a function. Or a reference to an external data source. I would propose to reject unknown shapes for serialization (return error if attempted). Also, if we want to target JSON-LD specifically here, we should consider dumping our own schema first to see if it works for us or not.
Again, let's not jump to JSON-LD for queries just yet :) It's a great option in the long run, since we can store them in the DB that way (see #669), but the server needs to do a lot of work to interpret such a query if it comes from HTTP. The JSON-LD spec is pretty involved in terms of possible values, forms, etc. Again, I would propose to accept Protobufs with a strict schema first to see what works and what doesn't, and then design a solution with JSON-LD in mind as the next step, or as a step toward #669.
Hm... Isn't strict JSON unmarshalling a solved problem in Go? https://gobyexample.com/json (I really don't know, as I'm fairly new to the language.)
I don't feel very comfortable to add Proto messages as our data types story is pretty complex already and we would need to add it as a dependency for clients, while JSON-LD will not require additional tooling.
On second thought, we can wait with CBOR and just start with bare JSON, then add a CBOR option later, as they have compatible datatypes.
Compatibility with Gremlin shouldn't be a goal right now. I just think we can learn from their structures.
In a trade-off between simplicity and performance for the messaging format, I personally prefer simplicity, as the messages are rather small and the performance differences are minor.
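On the strict unmarshalling question: the standard library's `encoding/json` ignores unknown fields by default, but `Decoder.DisallowUnknownFields` turns them into errors. A minimal sketch, where `LimitMsg` and `decodeStrict` are hypothetical names used only for illustration:

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// LimitMsg is a made-up message type with a single allowed field.
type LimitMsg struct {
	Limit int64 `json:"limit"`
}

// decodeStrict decodes data into v and fails on fields that v does not declare.
// It does not check for trailing data after the first JSON value.
func decodeStrict(data []byte, v interface{}) error {
	dec := json.NewDecoder(bytes.NewReader(data))
	dec.DisallowUnknownFields()
	return dec.Decode(v)
}

func main() {
	var ok LimitMsg
	fmt.Println(decodeStrict([]byte(`{"limit":5}`), &ok), ok.Limit)

	var bad LimitMsg
	fmt.Println(decodeStrict([]byte(`{"limit":5,"offset":1}`), &bad))
}
```

This only covers unknown keys; required fields and value ranges would still need separate validation.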
JSON-LD will not require additional tooling
For the client side, yes, since it will just emit it. But for server side it's a different story. If you compare it to regular JSON, the number of steps necessary to interpret it may make it sub-optimal as a primary format for the queries.
we can wait with CBOR
Adding CBOR is easy if we already support JSON. It also gives a good performance boost in terms of decoding, so I would prefer it in the long run. But yes, we can start with JSON for now.
I don't feel very comfortable to add Proto messages as our data types story is pretty complex already
I would say Protos usually simplify the datatypes story a lot, since you get everything auto-generated. Plus they are easier to interpret and decode because the strict schema is baked into the format itself, as opposed to JSON/CBOR, which are schema-less by design.
But yes, for JS it's needlessly painful for some reason, which is really a shame. I usually end up writing a proto decoder in JS by hand by translating the generated Go code. All of this is just to avoid dependencies :( So we will have to support JSON anyway, I guess. Hope the story with CBOR is better than with Protos at least.
I will look into server-side support for things like OpenAPI to see if it will be useful in this case.
By JSON-LD I was primarily talking about RDF terms representation (as we’ve done in Gizmo). Everything else is just regular JSON
To generate JSON validation we can use something like https://github.com/alecthomas/jsonschema
By JSON-LD I was primarily talking about RDF terms representation (as we’ve done in Gizmo). Everything else is just regular JSON
This certainly works for serialization, but not for the deserialization, as mentioned above.
To generate JSON validation we can use something like https://github.com/alecthomas/jsonschema
Was thinking about using this exact library as well :)
Make Shapes serializable. This will allow passing them over the wire as continuation tokens, distributing queries, and even implementing virtual predicates.