Closed aaronc closed 3 years ago
For storing metadata I would prefer not to use JSON. As discussed elsewhere, we will need to limit the metadata space anyway. I would use FlatBuffers or Cap’n Proto.
Why not just stick with protobuf then?
Yes, probably even better. My point is to have a preferred tool that we use in documents, tutorials, etc. This will help drive adoption.
At the end we will need a minimal schema to make any sense of the data in PostgreSQL. So for the PoC: let's store it as protobuf; the emitter will use the same schema to decode the data and send it to Postgres. In the future we can:
Okay, protobuf, JSON or whatever, I think it doesn't matter too much for the PoC. My reasoning around JSON is that GeoJSON is pretty well-defined and that's the bulk of the payload. If we try to do protobuf we'll end up needing to choose some other geo-encoding, maybe EWKB. I don't really care too much at this stage. I'm mainly trying to see whether we can get something up and running that shows the credits we'll be issuing soon on the devnet.
In the longer term, I've always thought we should strongly aim for a format that aligns with the RDF data model. Ideally that would be some binary serialization on-chain and not just JSON-LD text. I actually coded a PoC of this data model and serialization format a year or two ago but it will take more work for me to feel like it's complete so right now I'm just aiming to get something we can play with...
In Postgres we store the data in a proper geospatial type, not as GeoJSON. There are functions to convert GeoJSON to the Postgres data format.
Instead of polygon we can have a location: GeoJSON field, which can be a point, polygon or a box. When processing it we can save it to a proper table column (point, polygon ....); with that we will be able to do proper queries.
Postgres uses EWKB which we can consider as well.
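To make the "save it to a proper table column" idea concrete, here is a minimal sketch in Python. It only handles Point and Polygon, and the column names `location_point` and `location_polygon` are made up for illustration. In practice the conversion could also happen server-side with the PostGIS function `ST_GeomFromGeoJSON`.

```python
from typing import Tuple

# Hypothetical column names, one per supported geometry type.
COLUMN_FOR_TYPE = {
    "Point": "location_point",
    "Polygon": "location_polygon",
}


def geojson_to_column_and_wkt(geometry: dict) -> Tuple[str, str]:
    """Pick a target column and build WKT text for a GeoJSON geometry."""
    kind = geometry["type"]
    coords = geometry["coordinates"]
    if kind == "Point":
        x, y = coords[:2]
        wkt = f"POINT({x} {y})"
    elif kind == "Polygon":
        # Each ring is a list of positions; ignore any altitude values.
        rings = ", ".join(
            "(" + ", ".join(f"{x} {y}" for x, y, *_ in ring) + ")"
            for ring in coords
        )
        wkt = f"POLYGON({rings})"
    else:
        raise ValueError(f"unsupported geometry type: {kind}")
    return COLUMN_FOR_TYPE[kind], wkt
```

The WKT string can then be passed as a query parameter to whatever insert statement the emitter uses.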
Generally we will only be dealing with polygons or multipolygons. I can't think of a use case where we'd use points or lines for credits.
Yeah I agree polygons or multi polygons represent the main use cases afaik.
In the future we can:
- extend the credit_class and batch_class schema so we will have only one type
@robert-zaremba I'm not sure I get what you mean by that?
@aaronc I was thinking about a point as well. My motivation was to experiment with 2 data types, and polygon and point are 2 basic primitives to represent a location.
BTW: we don't have multi polygons. We can represent it as an array of polygons.
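As a rough sketch of the "array of polygons" idea: a GeoJSON MultiPolygon's `coordinates` member is itself an array of Polygon coordinate arrays, so a list of polygons can be wrapped into one without touching the coordinates. The function name below is an assumption, not something from the ecocredit module.

```python
from typing import Dict, List


def polygons_to_multipolygon(polygons: List[Dict]) -> Dict:
    """Wrap a list of GeoJSON Polygon geometries into one MultiPolygon."""
    for polygon in polygons:
        if polygon.get("type") != "Polygon":
            raise ValueError("every element must be a GeoJSON Polygon")
    return {
        "type": "MultiPolygon",
        # One entry per polygon; each entry is that polygon's list of rings.
        "coordinates": [polygon["coordinates"] for polygon in polygons],
    }
```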
Self note for EWKT: https://docs.snowflake.com/en/sql-reference/data-types-geospatial.html
@blushi today all this data is schemaless, stored in a byte array. We just say: create an object with a name and a type, and serialize it using protobuf. Then, when processing, we can try & catch with a few formats (protobuf, JSON, msgpack ...) in case a user didn't follow the instructions. But once we clarify the base required arguments, it will be better to add them to a message type and have a proper schema.
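The try & catch approach could look roughly like the sketch below. Only the JSON decoder is implemented, since it is in the Python standard library; protobuf and msgpack decoders would be appended to the `DECODERS` list in the same way. All names here are illustrative assumptions.

```python
import json
from typing import Any, Callable, List, Tuple


def _decode_json(raw: bytes) -> Any:
    return json.loads(raw.decode("utf-8"))


# Decoders are tried in order; add protobuf or msgpack entries here.
DECODERS: List[Tuple[str, Callable[[bytes], Any]]] = [
    ("json", _decode_json),
]


def decode_metadata(raw: bytes) -> Tuple[str, Any]:
    """Return (format_name, decoded_value), or ("unknown", raw) on failure."""
    for name, decoder in DECODERS:
        try:
            return name, decoder(raw)
        except (ValueError, TypeError):
            # Covers UnicodeDecodeError and json.JSONDecodeError as well.
            continue
    return "unknown", raw
```

Falling back to `("unknown", raw)` keeps the emitter from crashing on payloads that follow none of the expected formats.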
@aaronc, so, for the PoC is there any preference for a geolocation format (GeoJSON, EWKT, ...) in the request message (MsgCreateBatchRequest)?
I would say GeoJSON is best for the PoC because it has better client-side support. @blushi ?
When we settle on a standardized schema we probably want some efficient binary representation like EWKB or a custom protobuf type. I forget whether EWKB is well supported in JavaScript.
I'd say so too, it can be used as is with Mapbox for instance which is what we have been using so far for the registry.
The current ecocredit design specifies metadata as simply bytes, allowing us to iterate on the actual schema off-chain. I propose the following simple JSON structures as simple proof-of-concept metadata.
For credit classes:
For batches:
From what I understand this would give us some bare minimum data. Then in Postgres we could use PostGIS to index batches based on polygon and dates (/cc @robert-zaremba)
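As an illustration of checking such metadata before indexing it, the sketch below decodes JSON bytes and validates a date range and a GeoJSON polygon. The field names `start_date`, `end_date` and `polygon` are invented for this example and are not the structures proposed in this issue.

```python
import json
from datetime import date


def validate_batch_metadata(raw: bytes) -> dict:
    """Decode JSON metadata bytes and run minimal sanity checks."""
    meta = json.loads(raw)
    # Hypothetical field names; ISO 8601 calendar dates are assumed.
    start = date.fromisoformat(meta["start_date"])
    end = date.fromisoformat(meta["end_date"])
    if end < start:
        raise ValueError("end_date is before start_date")
    geometry = meta["polygon"]
    if geometry.get("type") != "Polygon":
        raise ValueError("polygon must be a GeoJSON Polygon")
    for ring in geometry["coordinates"]:
        # GeoJSON linear rings are closed and have at least 4 positions.
        if len(ring) < 4 or ring[0] != ring[-1]:
            raise ValueError("polygon ring is not a closed linear ring")
    return meta
```

Metadata that passes these checks could then be handed to the emitter, with the polygon and dates going into indexed PostGIS columns.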
Any upgrades you could suggest @blushi ?