mediachain / concat

Mediachain daemons
MIT License

Statement Schemas #76

Open vyzo opened 7 years ago

vyzo commented 7 years ago

It's becoming apparent that there is a need to track the schema of the objects contained in a statement, as part of the statement envelope -- aleph/44

Adding support in the statement proto is relatively straightforward: a new field can be added without invalidating existing statements.

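Supporting indexing and MCQL queries using the schema can be implemented in two ways:

1. Extend the Envelope table with a schema column.
2. Create a separate table and index that map statements to their schema.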

The first option is more intrusive, as it will require an alteration of the Envelope table. It also has complications with NULL semantics.

The second option is preferable: it only requires creating a new table and index (a trivial migration), with MCQL support similar to how we query on wki, perhaps with criteria extended to support prefix matching.
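To make the second option concrete, here is a minimal Go sketch of a side index with prefix matching. It is an illustration only, not concat's code: `SchemaIndex`, `Put`, `Match`, the trailing `*` pattern syntax, and the example schema names are all hypothetical, and an in-memory map stands in for the new SQL table and index.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// SchemaIndex is a hypothetical in-memory stand-in for the proposed side
// table: one (statement ID, schema) row per statement that declares a schema.
// Statements without a schema simply have no row.
type SchemaIndex struct {
	rows map[string]string // statement ID -> schema identifier
}

// NewSchemaIndex returns an empty index.
func NewSchemaIndex() *SchemaIndex {
	return &SchemaIndex{rows: make(map[string]string)}
}

// Put records the schema of a statement.
func (ix *SchemaIndex) Put(stmtID, schema string) {
	ix.rows[stmtID] = schema
}

// Match returns the sorted IDs of statements whose schema equals pattern,
// or starts with it when pattern ends in "*" (assumed prefix syntax).
func (ix *SchemaIndex) Match(pattern string) []string {
	isPrefix := strings.HasSuffix(pattern, "*")
	prefix := strings.TrimSuffix(pattern, "*")
	var ids []string
	for id, schema := range ix.rows {
		if (isPrefix && strings.HasPrefix(schema, prefix)) || (!isPrefix && schema == pattern) {
			ids = append(ids, id)
		}
	}
	sort.Strings(ids)
	return ids
}

func main() {
	ix := NewSchemaIndex()
	ix.Put("stmt:1", "example.image/v1")
	ix.Put("stmt:2", "example.image/v2")
	ix.Put("stmt:3", "example.audio/v1")

	fmt.Println(ix.Match("example.image/*"))  // prefix query
	fmt.Println(ix.Match("example.audio/v1")) // exact query
}
```

Because a statement with no schema has no row in the side index, this design never has to represent "no schema" as a NULL value.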

vyzo commented 7 years ago

I think this has gotten trickier: there are now nodes operated by the community in the wild, and it's not reasonable to ask them to re-ingest their data to get schemas.

vyzo commented 7 years ago

So we need a migration path that will migrate existing statements to have schemas.

One approach is to iterate through the statement db and republish statements with the schema inferred from the dependency and the schema wki in mediachain.schemas. This should only happen for locally published statements. Statements merged from other nodes should not be republished; instead, they should be remerged (to get the new statements) once the publishers have migrated.
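A rough Go sketch of that migration pass, under loud assumptions: `Statement`, `inferSchema`, and `Migrate` are hypothetical names, the `known` set stands in for the schema WKIs listed in mediachain.schemas, and "inferred from the dependency" is read as "the first dependency that is a known schema WKI". A real republish would also mint a new statement ID, which is omitted here.

```go
package main

import "fmt"

// Statement is a simplified, hypothetical stand-in for a stored statement.
type Statement struct {
	ID        string
	Publisher string   // peer that published the statement
	Deps      []string // dependencies referenced by the statement body
	Schema    string   // empty for statements that predate schemas
}

// inferSchema returns the first dependency that is a known schema WKI.
// known stands in for the schema WKIs in the mediachain.schemas namespace.
func inferSchema(s Statement, known map[string]bool) (string, bool) {
	for _, dep := range s.Deps {
		if known[dep] {
			return dep, true
		}
	}
	return "", false
}

// Migrate walks the statement db once. Schemaless statements published by
// self are returned with their inferred schema set, ready to republish.
// Schemaless statements from other publishers are only listed by ID, so
// they can be remerged after those publishers have migrated.
func Migrate(db []Statement, self string, known map[string]bool) (republish []Statement, remerge []string) {
	for _, s := range db {
		if s.Schema != "" {
			continue // already carries a schema
		}
		if s.Publisher != self {
			remerge = append(remerge, s.ID)
			continue
		}
		if schema, ok := inferSchema(s, known); ok {
			s.Schema = schema // s is a copy; the stored statement is untouched
			republish = append(republish, s)
		}
	}
	return republish, remerge
}

func main() {
	known := map[string]bool{"schema:image": true}
	db := []Statement{
		{ID: "local:1", Publisher: "local", Deps: []string{"img:42", "schema:image"}},
		{ID: "remote:1", Publisher: "remote", Deps: []string{"schema:image"}},
		{ID: "local:2", Publisher: "local", Deps: []string{"schema:image"}, Schema: "schema:image"},
	}
	republish, remerge := Migrate(db, "local", known)
	for _, s := range republish {
		fmt.Printf("republish %s with schema %s\n", s.ID, s.Schema)
	}
	fmt.Println("remerge later:", remerge)
}
```

Local statements whose schema cannot be inferred are left alone in this sketch; how to handle them is an open question in the thread.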

We also need a mechanism for cleaning up those old statements, so perhaps we need to take the Envelope table extension approach and treat schema as an envelope selector in MCQL -- cf NULL semantics.

parkan commented 7 years ago

Hmm, vyzo, let's punt on this for 1.2 and think about a backfill/migration mechanism.

vyzo commented 7 years ago

agreed.