AntidoteDB / antidote

A planet scale, highly available, transactional database built on CRDT technology
https://www.antidotedb.eu
Apache License 2.0

Mechanism for associating metadata attributes to objects #314

Open dvasilas opened 7 years ago

dvasilas commented 7 years ago

Existing use cases for metadata attributes:

It makes sense to represent metadata attributes as CRDTs and store them in the datastore.

Design options:

Option 1

Store each metadata attribute as a separate object, associated with the data object through their keys: The key format can be object_key separator attribute_name where:

The object itself will be stored under the key object_name.

As an example, the attribute Type of an object named my_counter will be accessed under the key my_counter/Type (assuming separator=/).

An additional attribute for each object can be maintained by the system, listing all the attribute_names associated with the object (implemented as a set-CRDT).

Note: This design restricts the keyspace visible to the user, as only part of the key will be used for the object name. Object names must not contain the character used as separator.
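The key layout of Option 1 can be sketched as follows. This is only an illustration in Python: a plain dict stands in for the datastore, a plain set stands in for the set-CRDT, and the names `attribute_key`, `set_attribute` and the index name `_attrs` are hypothetical, not part of Antidote.

```python
SEPARATOR = "/"        # assumed separator, as in the my_counter/Type example
ATTR_INDEX = "_attrs"  # hypothetical name of the system-maintained attribute-name set


def attribute_key(object_name: str, attribute_name: str) -> str:
    """Build the key under which one metadata attribute is stored."""
    if SEPARATOR in object_name:
        # Option 1 restricts the keyspace: object names cannot contain the separator.
        raise ValueError(f"object name must not contain {SEPARATOR!r}")
    return f"{object_name}{SEPARATOR}{attribute_name}"


def set_attribute(store: dict, object_name: str, attribute_name: str, value) -> None:
    """Write an attribute and record its name in the per-object index."""
    store[attribute_key(object_name, attribute_name)] = value
    # A plain Python set stands in for the set-CRDT here.
    index_key = attribute_key(object_name, ATTR_INDEX)
    store.setdefault(index_key, set()).add(attribute_name)
```

For example, `set_attribute(store, "my_counter", "Type", "counter")` writes the value under `my_counter/Type` and adds `Type` to the set stored under `my_counter/_attrs`.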

Option 2

Couple each object with its metadata attributes as a single object: Each object will be stored as a map-CRDT under the key object_name, containing both metadata attributes under map keys corresponding to the attribute_names and the data object under a special map key.
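A minimal sketch of Option 2, assuming a Python dict as a stand-in for the map-CRDT. The reserved map key `_data` and the helper names are hypothetical, chosen only for illustration.

```python
DATA_KEY = "_data"  # hypothetical reserved map key holding the data object


def new_object(value) -> dict:
    """Wrap a data value in a map; a dict stands in for the map-CRDT."""
    return {DATA_KEY: value}


def set_attribute(obj: dict, attribute_name: str, value) -> None:
    """Store a metadata attribute next to the data entry."""
    if attribute_name == DATA_KEY:
        raise ValueError(f"{DATA_KEY!r} is reserved for the data object")
    obj[attribute_name] = value


def attributes(obj: dict) -> dict:
    """Return only the metadata attributes, without the data entry."""
    return {k: v for k, v in obj.items() if k != DATA_KEY}
```

With this layout a single read of object_name returns the data and all attributes together, at the cost of reserving one map key.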

Option 3

For each object there exists an additional metadata object containing all its metadata attributes: Each object will be stored under the key object_name. Its metadata attributes will be stored as a map-CRDT under a key associated with the object_name, such as object_name/md or _object_name.
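Option 3 could look roughly like this, again with a dict standing in for the datastore and for the map-CRDT. The sketch uses the object_name/md convention; the helper names are hypothetical.

```python
MD_SUFFIX = "/md"  # one of the two key conventions mentioned above


def metadata_key(object_name: str) -> str:
    """Derive the key of the metadata object from the object name."""
    return object_name + MD_SUFFIX


def set_attribute(store: dict, object_name: str, attribute_name: str, value) -> None:
    """Add one attribute to the object's metadata map, creating it if needed."""
    store.setdefault(metadata_key(object_name), {})[attribute_name] = value


def read_with_metadata(store: dict, object_name: str):
    """Return (data, metadata); metadata is an empty map if none was written."""
    return store.get(object_name), store.get(metadata_key(object_name), {})
```

The data object keeps its original type and key, and the metadata can be read or skipped independently.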


Any of these designs can be implemented at the protocol buffer interface level. The interface would be extended with:


Note: To ensure that objects and their metadata attributes are mapped to the same server, the sharding mechanism can be modified to calculate shards based on a prefix of the key, omitting suffixes used for storing metadata attributes. That way, the objects my_counter and my_counter/Type will be mapped to the same server.
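The prefix-based sharding idea can be illustrated as below. This is not Antidote's actual partitioning function; the hash, the shard count and the name `shard_for` are assumptions made for the sketch.

```python
import hashlib

SEPARATOR = "/"  # assumed separator between object name and metadata suffix
NUM_SHARDS = 8   # hypothetical number of shards


def shard_for(key: str, num_shards: int = NUM_SHARDS) -> int:
    """Pick a shard from the key prefix only, ignoring any metadata suffix."""
    prefix = key.split(SEPARATOR, 1)[0]
    digest = hashlib.sha256(prefix.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards
```

Because only the part before the first separator is hashed, `shard_for("my_counter")` and `shard_for("my_counter/Type")` always return the same shard.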

I propose implementing Option 3 and I can work on it.

marc-shapiro commented 7 years ago

On 1 Sept. 2017, at 17:49, dimitriosvasilas notifications@github.com wrote:

[…]

As an example, the attribute Type of an object Key1 would be accessed under the key Key1/Type, while the object itself would be accessed under the key Key1.

Actually, I suggest the object itself be accessed under the key Key1:concrete_type, which ensures the reader/writer knows the actual type of the object (and so Key1/Type contains the value concrete_type).

Make sure the store API does not allow direct access to the sub-keys, i.e. the API only accepts Key1 and adds the sub-keys itself.

Marc

dvasilas commented 7 years ago

@marc-shapiro it appears that this functionality is already in place. Consider the following test case performing read and write operations:

    ...
    Key1 = clocksi_test6_key1,
    %% Two bound objects that share a key and bucket but declare different CRDT types.
    BoundObj1 = {Key1, antidote_crdt_counter, ?BUCKET},
    BoundObj2 = {Key1, antidote_crdt_mvreg, ?BUCKET},

    {ok, TxId} = rpc:call(FirstNode, cure, start_transaction, [ignore, []]),

    %% Write and read the key as a counter.
    ok = rpc:call(FirstNode, cure, update_objects, [[{BoundObj1, increment, 1}], TxId]),
    {ok, _Res1} = rpc:call(FirstNode, cure, read_objects, [[BoundObj1], TxId]),

    %% Write and read the same key as a multi-value register.
    %% Distinct result variables avoid matching against an already bound _Res.
    ok = rpc:call(FirstNode, cure, update_objects, [[{BoundObj2, assign, <<"a">>}], TxId]),
    {ok, _Res2} = rpc:call(FirstNode, cure, read_objects, [[BoundObj2], TxId]),

    End = rpc:call(FirstNode, cure, commit_transaction, [TxId]),
    ...

The type of an object needs to be specified for reads and writes.

In fact, the execution results in an error when I update an object using one type and then try accessing the same key with a different type.

=== Reason: no match of right hand side value 
                 {badrpc,
                  {'EXIT',
                   {{function_clause,
                     [{antidote_crdt_mvreg,'-downstream/2-lc$^0/1-1-',
...

However, encoding the type information in the object key does make it easy to check whether an operation uses the correct object type.
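A check of that kind might look like the following sketch, which assumes the Key1:concrete_type format suggested above. The function names are hypothetical and the code is an illustration, not Antidote's behaviour.

```python
TYPE_SEPARATOR = ":"  # assumed, following the Key1:concrete_type suggestion


def typed_key(key: str, crdt_type: str) -> str:
    """Append the concrete CRDT type to the user-visible key."""
    return f"{key}{TYPE_SEPARATOR}{crdt_type}"


def check_type(stored_key: str, requested_type: str) -> None:
    """Raise if an operation names a different type than the key encodes."""
    _name, sep, stored_type = stored_key.rpartition(TYPE_SEPARATOR)
    if not sep:
        raise ValueError(f"key {stored_key!r} carries no type suffix")
    if stored_type != requested_type:
        raise TypeError(
            f"key holds {stored_type!r}, operation expects {requested_type!r}"
        )
```

Applied to the test case above, a key written as `typed_key("clocksi_test6_key1", "antidote_crdt_counter")` would be rejected up front when accessed as `antidote_crdt_mvreg`, instead of failing inside the CRDT code.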

peterzeller commented 7 years ago

It seems to me that the current proposal could be implemented in client applications without changing Antidote itself. Are there plans to later use these attributes for things like search / secondary indexes, or other use cases that require implementing this directly in Antidote?

dvasilas commented 7 years ago

In fact, this issue came up because I intend to implement secondary indexes on these attributes, and at the same time other use cases are using security/access control attributes.

It might make sense for these efforts to use a unified interface provided by Antidote for managing these attributes, rather than re-implementing similar mechanisms in different ways.

cmeiklejohn commented 7 years ago

Both Lasp and Riak (2i, Yokozuna, Search) have extensions for storing per-object metadata, if you're interested in looking at how those operate.