rethinkdb / rethinkdb

The open-source database for the realtime web.
https://rethinkdb.com

Proposal: db and table config templates #5723

Open marshall007 opened 8 years ago

marshall007 commented 8 years ago

Elasticsearch has the concept of index templates, which let you automatically apply configuration when new indices are created. I think it would be really nice to have something similar in RethinkDB.

The templates could be stored in a new system table, perhaps config_templates. The following example would probably be the most common use case: ensuring that all tables in your cluster are created with the same number of shards and replicas by default:

{
  "db": "*",
  "table": "*",
  "settings": {
    "shards": 3,
    "replicas": 3
  }
}

The next example illustrates how this could be useful for dynamically created tables. In this case, we have tables acting as queues. Each time we create a new table in job_queues, it will automatically have a standard configuration applied:

{
  "db": "job_queues",
  "table": "*",
  "settings": {
    "primary_key": "job_id"
  },
  "indexes": [{
    "index": "status",
    "function": r.row('status'),
    "multi": false,
    "geo": false
  }],
  "grant": [{
    "username": "job_handler",
    "read": true,
    "write": true
  }]
}

Note that it's possible for multiple config templates to be applied when a table is created. Order should be deterministic, probably based on the specificity of db/table names in the template. Options passed directly into dbCreate and tableCreate would have the highest priority.
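
To make the precedence rule concrete, here is a minimal sketch of how matching templates might be merged. Nothing here is an existing RethinkDB API: `matches`, `specificity`, and `resolveSettings` are hypothetical names, the weighting (a named db counts for more than a named table) is an assumption, and the `replicas: 2` override in the second template is added purely for illustration.

```javascript
// Hypothetical sketch of template resolution; not an existing RethinkDB API.
// Templates as they might be stored in the proposed config_templates table.
const templates = [
  { db: "*", table: "*", settings: { shards: 3, replicas: 3 } },
  // The replicas override below is illustrative only.
  { db: "job_queues", table: "*", settings: { primary_key: "job_id", replicas: 2 } },
];

// A template applies when both its db and table fields match or are "*".
function matches(template, db, table) {
  return (template.db === "*" || template.db === db) &&
         (template.table === "*" || template.table === table);
}

// Assumed ordering: a named db outweighs a named table.
function specificity(template) {
  return (template.db === "*" ? 0 : 2) + (template.table === "*" ? 0 : 1);
}

// Merge least specific first so more specific templates overwrite earlier
// values; options passed explicitly to tableCreate are applied last and win.
function resolveSettings(allTemplates, db, table, explicit = {}) {
  const applicable = allTemplates
    .filter((t) => matches(t, db, table))
    .sort((a, b) => specificity(a) - specificity(b));
  return Object.assign({}, ...applicable.map((t) => t.settings), explicit);
}

console.log(resolveSettings(templates, "job_queues", "emails", { shards: 1 }));
// → { shards: 1, replicas: 2, primary_key: 'job_id' }
```

Sorting before merging keeps the result independent of the order in which templates were inserted, as long as no two matching templates share the same specificity.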

danielmewes commented 8 years ago

Thanks for writing up this proposal @marshall007 .

This would be a solution to https://github.com/rethinkdb/rethinkdb/issues/3335 .

I quite like this. I think the general priorities of conflicting rules should be the same as in the permissions table (i.e. like you say, the more specific entry wins).

I suggest that rather than having a field table: "*" as a wildcard, the table field is completely omitted from entries that apply to the whole database. This again would be consistent with the permissions table, as well as with the stats table.

danielmewes commented 8 years ago

We need to figure out a way for storing the index functions in this table, since we currently can't have functions as fields inside of a document. So that part would depend on https://github.com/rethinkdb/rethinkdb/issues/1863 .

Even if we don't allow specifying default indexes at first, I think this could be useful though.

marshall007 commented 8 years ago

@danielmewes #1863 is obviously the preferable, long-term solution. In the meantime, I was thinking we could just store the binary representation. When inserting into config_templates, the function field could accept either a valid secondary index expression or the binary representation, just like .indexCreate(). When retrieving rows from config_templates, though, you would always get the binary back, not an expression the drivers can deserialize.

This would keep us in line with the output of .indexStatus() and work essentially the same as how the Python admin scripts currently dump/restore secondary indexes.
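
For reference, the existing round trip looks roughly like the sketch below: the `function` field returned by `.indexStatus()` is an opaque binary value that `.indexCreate()` accepts in place of an expression. The helper name `copyIndex` and its parameters are hypothetical; `r` is assumed to be the RethinkDB JavaScript driver module and `conn` an open connection.

```javascript
// Sketch only: `r` is the RethinkDB JavaScript driver module and `conn` an
// open connection. copyIndex and its parameters are hypothetical names.
async function copyIndex(r, conn, fromTable, toTable, indexName) {
  // indexStatus returns one status object per requested index.
  const [status] = await r.table(fromTable).indexStatus(indexName).run(conn);
  // status.function is the opaque binary form of the index definition;
  // indexCreate accepts it instead of a ReQL expression.
  return r.table(toTable).indexCreate(indexName, status.function).run(conn);
}
```

A config_templates row could presumably carry the same binary value in its `function` field and hand it to `.indexCreate()` when a matching table is created.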

marshall007 commented 8 years ago

> I suggest that rather than having a field table: "*" as a wildcard, the table field is completely omitted from entries that apply to the whole database.

I mostly agree, for the sake of consistency with the permissions table. It will be a little odd to have things like indexes and settings.primary_key specified without a table designator, since those aren't applied at the database level. It's not that out of place, though, since technically shards/replicas are both table settings too.

Would you say the db field should also be omitted in the wildcard case?

Note that Elasticsearch supports suffix and prefix wildcards as well, which is very useful for things like time series and log rotation, where you suffix the table name with a timestamp. If we're going to support things like table: "logs_*", defaulting to "*" might be a little clearer. I don't have very strong opinions, though.
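
As a rough illustration of prefix and suffix wildcards, a pattern such as "logs_*" could be translated into an anchored regular expression. This is only a sketch: `patternToRegExp` and `matchesPattern` are hypothetical helpers, and treating "*" as "any run of characters" is an assumption about how such patterns would behave.

```javascript
// Hypothetical helpers; not part of RethinkDB or its drivers.
function patternToRegExp(pattern) {
  // Escape every regex metacharacter, including "*" itself...
  const escaped = pattern.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  // ...then turn each escaped "*" back into "match anything".
  return new RegExp("^" + escaped.replace(/\\\*/g, ".*") + "$");
}

function matchesPattern(pattern, name) {
  return patternToRegExp(pattern).test(name);
}

console.log(matchesPattern("logs_*", "logs_2016_05"));    // → true
console.log(matchesPattern("logs_*", "audit_logs"));      // → false
console.log(matchesPattern("*_archive", "jobs_archive")); // → true
```

With this scheme a bare "*" is just the degenerate pattern that matches every name, so an explicit "*" default and prefix/suffix wildcards share one code path.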