i2mint / py2store

Tools to create simple and consistent interfaces to complicated and varied data sources.
MIT License

Hypercubic keys and slicing #33

Open thorwhalen opened 4 years ago

thorwhalen commented 4 years ago

Here we have a multi-dimensional key and want to get "hyper-slices" by queries that fix the values of a subset of the keys.

For example, if keys are triples, the atomic key is a triple (a,b,c) with valid a, b, and c values, and store[a,b,c] would get us the value (i.e. data) under that key, if it exists. Hypercubic keys would also allow things like store[a,b], which would return a substore such that substore[x] == store[a,b,x] for any x where (a,b,x) is in store. Similarly, store[a] would give us a substore of all valid (a,x,y) keys.
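To make the prefix case concrete, here is a minimal sketch, assuming the underlying store is any Mapping whose keys are tuples. `PrefixStore` is a hypothetical name used for illustration, not an existing py2store class.

```python
from collections.abc import Mapping


class PrefixStore(Mapping):
    """Read-only view of the keys of `store` that start with `prefix`.

    Hypothetical sketch: assumes `store` is a Mapping with tuple keys.
    """

    def __init__(self, store, prefix=()):
        self._store = store
        self._prefix = tuple(prefix)

    def __getitem__(self, k):
        k = k if isinstance(k, tuple) else (k,)
        full = self._prefix + k
        if full in self._store:
            return self._store[full]  # atomic key: return the value itself
        sub = PrefixStore(self._store, full)
        if len(sub) == 0:
            raise KeyError(k)  # no key starts with this prefix
        return sub  # partial key: return a substore

    def __iter__(self):
        n = len(self._prefix)
        for key in self._store:
            if key[:n] == self._prefix:
                rest = key[n:]
                # A single remaining coordinate is yielded bare, not as a 1-tuple
                yield rest[0] if len(rest) == 1 else rest

    def __len__(self):
        return sum(1 for _ in self)


data = {("a", "b", "x"): 1, ("a", "b", "y"): 2, ("a", "c", "z"): 3, ("d", "e", "f"): 4}
store = PrefixStore(data)
substore = store["a", "b"]
print(substore["x"] == store["a", "b", "x"])  # True
```

Scanning all keys on every iteration is only acceptable for a sketch; a concrete persister (folders, indexed DB columns) would push the prefix filter down to the backend.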

In the examples above, we were constrained by a linear hierarchy of keys (a, then a,b, then a,b,c, ...). This constraint (like other types of constraints) can be advantageous for the implementation, but is not necessary. The general case is where any combination, such as store[a,:,b], store[:,:,c], or store[:,b,:], is possible.
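The general case could look something like the sketch below, where `:` marks a free dimension. `HyperSliceStore` and its `n_dims` and `fixed` parameters are hypothetical names; the sketch assumes fixed-length tuple keys whose coordinates are not themselves tuples or slices.

```python
from collections.abc import Mapping

_ALL = slice(None)  # what `:` becomes inside store[...]


class HyperSliceStore(Mapping):
    """Hypothetical wrapper: fix any subset of dimensions, leave the others free."""

    def __init__(self, store, n_dims, fixed=None):
        self._store = store
        self._n_dims = n_dims
        self._fixed = dict(fixed or {})  # position -> fixed value
        self._free = [i for i in range(n_dims) if i not in self._fixed]

    def __getitem__(self, k):
        k = k if isinstance(k, tuple) else (k,)
        if len(k) != len(self._free):
            raise KeyError(k)  # one entry is required per free dimension
        fixed = dict(self._fixed)
        for i, v in zip(self._free, k):
            if v != _ALL:
                fixed[i] = v
        if len(fixed) == self._n_dims:
            # Every dimension is fixed: this is an atomic key
            return self._store[tuple(fixed[i] for i in range(self._n_dims))]
        return HyperSliceStore(self._store, self._n_dims, fixed)

    def __iter__(self):
        for key in self._store:
            if all(key[i] == v for i, v in self._fixed.items()):
                rest = tuple(key[i] for i in self._free)
                yield rest[0] if len(rest) == 1 else rest

    def __len__(self):
        return sum(1 for _ in self)


data = {("a", "x", "b"): 1, ("a", "y", "b"): 2, ("a", "y", "c"): 3, ("d", "y", "c"): 4}
store = HyperSliceStore(data, n_dims=3)
print(dict(store["a", :, "b"]))  # {'x': 1, 'y': 2}
```

Slices compose: store[:, 'y', :]['a', :] fixes the second dimension first and then the first one, leaving only the last dimension free.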

Further, note that the tuple-key interface used in these examples is not the only choice a user might want. Sometimes it is convenient to express hypercubic keys with dictionaries (e.g. store[{'A': a, 'C': c}]), with strings (e.g. 'ROOT/a/[^/]+/c.+' or 'ROOT/a/{}/c{}'), with namedtuples, or with any context-convenient Python object that resolves to hypercubic coordinates.
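As a rough illustration of "resolving to hypercubic coordinates", the sketch below turns a dictionary query into a coordinate tuple and uses a regular expression to select string keys. The dimension names in `FIELDS` and the helper names `dict_to_coords` and `regex_select` are assumptions made for the example.

```python
import re

FIELDS = ("A", "B", "C")  # hypothetical dimension names


def dict_to_coords(query, fields=FIELDS):
    """Resolve {'A': a, 'C': c} to (a, slice(None), c): unspecified dimensions stay free."""
    unknown = set(query) - set(fields)
    if unknown:
        raise KeyError(f"Unknown dimensions: {sorted(unknown)}")
    return tuple(query.get(f, slice(None)) for f in fields)


def regex_select(keys, pattern):
    """Keep the string keys that match `pattern` in full."""
    compiled = re.compile(pattern)
    return [k for k in keys if compiled.fullmatch(k)]


coords = dict_to_coords({"A": "a", "C": "c"})
print(coords)  # ('a', slice(None, None, None), 'c')

paths = ["ROOT/a/b/c1", "ROOT/a/b/d1", "ROOT/x/b/c1"]
print(regex_select(paths, "ROOT/a/[^/]+/c.+"))  # ['ROOT/a/b/c1']
```

Either resolver could sit in front of a tuple-based slicing layer, so the key format a user sees stays independent of how the slicing is implemented.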

After all, one of the purposes of py2store, and the i2i approach in general, is to provide the tools to produce interfaces that are convenient/natural to any particular context.

The hypercubic key functionality should be completely decoupled from any concrete persister, and should be able to act as a wrapper around any persister. Still, keeping a few archetypical persisters (e.g. local files, mongo, sql) in mind is useful.

thorwhalen commented 4 years ago

I call it "hypercubic keys", but is there a common term for this? Hypercube alternatives: Tensor, Multidimension, ... Key alternatives: Index, coordinates, ...

thorwhalen commented 4 years ago

The DirStore already implements the hypercubic thing, but for a concrete case (local files, string keys, linear constraint).

thorwhalen commented 10 months ago

Relevant to the Root interface of hubcap, which contains a good example use case!

A few other terms to describe this: