limpyd / redis-limpyd

Provide an easy way to store python objects in Redis, without losing the power and the control of the Redis API
https://redis-limpyd.readthedocs.org/
Do What The F*ck You Want To Public License

[contrib] Add multi-indexes composition + DateTimeIndex #103

Closed twidi closed 6 years ago

twidi commented 7 years ago

This PR depends on #101

Multi-indexes make it easy to create complex indexes.

Here is an excerpt of doc/contrib.rst about the creation of the DateTimeIndex:


DateTimeIndex

The limpyd.contrib.indexes module provides a DateTimeIndex (and other friends). In this section we'll explain how it is constructed using only the configure method of the normal indexes and the compose method of MultiIndexes.

Goal

We'll store date+times in the format YYYY-MM-DD HH:MM:SS.

We want to be able to:

Date and time parts

Let's separate the date and the time into YYYY-MM-DD and HH:MM:SS.

How to filter only on the year of a date: we want to extract the first 4 characters and filter on them as a number, using NumberRangeIndex.

Also, we don't want uniqueness on this index, and we want to prefix the part to be able to filter with myfield__year=2015.

So this part could be:

>>> NumberRangeIndex.configure(prefix='year', transform=lambda value: value[:4], handle_uniqueness=False, name='YearIndex')

Doing the same for the month and day, and composing a multi-index from the three, we have:

>>> DateIndexParts = MultiIndexes.compose([
...     NumberRangeIndex.configure(prefix='year', transform=lambda value: value[:4], handle_uniqueness=False, name='YearIndex'),
...     NumberRangeIndex.configure(prefix='month', transform=lambda value: value[5:7], handle_uniqueness=False, name='MonthIndex'),
...     NumberRangeIndex.configure(prefix='day', transform=lambda value: value[8:10], handle_uniqueness=False, name='DayIndex'),
... ], name='DateIndexParts')
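To see what each part index would receive, the same slices can be run on a sample value in plain Python, without Redis or limpyd. This is only a sketch: the sample string and the `date_parts` helper are made up for illustration, and the `int()` conversion is an assumption added here to make the result readable.

```python
# Same slicing as the transforms passed to NumberRangeIndex.configure above.
transforms = {
    'year': lambda value: value[:4],
    'month': lambda value: value[5:7],
    'day': lambda value: value[8:10],
}


def date_parts(value):
    """Hypothetical helper: return the year, month and day as integers."""
    return {prefix: int(transform(value)) for prefix, transform in transforms.items()}


sample = '2015-12-16'  # made-up value in YYYY-MM-DD format
print(date_parts(sample))
```

Each key of the resulting dict matches the prefix of one part index, so it corresponds to a filter such as myfield__year=2015.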

If we do the same for the time only (assuming a time field without date), we have:

>>> TimeIndexParts = MultiIndexes.compose([
...     NumberRangeIndex.configure(prefix='hour', transform=lambda value: value[0:2], handle_uniqueness=False, name='HourIndex'),
...     NumberRangeIndex.configure(prefix='minute', transform=lambda value: value[3:5], handle_uniqueness=False, name='MinuteIndex'),
...     NumberRangeIndex.configure(prefix='second', transform=lambda value: value[6:8], handle_uniqueness=False, name='SecondIndex'),
... ], name='TimeIndexParts')

Range indexes

If we want to filter not only on parts but also on the full date with a TextRangeIndex, to be able to do date_field__gte=2015, we'll need another index.

We don't want to use a prefix, but if we have another TextRangeIndex on the field, we need a key:

>>> DateRangeIndex = TextRangeIndex.configure(key='date', transform=lambda value: value[:10], name='DateRangeIndex')

The same for the time:

>>> TimeRangeIndex = TextRangeIndex.configure(key='time', transform=lambda value: value[:8], name='TimeRangeIndex')
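A text range can answer a filter like date_field__gte=2015 because zero-padded strings in this format sort as text in chronological order. The sketch below checks this with plain string comparison; the sample values are made up, and how TextRangeIndex performs the comparison inside Redis is not shown here.

```python
dates = ['2015-01-01', '2014-12-31', '2016-07-04', '2015-11-30']  # made-up values

# Zero-padded YYYY-MM-DD strings sort chronologically when compared as text.
ordered = sorted(dates)
print(ordered)

# Plain-Python equivalent of a "gte" filter with a partial value.
matching = [d for d in ordered if d >= '2015']
print(matching)
```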

We keep these two indexes apart from the DateIndexParts and TimeIndexParts because we'll need them independently later to prefix them when used together.

Full indexes

But if we want full indexes for dates and times, including the range and the parts, we can easily compose them:

>>> DateIndex = MultiIndexes.compose([DateRangeIndex, DateIndexParts], name='DateIndex')
>>> TimeIndex = MultiIndexes.compose([TimeRangeIndex, TimeIndexParts], name='TimeIndex')

Now that we have all that is needed for fields that manage date OR time, we'll combine them. Three things to take into consideration:

First, we'll want an index without the time parts, to allow filtering on the three "ranges" (full, date, and time), but only on date parts, not time parts. It can be useful if you know you won't have to search on the time parts.

So, to summarize, we need:

Which gives us:

>>> DateSimpleTimeIndex = MultiIndexes.compose([
...     TextRangeIndex.configure(key='full', name='FullDateTimeRangeIndex'),
...     DateRangeIndex.configure(prefix='date'),
...     DateIndexParts,
...     TimeRangeIndex.configure(prefix='time', transform=lambda value: value[11:])  # pass only time
... ], name='DateSimpleTimeIndex', transform=lambda value: value[:19])  # restrict on date+time

And to have the same with the time parts, simply compose a new index with the last one and the TimeIndexParts:

>>> DateTimeIndex = MultiIndexes.compose([
...     DateSimpleTimeIndex,
...     TimeIndexParts.configure(transform=lambda value: value[11:]),  # pass only time
... ], name='DateTimeIndex')
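The nested transforms can be followed step by step in plain Python. This sketch assumes that a transform given to a composed index is applied before the value reaches its sub-indexes, which is what the comments in the snippets above suggest, and that DateRangeIndex keeps its own value[:10] transform when reconfigured with a prefix. It does not call limpyd, and the sample value is made up.

```python
full = '2015-12-16 15:16:17'  # made-up stored value

# Inside DateSimpleTimeIndex: the composed transform is assumed to run first.
restricted = full[:19]       # restrict on date+time
date_text = restricted[:10]  # DateRangeIndex transform
time_text = restricted[11:]  # TimeRangeIndex transform: pass only time

# Inside DateTimeIndex: TimeIndexParts gets only the time,
# then each part index slices its own piece.
time_only = full[11:]
hour, minute, second = time_only[0:2], time_only[3:5], time_only[6:8]

print(date_text, time_text)
print(hour, minute, second)
```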

And we're done!

twidi commented 6 years ago

auto reviewed. some remarks about doc/docstrings