square / metrics

Metrics Query Engine
Apache License 2.0

Clean Up Blueflood Architecture #285

Closed Nathan-Fenner closed 8 years ago

Nathan-Fenner commented 8 years ago

Resolution overlap fetching is now more complicated (but also more complete!)

ASCII art description:

                   EXAMPLE A
      +--------------------------------+
      |<long ago                 today>|
      +--------------------------------+
30sec |                        [======]| availability of
 5min |                 [===========]  | data for given
20min |           [==========]         | resolution
      +--------------------------------+
      |                 (## request ##)| requested timerange (at 20min or coarser resolution)
      +--------------------------------+
30sec |                              !!| 
 5min |                       !!!!!!!  | 3 fetches performed
20min |           !!!!!!!!!!!!         | 
      +--------------------------------+      
                   EXAMPLE B
      +--------------------------------+
      |<long ago                 today>|
      +--------------------------------+
30sec |                        [======]| availability of
 5min |                 [===========]  | data for given
20min |           [==============]     | resolution
      +--------------------------------+
      |                 (## request ##)| requested timerange (at 20min or coarser resolution)
      +--------------------------------+
30sec |                              !!| 
 5min |                           !!!  | 3 fetches performed
20min |           !!!!!!!!!!!!!!!!     | 
      +--------------------------------+      
                   EXAMPLE C
      +--------------------------------+
      |<long ago                 today>|
      +--------------------------------+
30sec |                     [=========]| availability of
 5min |              [==============]  | data for given
20min |      [===================]     | resolution
      +--------------------------------+
      |          (#### request ####)   | requested timerange (at 20min or coarser resolution)
      +--------------------------------+
30sec |                                | 
 5min |                           !!   | 2 fetches performed
20min |          !!!!!!!!!!!!!!!!!     | 
      +--------------------------------+      

Parts of the Blueflood implementation have been greatly simplified to make this feasible.

The ChooseResolution method uses the same procedure to (heuristically) select a resolution: it picks the finest resolution for which the fetch procedure above succeeds.
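
As a rough illustration of the procedure in the diagrams, here is a minimal Python sketch. It is not the code in this PR: `Interval`, `plan_fetches`, and `choose_resolution` are hypothetical names, times are the diagram's column positions as half-open ranges, and the sketch assumes the planner prefers the coarsest usable resolution first and hands the remainder to finer ones. It also clamps every fetch to the requested range, as drawn in Example C.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Interval:
    start: int  # inclusive
    end: int    # exclusive


def plan_fetches(
    request: Interval,
    availability: Dict[int, Interval],
    requested_resolution: int,
) -> Optional[List[Tuple[int, Interval]]]:
    """Cover `request` using resolutions no coarser than `requested_resolution`.

    `availability` maps a resolution (in seconds) to the range where data
    exists at that resolution. Returns None if the range cannot be covered.
    """
    # Assumption: coarser data is preferred, so try the coarsest usable
    # resolution first and let finer resolutions fill in the newer data.
    usable = sorted(
        (r for r in availability if r <= requested_resolution), reverse=True
    )
    fetches: List[Tuple[int, Interval]] = []
    cursor = request.start
    for resolution in usable:
        if cursor >= request.end:
            break
        available = availability[resolution]
        # This resolution can only help if it has data at the cursor.
        if available.start > cursor or available.end <= cursor:
            continue
        upto = min(available.end, request.end)
        fetches.append((resolution, Interval(cursor, upto)))
        cursor = upto
    return fetches if cursor >= request.end else None


def choose_resolution(
    request: Interval, availability: Dict[int, Interval]
) -> Optional[int]:
    """Pick the finest resolution for which plan_fetches succeeds."""
    for resolution in sorted(availability):  # finest first
        if plan_fetches(request, availability, resolution) is not None:
            return resolution
    return None


# Example A from the diagrams, using column positions as time units.
EXAMPLE_A = {
    30: Interval(24, 32),
    300: Interval(17, 30),
    1200: Interval(11, 23),
}
```

With the Example A availability and a request of `Interval(17, 32)` at 20min, this plans three fetches (20min, then 5min, then 30sec). `choose_resolution` settles on 5min for that request, because 30sec data alone does not reach back to the start of the range.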

Problem? Fetches that extend past the viable range will now error.

A query like

select
cpu | transform.moving_average(1hr)
from -1d to now resolution 30s

will now error, since the moving average needs 1d + 1hr of data, which is outside the TTL of the 30s resolution.
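
The arithmetic behind that error can be sketched as follows. The TTL used in the usage note is an assumption chosen for illustration (the actual TTL of the 30s resolution is not stated here), and `needed_history` and `check_within_ttl` are hypothetical helpers, not functions from this repository.

```python
from datetime import timedelta


def needed_history(lookback: timedelta, window: timedelta) -> timedelta:
    """How far back data must exist: the query range plus the transform window."""
    return lookback + window


def check_within_ttl(lookback: timedelta, window: timedelta, ttl: timedelta) -> timedelta:
    """Raise if the fetch would reach past the data retained at this resolution."""
    needed = needed_history(lookback, window)
    if needed > ttl:
        raise ValueError(
            f"fetch needs {needed} of history but only {ttl} is retained"
        )
    return needed
```

For example, if the 30s data were retained for one day (an assumed value), `check_within_ttl(timedelta(days=1), timedelta(hours=1), ttl=timedelta(days=1))` raises, because the moving average pushes the fetch back to 25 hours.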

@drcapulet

drcapulet commented 8 years ago

LGTM