cortoproject / corto

A hierarchical object store for connecting realtime machine data with web applications, historians & more
https://www.corto.io
MIT License
Refactor corto_frame to describe time frame and pagination requests #631

Closed. hendrenja closed this issue 6 years ago.

hendrenja commented 6 years ago

InfluxDB introduces the requirement to support querying historical datasets within a specific time frame and returning only a subset of the samples in that time frame.

Query 10: time > FRIDAY:0800 AND time < FRIDAY:1700 SLIMIT 100 SOFFSET 1000
Query 11: time > FRIDAY:0800 AND time < FRIDAY:1700 SLIMIT 100 SOFFSET 1100
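
The two queries read as consecutive pages of the same time frame: the limit stays at 100 while the offset advances by one page. A minimal sketch of that arithmetic, assuming the query numbers are zero-based page indexes; `page_offset` is a hypothetical helper and not part of corto or InfluxDB:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: offset of the first sample on a zero-based page. */
static uint64_t page_offset(uint64_t page, uint64_t limit)
{
    assert(limit > 0);  /* a page must hold at least one sample */
    return page * limit;
}
```

With a limit of 100, page 10 starts at offset 1000 and page 11 at offset 1100, which matches the SOFFSET values in the two queries.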

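#include <stdint.h>

/* Kind of value a frame carries. Declared first so the struct can use it. */
typedef enum corto_frameKind {
    CORTO_FRAME_NOW = 0,
    CORTO_FRAME_TIME = 1,
    CORTO_FRAME_DURATION = 2,
    CORTO_FRAME_SAMPLE = 3,
    CORTO_FRAME_DEPTH = 4
} corto_frameKind;

/* A frame holds exactly one kind and one value. */
typedef struct corto_frame {
    corto_frameKind kind;
    int64_t value;
} corto_frame;

/* Begin and end of the requested frame. */
corto_frame timeBegin;
corto_frame timeEnd;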

The above structure allows mount developers to express queries for samples across either a time frame or a depth, but not both at once. It would be ideal if corto_frame were updated to express time and depth together. Perhaps the easiest approach would be a bitmask?

SanderMertens commented 6 years ago

If it were to be changed to a bitmask, a frame still would not be able to indicate what the limit (depth) and offset values of a query should be.

I think this capability should be orthogonal to a frame. This means that corto_frameKind will no longer have CORTO_FRAME_SAMPLE and CORTO_FRAME_DEPTH; instead, these will become separate parameters of a query. Following the format of InfluxDB, this could look like this:

// If no timeframe is specified, take the offset & limit starting from "now"
corto_select("*").soffset(1100).slimit(100).iter(&it);

// Select at most 100 samples starting from time t
corto_select("*").fromTime(&t).slimit(100).iter(&it);

// Select samples 1100 until 1200 within the selected frame
corto_select("*").fromTime(&t).forDuration(&duration).soffset(1100).slimit(100).iter(&it);
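
To illustrate why keeping pagination separate from the frame helps, here is a rough sketch of how a mount might apply both at once. Everything in it is hypothetical: `history_query` and `select_samples` are illustration names, not corto types or functions, and samples are reduced to bare timestamps. The exclusive bounds follow the `time > ... AND time < ...` form of the InfluxDB queries above, and treating a zero `slimit` as "no limit" is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical query description: the time frame and the pagination
 * window are independent fields (not corto's actual types). */
typedef struct history_query {
    int64_t time_begin;  /* exclusive lower bound: time > time_begin */
    int64_t time_end;    /* exclusive upper bound: time < time_end */
    size_t soffset;      /* matching samples to skip */
    size_t slimit;       /* maximum samples to return, 0 = unlimited */
} history_query;

/* Copy the timestamps selected by q from samples[0..count) into out
 * (capacity out_cap). Returns the number of timestamps written. */
static size_t select_samples(const history_query *q,
                             const int64_t *samples, size_t count,
                             int64_t *out, size_t out_cap)
{
    size_t skipped = 0, written = 0;
    assert(q && samples && out);
    for (size_t i = 0; i < count && written < out_cap; i++) {
        if (samples[i] <= q->time_begin || samples[i] >= q->time_end) {
            continue;  /* outside the time frame */
        }
        if (skipped < q->soffset) {
            skipped++;  /* inside the frame but before the offset */
            continue;
        }
        if (q->slimit != 0 && written == q->slimit) {
            break;  /* page is full */
        }
        out[written++] = samples[i];
    }
    return written;
}
```

Because the offset and limit count only samples that already fall inside the frame, the same pair of values means the same thing whether or not a time frame is given.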

SanderMertens commented 6 years ago

As an aside, I'd like to simplify the history query API for corto_select so that it is more readable, and bring it more in line with something that could be unified with "TreeQL" query strings.

I'm thinking something along the lines of:

// Select from a date (translated to timestamp) and duration
corto_select("*").from("/foo").start("Nov 10 2017 18:35").duration("12h").slimit(100).iter(&it);

// Select from now with a duration
corto_select("*").from("/foo").start("now").duration("1d").iter(&it);

// Select between two timestamps
corto_select("*").from("/foo").start("14453458").end("14453558").slimit(100).iter(&it);

The data communicated to the mounts stays the same as corto will take care of parsing the strings and converting them to a timestamp.
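
As a rough illustration of the kind of conversion involved, the sketch below turns a duration string such as "12h" or "1d" into a number of seconds. It is not corto's parser: `parse_duration` is a hypothetical name, and the supported unit set (s, m, h, d) is an assumption, since the thread only shows "12h" and "1d".

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical parser: converts "<number><unit>" to seconds, where unit
 * is s, m, h or d. Returns 0 on success, -1 on malformed input. */
static int parse_duration(const char *str, int64_t *seconds_out)
{
    char *end = NULL;
    long long value;
    int64_t scale;

    assert(str && seconds_out);
    errno = 0;
    value = strtoll(str, &end, 10);
    if (end == str || errno == ERANGE || value < 0) {
        return -1;  /* no digits, out of range, or negative duration */
    }
    switch (*end) {
    case 's': scale = 1; break;
    case 'm': scale = 60; break;
    case 'h': scale = 3600; break;
    case 'd': scale = 86400; break;
    default: return -1;  /* missing or unknown unit */
    }
    if (end[1] != '\0' || value > INT64_MAX / scale) {
        return -1;  /* trailing characters or overflow after scaling */
    }
    *seconds_out = (int64_t)value * scale;
    return 0;
}
```

Under these assumptions "12h" yields 43200 seconds and "1d" yields 86400, so a mount would only ever see numeric values.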

In a TreeQL query, this could be written down like:

select * from /foo start "now" duration "1d"

hendrenja commented 6 years ago

@SanderMertens I agree with your suggestions. I forgot about the value piece of the corto_frame ;).

SanderMertens commented 6 years ago

Fixed in https://github.com/cortoproject/corto/commit/2ebdea85edf4f4f7661b9a2aa4909432039d3f19