n-young / trustdb


Implement ResultSet packing #13

Closed desmondcheongzx closed 3 years ago

desmondcheongzx commented 3 years ago

This PR addresses the error in #6. It also implements the predicate push-up optimisation described in #5: we delay filtering data points for as long as possible, until the number of series to filter has been minimised.
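As a rough illustration of the push-up idea (not the actual trustdb code), the sketch below first narrows the candidate series using labels only, and inspects individual data points only for the series that survive. `Series`, `Point`, `select_series` and `filter_points` are hypothetical names invented for this example, and timestamps are plain integers to keep it dependency-free.

```rust
use std::collections::HashMap;

/// One data point within a series (hypothetical layout, for illustration only).
#[derive(Debug, Clone, PartialEq)]
struct Point {
    timestamp: i64,
    variables: HashMap<String, f64>,
}

/// A series: all points that share one label set (hypothetical layout).
#[derive(Debug, Clone)]
struct Series {
    labels: HashMap<String, String>,
    points: Vec<Point>,
}

/// Stage 1: narrow the candidate series by label only.
/// No data point is inspected here.
fn select_series<'a>(all: &'a [Series], key: &str, value: &str) -> Vec<&'a Series> {
    all.iter()
        .filter(|s| s.labels.get(key).map(String::as_str) == Some(value))
        .collect()
}

/// Stage 2: apply the per-point condition (`variable < upper`) only to the
/// series that survived stage 1, then sort the matches by timestamp.
fn filter_points(series: &[&Series], variable: &str, upper: f64) -> Vec<Point> {
    let mut out: Vec<Point> = series
        .iter()
        .flat_map(|s| s.points.iter())
        .filter(|p| p.variables.get(variable).map_or(false, |v| *v < upper))
        .cloned()
        .collect();
    out.sort_by_key(|p| p.timestamp);
    out
}
```

Calling `select_series(&all, "hostname", "host_desmond")` and passing its result to `filter_points` means the point-level comparison never runs on series belonging to other hosts, which is the saving the optimisation is after.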

Query evaluation now works on the following test case. Say we're given 3 data points:

{"Write": {"name": "disk", "labels": {"hostname": "host_desmond"}, "variables": {"total": 10}, "timestamp": "2016-01-01 00:04:00+00:00"}}
{"Write": {"name": "disk", "labels": {"hostname": "host_desmond"}, "variables": {"total": 2}, "timestamp": "2016-01-01 00:05:00+00:00"}}
{"Write": {"name": "disk", "labels": {"hostname": "host_desmond"}, "variables": {"total": 4}, "timestamp": "2016-01-01 00:06:00+00:00"}}

And the following select query:

{"Select": {"name": "test", "predicate":{"name": "test","condition":{"Leaf":{"lhs": {"LabelKey": "hostname"},"rhs": {"LabelValue": "host_desmond"},"op": "Eq"}}}}}

This gives us the correct sorted results:

Received statement: Select { name: "test", predicate: Predicate { name: "test", condition: Leaf(Condition { lhs: LabelKey("hostname"), rhs: LabelValue("host_desmond"), op: Eq }) } }
Received result: [Record { name: "disk", labels: {"hostname": "host_desmond"}, variables: {"total": 1099511627776.0}, timestamp: 2016-01-01T00:04:00Z }, Record { name: "disk", labels: {"hostname": "host_desmond"}, variables: {"total": 2.0}, timestamp: 2016-01-01T00:05:00Z }, Record { name: "disk", labels: {"hostname": "host_desmond"}, variables: {"total": 4.0}, timestamp: 2016-01-01T00:06:00Z }]

We can also evaluate more complex queries, such as this one with an "And" condition:

{"Select": {"name": "test", "predicate":{"name": "test","condition":{"And":[{"Leaf":{"lhs": {"LabelKey": "hostname"},"rhs": {"LabelValue": "host_desmond"},"op": "Eq"}},{"Leaf":{"lhs": {"Variable": "total"},"rhs": {"Metric": 4},"op": "Lt"}}]}}}}

which returns the correct single data point:

Received statement: Select { name: "test", predicate: Predicate { name: "test", condition: And(Leaf(Condition { lhs: LabelKey("hostname"), rhs: LabelValue("host_desmond"), op: Eq }), Leaf(Condition { lhs: Variable("total"), rhs: Metric(4.0), op: Lt })) } }
Received result: [Record { name: "disk", labels: {"hostname": "host_desmond"}, variables: {"total": 2.0}, timestamp: 2016-01-01T00:05:00Z }]
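For readers unfamiliar with the condition tree, here is a minimal sketch of how an `And` of two leaves like the one above could be evaluated against a record. The variant names (`Leaf`, `And`, `LabelKey`, `LabelValue`, `Variable`, `Metric`, `Eq`, `Lt`) mirror the debug output; the type names `Operand`, `Op` and `Conditions`, the functions `eval_leaf` and `eval`, and the simplified `Record` without a timestamp are assumptions made for this example, not the definitions in the PR.

```rust
use std::collections::HashMap;

/// Either side of a comparison (type name is hypothetical).
#[derive(Debug, Clone, PartialEq)]
enum Operand {
    LabelKey(String),
    LabelValue(String),
    Variable(String),
    Metric(f64),
}

/// Comparison operators seen in the queries above.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Op {
    Eq,
    Lt,
}

#[derive(Debug, Clone, PartialEq)]
struct Condition {
    lhs: Operand,
    rhs: Operand,
    op: Op,
}

/// Condition tree: a single comparison, or the conjunction of two subtrees.
#[derive(Debug, Clone, PartialEq)]
enum Conditions {
    Leaf(Condition),
    And(Box<Conditions>, Box<Conditions>),
}

/// Simplified record; the timestamp is omitted to avoid external crates.
#[derive(Debug, Clone, PartialEq)]
struct Record {
    name: String,
    labels: HashMap<String, String>,
    variables: HashMap<String, f64>,
}

/// Evaluate one comparison. Operand/operator combinations that this sketch
/// does not model simply evaluate to false.
fn eval_leaf(c: &Condition, r: &Record) -> bool {
    match (&c.lhs, &c.rhs, c.op) {
        (Operand::LabelKey(k), Operand::LabelValue(v), Op::Eq) => r.labels.get(k) == Some(v),
        (Operand::Variable(name), Operand::Metric(m), op) => match r.variables.get(name) {
            Some(x) => match op {
                Op::Eq => x == m,
                Op::Lt => x < m,
            },
            None => false,
        },
        _ => false,
    }
}

/// Recursively evaluate the tree; `And` short-circuits on the left subtree.
fn eval(cond: &Conditions, r: &Record) -> bool {
    match cond {
        Conditions::Leaf(c) => eval_leaf(c, r),
        Conditions::And(a, b) => eval(a, r) && eval(b, r),
    }
}
```

Filtering a list of records with `records.iter().filter(|r| eval(&query, r))` keeps only those for which every leaf under the `And` holds.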

desmondcheongzx commented 3 years ago

@n-young rebased and ready for review!

n-young commented 3 years ago

LGTM! Sorry this took so long to review :(