timbray / quamina

Home of Quamina, a fast pattern-matching library in Go
Apache License 2.0

SQL-like Expression Language for patterns #79

Open embano1 opened 2 years ago

embano1 commented 2 years ago

While I was showcasing quamina to a colleague, he asked whether the quamina authors had considered SQL(-like) expressions instead of JSON patterns, e.g. using the CloudEvents SQL Expression Language.

Example:

// input event
{
    "source": "commerce.mysource.example.com",
    "type": "order.created",
    "id": 123
}
// quamina pattern to match on source prefix and type set
{
    "source": [{"shellstyle": "commerce*"}],
    "type": ["order.created", "order.updated", "order.canceled"]
}
// equivalent cesql expression
cesql: "source LIKE 'commerce%' AND type IN ('order.created', 'order.updated', 'order.canceled')"

For someone who has not read the quamina pattern match specification, SQL-like expressions are generally more readable and explicit, e.g. the AND is spelled out. SQL is also a widely used and accepted standard, which lowers the bar for adopters.

Note that a SQL-like dialect does not have to replace the existing JSON pattern match implementation, but could augment it.

timbray commented 2 years ago

Doesn't sound crazy. Empirically, the groups using Quamina's big sister inside AWS seemed to like the pattern language; basically 100% of our complaints were about wanting new matching features and smaller memory size. To be honest, I have no opinion; we thought the pattern language up in a hurry.

The easiest way to do such a thing would be to translate the SQL-style language into the existing pattern language. But it doesn't sound too horribly difficult to go directly from a SQL-like language to the underlying Quamina finite automata.
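To make the first option concrete, here is a minimal Go sketch of translating a tiny conjunctive subset of a SQL-like language into a pattern in the existing JSON form. Everything in it is an assumption for illustration: the function names `toPattern` and `clauseValues`, the three supported clause shapes (`=`, `IN`, `LIKE`), the use of SQL's `%` wildcard mapped onto `shellstyle`'s `*`, and the naive split on `AND`. It is not a CESQL parser and not part of Quamina.

```go
package main

import (
	"encoding/json"
	"fmt"
	"regexp"
	"strings"
)

// Hypothetical clause shapes for this sketch only.
var (
	likeRe = regexp.MustCompile(`^(\w+)\s+LIKE\s+'([^']*)'$`)
	inRe   = regexp.MustCompile(`^(\w+)\s+IN\s+\((.*)\)$`)
	eqRe   = regexp.MustCompile(`^(\w+)\s*=\s*'([^']*)'$`)
	strRe  = regexp.MustCompile(`'([^']*)'`)
)

// clauseValues turns one clause into a field name and its pattern values.
func clauseValues(clause string) (string, []any, error) {
	if m := likeRe.FindStringSubmatch(clause); m != nil {
		// '_' has no shellstyle counterpart, and a literal '*' would
		// change meaning, so both are rejected here.
		if strings.ContainsAny(m[2], "_*") {
			return "", nil, fmt.Errorf("unsupported LIKE pattern %q", m[2])
		}
		shell := strings.ReplaceAll(m[2], "%", "*")
		return m[1], []any{map[string]string{"shellstyle": shell}}, nil
	}
	if m := inRe.FindStringSubmatch(clause); m != nil {
		var vals []any
		for _, v := range strRe.FindAllStringSubmatch(m[2], -1) {
			vals = append(vals, v[1])
		}
		if len(vals) == 0 {
			return "", nil, fmt.Errorf("empty IN list in %q", clause)
		}
		return m[1], vals, nil
	}
	if m := eqRe.FindStringSubmatch(clause); m != nil {
		return m[1], []any{m[2]}, nil
	}
	return "", nil, fmt.Errorf("unsupported clause %q", clause)
}

// toPattern translates clauses joined by " AND " into one JSON pattern.
// The split is naive: string values containing " AND " are not handled.
func toPattern(expr string) (string, error) {
	pattern := map[string][]any{}
	for _, clause := range strings.Split(expr, " AND ") {
		field, vals, err := clauseValues(strings.TrimSpace(clause))
		if err != nil {
			return "", err
		}
		// Values listed under one field are alternatives, so a second
		// clause on the same field cannot simply be merged in.
		if _, dup := pattern[field]; dup {
			return "", fmt.Errorf("field %q used twice", field)
		}
		pattern[field] = vals
	}
	out, err := json.Marshal(pattern)
	return string(out), err
}

func main() {
	p, err := toPattern("source LIKE 'commerce%' AND type IN ('order.created', 'order.updated', 'order.canceled')")
	if err != nil {
		panic(err)
	}
	fmt.Println(p)
}
```

The sketch only covers conjunctions because, in the pattern language, separate fields are ANDed and the values listed under a field are alternatives. Anything beyond that (OR across fields, negation, numeric comparison) would need either more pattern features or the direct-to-automaton route.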

evankanderson commented 2 years ago

Hi, I'm that colleague. I don't have a strong feeling about whether there's an IR (intermediate representation) which looks like the current language, or whether the direct interface is the SQL pattern builder. If SQL were not quite so common, I might suggest the IR as a better choice so that people could build simple UIs exposing the match patterns, but it's probably equally easy to take that UI and build a SQL pattern at this point.

One interesting question between SQL and IR is whether or not the Quamina implementation stores and can return the SQL representation, or whether it's an input convenience which is discarded on storage.

Note that a SQL representation may also lend itself to more complex "OR" queries, like source = 'billing.invoice.po' OR (source LIKE 'billing.%' AND value > 10000.0) (to find interesting audit items). I don't know if that's desired or not.

jsmorph commented 2 years ago

Incidentally, I have a side project which compiles a SQL SELECT statement from a Quamina pattern. The target database should have an inverted-index-style table (with branch+values). Re shellstyle: this compiler doesn't enjoy anything other than a trailing wildcard, in order to end up with a clause that can be reasonably supported with a simple SQL index.

timbray commented 2 years ago

Hmm. At runtime, nothing exists but the compiled DFA structure starting at the coreMatcher type. The current pattern language and a hypothetical SQL-like expression are probably about equally removed from it. @jsmorph has been working on a pattern store which remembers the patterns that have been compiled in - there's no architectural reason why such a store couldn't remember a query-language expression.

Quamina's big sister, the implementation inside EventBridge, SNS, etc., recently grew a way to express "OR". My initial reaction is to dislike it, but there's plenty of room to discuss there. There's nothing architectural about the compiled automaton that gets in the way of OR.

timbray commented 2 years ago

Just to be clear, my priorities going forward are going to be adding to the set of supported Patterns for Quamina, and improving its efficiency. But if anyone else wanted to design and propose an SQL-like query language I'd be supportive and be inclined to include it.