Open anirudha opened 2 years ago
What interface functions do we need for this library so that visualizations can work with PPL from the config panel or via drag and drop?
e.g. add_field_to_x-axis(field)
The Observability dashboard currently uses regular expressions to match, insert, append, replace, and delete query segments. This is how it parses and rewrites queries for use cases such as inserting the time range from the time picker or extracting the index pattern from a query. This solves the problem for now, but the downsides are obvious. Regular expressions are costly to maintain and scale as more complex use cases are added. A regular expression is also usually tied to one or only a few use cases, so given the complexity of PPL, a large number of them would have to be created and maintained to cover the vast majority of use cases and corner cases. It is therefore unrealistic for us to stay with the regex solution for features that require query parsing.
Observability visualization, on the other hand, currently gives users only limited capabilities to visualize their data: they have to write the exact query by hand every time they render a visualization. That requires a solid understanding not only of the language itself but also of which aggregation queries drive each type of visualization. Users without enough PPL or visualization background often feel lost in Observability visualizations, because there is a gap between the visualization they want and writing the correct query for that type of visualization.
To address the problems stated above and make visualizing effortless, the existing regular-expression-based solution is replaced with a more robust ANTLR-based solution for query rewrites.
Along with this change, a query manager is introduced as a wrapper on top of the ANTLR solution. It manages the internal modules and exposes interfaces for query parsing and building to consumers.
ANTLR4 is a very popular parser generator in the language parsing and recognition world, and is widely used by individuals and organizations to build languages, tooling, and frameworks. Compared with some alternatives, ANTLR is fully featured, works out of the box, and integrates well with IDEs. In addition, the search solutions for Observability are built on PPL, which itself uses ANTLR as its basic building block. Adopting ANTLR4 for Dashboard Observability therefore minimizes the effort to support query-related features and unifies our approaches and methodologies for building search-related user interfaces.
Overall, the query manager exposes interfaces to the outside world to support query parsing and building services. At its core is the antlr4ts engine, which is composed of a lexer and a parser that process the original query and generate a CST (concrete syntax tree).
Internally, the query manager consists of two modules: the query parser and the query builder. The query parser is essentially a visitor that traverses the CST and transforms it into an AST (abstract syntax tree). The query builder works the opposite way: it takes a set of parsed units and recursively builds a new AST.
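To make the CST-to-AST step concrete, here is a minimal sketch of a visitor over a hand-written toy CST. It does not use antlr4ts; the `CstNode` shape, the rule names (`statsCommand`, `statsAggTerm`, `byClause`, `IDENT`) and the `StatsAstVisitor` class are all assumptions made for illustration, not the plugin's actual grammar or classes.

```typescript
// Toy CST node: every grammar rule and every token becomes a node.
// Hypothetical shape, far simpler than what an ANTLR-generated parser produces.
interface CstNode {
  rule: string;
  text?: string; // set on leaf (token) nodes only
  children?: CstNode[];
}

// Toy AST: only the semantically relevant parts survive.
interface AstAggregation {
  fn: string;
  field: string;
}
interface AstStats {
  aggregation: AstAggregation;
  groupBy: string[];
}

// Visitor with one method per rule of interest; keywords and punctuation are dropped.
class StatsAstVisitor {
  visitStats(node: CstNode): AstStats {
    const kids = node.children ?? [];
    const agg = kids.find((c) => c.rule === "statsAggTerm");
    const by = kids.find((c) => c.rule === "byClause");
    if (!agg) throw new Error("stats command without an aggregation term");
    return {
      aggregation: this.visitAggTerm(agg),
      groupBy: by ? this.visitByClause(by) : [],
    };
  }

  visitAggTerm(node: CstNode): AstAggregation {
    const idents = (node.children ?? []).filter((c) => c.rule === "IDENT");
    return { fn: idents[0]?.text ?? "", field: idents[1]?.text ?? "" };
  }

  visitByClause(node: CstNode): string[] {
    return (node.children ?? [])
      .filter((c) => c.rule === "IDENT")
      .map((c) => c.text ?? "");
  }
}

// Hand-built CST for: stats avg(bytes) by host
const sampleCst: CstNode = {
  rule: "statsCommand",
  children: [
    { rule: "STATS", text: "stats" },
    {
      rule: "statsAggTerm",
      children: [
        { rule: "IDENT", text: "avg" },
        { rule: "LPAREN", text: "(" },
        { rule: "IDENT", text: "bytes" },
        { rule: "RPAREN", text: ")" },
      ],
    },
    {
      rule: "byClause",
      children: [
        { rule: "BY", text: "by" },
        { rule: "IDENT", text: "host" },
      ],
    },
  ],
};

const sampleAst = new StatsAstVisitor().visitStats(sampleCst);
console.log(JSON.stringify(sampleAst));
```

With a real ANTLR grammar the CST would come from the generated parser and the visitor would extend the generated visitor base class; the traversal idea is the same.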
The query parser is one of the core modules; it parses a query into query parts. It can be further divided into a syntax recognizer, a grammar parser, and an AST builder. The AST builder generates an AST containing a list of connected PPL nodes, each of which corresponds to a part of the original query.
Currently, the query parser only supports parsing the stats command of a query; any other part of a query is left as is. The result of parsing is therefore essentially a stats AST consisting of a number of nodes listed below.
Once it receives this tree structure, the query parser invokes the getTokens method of the root node, which recursively invokes getTokens on each of its children to produce the final parsed object.
interface PPLQueryParsedStats {
  aggregations: {
    function_alias: string;
    function: {
      name: string;
      value_expression: string;
      percentile_agg_function: string;
    };
  };
  groupby: {
    group_fields: Array<GroupField>;
    span: Span;
  };
  partitions: AggFlag;
  all_num: AggFlag;
  delim: AggFlag;
  dedup_split_value: AggFlag;
}
interface GroupField {
  name: string;
}
interface AggFlag {
  keyword: string;
  sign: string;
  value: string;
}
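The recursive getTokens call described above could look roughly like the following. This is a sketch only: the node classes (`FieldNode`, `AggregateFunctionNode`, `AggregateTermNode`, `GroupByNode`, `StatsNode`) and the `TokenNode` interface are hypothetical names, and the result covers only the aggregations and groupby.group_fields parts of the parsed object (span and the flag fields are omitted).

```typescript
// Every node in this toy stats AST can report its own tokens.
interface TokenNode {
  getTokens(): unknown;
}

class FieldNode implements TokenNode {
  constructor(private readonly name: string) {}
  getTokens() {
    return { name: this.name };
  }
}

class AggregateFunctionNode implements TokenNode {
  constructor(
    private readonly name: string,
    private readonly valueExpression: string
  ) {}
  getTokens() {
    return { name: this.name, value_expression: this.valueExpression };
  }
}

class AggregateTermNode implements TokenNode {
  constructor(
    private readonly fn: AggregateFunctionNode,
    private readonly alias: string
  ) {}
  getTokens() {
    // Delegates to the child node for the nested "function" part.
    return { function_alias: this.alias, function: this.fn.getTokens() };
  }
}

class GroupByNode implements TokenNode {
  constructor(private readonly fields: FieldNode[]) {}
  getTokens() {
    return { group_fields: this.fields.map((f) => f.getTokens()) };
  }
}

class StatsNode implements TokenNode {
  constructor(
    private readonly aggregation: AggregateTermNode,
    private readonly groupby: GroupByNode
  ) {}
  getTokens() {
    // The root collects the tokens of its children recursively.
    return {
      aggregations: this.aggregation.getTokens(),
      groupby: this.groupby.getTokens(),
    };
  }
}

// Tree for: stats avg(bytes) as avg_bytes by host
const statsRoot = new StatsNode(
  new AggregateTermNode(new AggregateFunctionNode("avg", "bytes"), "avg_bytes"),
  new GroupByNode([new FieldNode("host")])
);

const parsedStats = statsRoot.getTokens();
console.log(JSON.stringify(parsedStats, null, 2));
```

Each node only knows how to describe itself, so supporting a new part of the stats command means adding one node class rather than touching the root.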
As stated above, the query builder works the opposite way from the query parser: it takes a recipe object containing PPL query parts and recursively builds an aggregation AST subtree.
interface PPLQueryRecipe {
  aggregations: {
    function_alias: string;
    function: {
      name: string;
      value_expression: string;
      percentile_agg_function: string;
    };
  };
  groupby: {
    group_fields: Array<GroupField>;
    span: Span;
  };
  partitions: AggFlag;
  all_num: AggFlag;
  delim: AggFlag;
  dedup_split_value: AggFlag;
}
Once it has recursively built an AST from the recipe, the query builder invokes the toString() method of each node and recursively composes a new stats substring, which it returns. The query builder is usually paired with the query parser, so when it starts building a new stats subquery, it knows the start and end indices of the stats command in the original query from the parsed results. From those indices it knows where to insert or append the new stats subquery.
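The two steps in this paragraph, composing the stats text via toString() and splicing it into the original query by index, can be sketched as below. The class names (`AggregationTerm`, `GroupByTerm`, `StatsCommand`), the `spliceStats` helper and the simplified constructor arguments are hypothetical. The sketch also assumes the end index is inclusive, and it derives the indices with `indexOf` only for the demo; in the design above they would come from the parsed results.

```typescript
class AggregationTerm {
  constructor(
    private readonly fn: string,
    private readonly field: string,
    private readonly alias: string = ""
  ) {}
  toString(): string {
    const call = `${this.fn}(${this.field})`;
    return this.alias ? `${call} as ${this.alias}` : call;
  }
}

class GroupByTerm {
  constructor(private readonly fields: string[]) {}
  toString(): string {
    return this.fields.length > 0 ? `by ${this.fields.join(", ")}` : "";
  }
}

class StatsCommand {
  constructor(
    private readonly aggregation: AggregationTerm,
    private readonly groupBy: GroupByTerm
  ) {}
  // Composes the stats text from the text of its children.
  toString(): string {
    return ["stats", this.aggregation.toString(), this.groupBy.toString()]
      .filter((part) => part.length > 0)
      .join(" ");
  }
}

// Replaces query[start..end] (end inclusive, an assumption of this sketch)
// with the newly built stats text.
function spliceStats(
  query: string,
  start: number,
  end: number,
  stats: string
): string {
  return query.slice(0, start) + stats + query.slice(end + 1);
}

const originalQuery = "source = logs | stats count() by host | sort host";
const oldStats = "stats count() by host";
// Demo only: a real caller would take these indices from the parser output.
const statsStart = originalQuery.indexOf(oldStats);
const statsEnd = statsStart + oldStats.length - 1;

const rebuiltStats = new StatsCommand(
  new AggregationTerm("avg", "bytes", "avg_bytes"),
  new GroupByTerm(["host", "region"])
).toString();

const rewrittenQuery = spliceStats(originalQuery, statsStart, statsEnd, rebuiltStats);
console.log(rewrittenQuery);
```

Because only the span between the two indices is replaced, everything before and after the stats command (here the source and sort parts) passes through untouched.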
Once initiated, the Query Manager is a singleton instance throughout the observability plugin lifecycle. It exposes two interfaces that, when invoked, initialize query parser and query builder instances.
// return new PPLQueryBuilder instance
queryBuilder: () => PPLQueryBuilder;
// return new PPLQueryParser instance
queryParser: () => PPLQueryParser;
// example usage
const qm = new QueryManager();
const qp = qm.queryParser();
const qb = qm.queryBuilder();
/** query parser **/
// parse query to get CST
parse: (pplQuery: string) => PPLQueryParser;
// get AST
getStats: () => PPLStatsTokens;
// example usage
const tokens = qp.parse(query).getStats();
/** query builder **/
build: (query: string, pplStatsRecipe: PPLQueryRecipe) => string;
// example usage
const newQuery = qb.build(query, pplStatsRecipe);
Composing an aggregation query through the configuration UI raises a new user scenario: a user may modify the stats expression of a query directly, which leads to inconsistent state between the UI and the query. To avoid confusing users, observability visualization supports a two-way query sync on the update/re-visualize action, in which the two states are connected bidirectionally. A change to either one updates the other (currently this only works for aggregations and group by), so that they stay consistent whenever an update or change action is issued.
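One way to picture the two-way sync is a single reconcile function that runs on the update action. The sketch below is an assumption-heavy illustration: `StatsConfig`, `VisState`, `SyncDeps` and `syncOnUpdate` are hypothetical names, the rule that an edited query takes priority when both sides changed is a choice made for this sketch, and `toyDeps` is a throwaway stand-in (it handles only a query ending in `stats fn(field) by a, b`) for the real query parser and builder.

```typescript
interface StatsConfig {
  fn: string;
  field: string;
  groupBy: string[];
}

interface VisState {
  query: string;
  config: StatsConfig;
}

// The parser and builder are injected so the sync logic stays independent of them.
interface SyncDeps {
  parse: (query: string) => StatsConfig;
  build: (query: string, config: StatsConfig) => string;
}

function syncOnUpdate(prev: VisState, next: VisState, deps: SyncDeps): VisState {
  if (next.query !== prev.query) {
    // The user edited the query text: re-derive the UI config from it.
    return { query: next.query, config: deps.parse(next.query) };
  }
  if (JSON.stringify(next.config) !== JSON.stringify(prev.config)) {
    // The user changed the config panel: rebuild the query from it.
    return { query: deps.build(next.query, next.config), config: next.config };
  }
  return next; // nothing changed
}

// Stand-in implementations for the demo only.
const toyDeps: SyncDeps = {
  parse: (query) => {
    const m = /stats (\w+)\((\w*)\)(?: by (.+))?$/.exec(query);
    if (!m) throw new Error("query not supported by this toy parser");
    return {
      fn: m[1] ?? "",
      field: m[2] ?? "",
      groupBy: m[3] ? m[3].split(", ") : [],
    };
  },
  build: (query, config) => {
    const head = query.split(" | stats ")[0] ?? "";
    const by = config.groupBy.length > 0 ? ` by ${config.groupBy.join(", ")}` : "";
    return `${head} | stats ${config.fn}(${config.field})${by}`;
  },
};

const syncedBefore: VisState = {
  query: "source = logs | stats count() by host",
  config: { fn: "count", field: "", groupBy: ["host"] },
};

// Case 1: the query text was edited, so the config follows.
const afterQueryEdit = syncOnUpdate(
  syncedBefore,
  { ...syncedBefore, query: "source = logs | stats avg(bytes) by region" },
  toyDeps
);

// Case 2: the config panel was edited, so the query follows.
const afterConfigEdit = syncOnUpdate(
  syncedBefore,
  { ...syncedBefore, config: { fn: "max", field: "bytes", groupBy: ["host", "region"] } },
  toyDeps
);

console.log(afterQueryEdit.config, afterConfigEdit.query);
```

Keeping the last synced state around (`prev`) is what lets the function tell which side the user touched.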
[to-do]
Is your feature request related to a problem? We need the UI widgets to rewrite and append/extend the PPL stats and other grammar, so that they can issue new queries to render visualization changes.
What solution would you like? We need a library that can do the rewrite operations, preferably with a syntax tree.
e.g. https://blog.dangl.me/archive/creating-antlr-applications-in-typescript/