Open aleksandr-zakshevskii-n26 opened 6 months ago
Hi @aleksandr-zakshevskii-n26,
I haven't had the chance to read through your PR thoroughly yet, but it looks quite interesting. I wanted to let you know that we are working on a different matching language to make syntax matching easier for most use cases. As you mentioned, tree-sitter queries can be quite challenging, even for straightforward tasks.
With concrete syntax matching, what you see is what you match. For example, the code snippet:
featureService.isEnabled(FeatureFlagWrapper(FeatureFlags.STALE_FEATURE_FLAG))
can be matched straightforwardly with concrete syntax like this:
cs :[var].isEnabled(:[wrapper](:[your_stale_flag]))
If you're interested in learning more, please check out this PR. Additionally, we have some documentation on concrete syntax, although it is somewhat outdated and not comprehensive. You can find it here.
Note however this matching mode is experimental.
@danieltrt thanks for the pointers.
Any recommendations on the following scenarios?
val ff = FeatureFlagWrapper(FeatureFlags.STALE_FEATURE_FLAG)
featureService.isEnabled(ff)
// file 1
fun doSomething(val args: Args, val ff: FeatureFlagWrapper) {
if (featureService.isEnabled(ff)) {
// do something
} else {
// do something else
}
}
// file 2
val ff = FeatureFlagWrapper(FeatureFlags.STALE_FEATURE_FLAG)
doSomething(ff)
Hi @mrhooray !
For the first case, if your feature flag reference and the call to isEnabled
are within the same scope, you can still get cleanups. For example:
from polyglot_piranha import execute_piranha, PiranhaArguments, Rule, RuleGraph, OutgoingEdges
# Original code snippet
code = """
void some_method() {
var ff = FeatureFlagWrapper(FeatureFlags.STALE_FEATURE_FLAG);
if (featureService.isEnabled(ff) || x > 0) {
// do something
} else {
// do something else!
}
}
"""
# Define the rule to replace the method call
r1 = Rule(
name="find_flag_reference",
query="cs var :[reference] = FeatureFlagWrapper(FeatureFlags.STALE_FEATURE_FLAG);",
replace_node="*",
replace="",
)
r2 = Rule(
name="propagate_changes",
query="cs :[service].isEnabled(:[reference])", # Getting the reference from r1
replace_node="*",
replace="true",
holes ={"reference"},
is_seed_rule=False
)
# Define the edges for the rule graph.
# In this case, boolean_literal_cleanup is already defined for java [see src/cleanup_rules]
edge1 = OutgoingEdges("find_flag_reference", to=["propagate_changes"], scope="Method")
edge2 = OutgoingEdges("propagate_changes", to=["boolean_literal_cleanup"], scope="parent")
# Create Piranha arguments
piranha_arguments = PiranhaArguments(
code_snippet=code,
language="java",
rule_graph=RuleGraph(rules=[r1, r2], edges=[edge1, edge2])
)
# Execute Piranha and print the transformed code
piranha_summary = execute_piranha(piranha_arguments)
print(piranha_summary[0].content)
For the second case, I'm not sure. If doSomething
is a method that can take any FeatureFlagWrapper, cleaning up might be tricky since we can't be sure what flag is passed. Nor if it is safe to do it:
val ff = FeatureFlagWrapper(FeatureFlags.STALE_FEATURE_FLAG)
doSomething(ff)
val ff2 = FeatureFlagWrapper(FeatureFlags.NOT_A_STALE_FEATURE_FLAG)
doSomething(ff2)
Thanks for the response and the CST example, @danieltrt
During an execution session, Piranha maintains a symbol table with all the captured nodes, allowing you to reuse their values in subsequent rule applications. To do this, you need to declare them as "holes" in the rule (holes = {"reference"}
)
If you're sure the captured value is only used once in the codebase, you can create a rule for inlining. You just need to find the method declaration first:
find_flag_reference = Rule(
name="find_flag_reference",
query="cs var :[reference] = FeatureFlagWrapper(FeatureFlags.STALE_FEATURE_FLAG);",
replace_node="*",
replace=""
)
find_usage = Rule(
name="find_usage",
# This finds doSomething(ff)
query="cs :[method_name](:[reference])",
holes={"reference"},
replace_node="*",
replace=""
)
find_method_decl_with_usage = Rule(
name="find_method_decl_with_usage",
# This finds doSomething's declaration. This is a match only rule, no replacement
query="cs fun :[method_name](val :[wrapper]: FeatureFlagWrapper) { :[body+] }",
holes={"method_name"},
)
cleanup_actual_usages = Rule(
name="cleanup_actual_usages",
# Regular cleanup
query="cs :[_].isEnabled(:[wrapper])",
replace_node="*",
replace="true",
holes={"wrapper"},
is_seed_rule=False
)
edge1 = OutgoingEdges("find_flag_reference", to=["find_usage"], scope="Method")
# Specify you want to search in your entire code base, to find the method declaration
edge2 = OutgoingEdges("find_usage", to=["find_method_decl_with_usage"], scope="Global")
# Find the actual usages of is_enabled within the method declaration from above
edge3 = OutgoingEdges("find_method_decl_with_usage", to=["cleanup_actual_usages"], scope="Method")
edge4 = OutgoingEdges("cleanup_actual_usages", to=["boolean_literal_cleanup"], scope="Parent")
Sorry for the late response. Thank you very much for your responses, I will finalize my approach and will get back to you. Best, Alex
Hello team đź‘‹,
I wanted to get your help to validate my approach for cleaning up obsolete feature flags using your amazing Piranha framework.
Background
We work with various
Kotlin
codebases where the handling of feature flags may vary by team. My goal was to develop a set of universal Piranha rules to cover all possible feature flag invocations. We are usingPiranha Polyglot
version.Current Challenge
Here is one of the examples of how feature flags could be used:
In the example above, a feature flag is a parameter to an enumeration item. However, feature flags can also be stored as an element of a sealed class or as a simple string. I found that the more patterns I want to capture, the more challenging it becomes to maintain the rules and edges.
Problems Faced
1. Complexity of Tree-Sitter Queries
Here is an example of tree-sitter query to capture an invocation call of a feature flag:
The corresponding
tree-sitter
query to match the feature flag:In case if a feature flag is wrapped with a constructor call, as follows:
The tree-sitter query becomes significantly more complex for visual perception.
2. Redundancy in Queries:
Let's consider two examples:
the corresponding tree-sitter query will be:
and
the corresponding tree-sitter query will be:
As can be seen from the
tree-sitter
queries above - both queries differ only by a identifier part, but the enumeration part is exactly the same.3. Repetition in flows between rules
The more rules it becomes - I noticed the same repetition patterns between them. For example if a flag is defined in a sealed class the possible flow of the rules could be the following
However flag could be also defined using an enum, and the flow will be exactly the same.
Idea
My idea is to create a script that uses rule templates and graph-encoded relationships to generate
rules.toml
andedges.toml
files. This approach aims to:Example
Here is an example of a feature flag invocation:
is_flag_enabled
boolean functionAs the result - it would be expected that:
Representing the edges as directed graph
The graph represents rule templates and their relationships, where paths from starting nodes to leaves represent rule flows
The green nodes represent the starting points used for the initial feature flag extraction. Each path in the graph indicates a sequence of rule applications. These paths are constructed recursively from the starting node to the leaves. For example, for the path
A -> B -> C
, the following edges will be generated:To demonstrate this approach I've implemented a simple python script and configurations. Python script
The script is a proof of concept rather than a finished solution. It takes 2 configuration files as parameters:
rule_templates.toml
- contains tree-sitter rule templatesedges_templates.toml
- contains the directed graph, which stores the rules as nodes and relations between them, it is used to generate fileedges.toml
This script produces the following files as the result:
rules.toml
- contains auto-generated rules which then applied to the tested filesedges.toml
- contains auto-generated edgesYou can find all relevant files along with the tests in this PR
I would greatly appreciate your feedback and thoughts on this approach. Thank you for your time and consideration.