Open n1ckl0sk0rtge opened 1 month ago
Another problem related to this mechanism with TraceSymbols:
Problem 4: Depending detection rules of method detection rules:
    parameters = dh.generate_parameters(generator=2, key_size=2048)
    server_private_key = parameters.generate_private_key()  # Noncompliant {{GENERATION}}
Here, the rule entry point is `generate_private_key`. This rule is a method detection rule (as there are no parameters), and it has a depending detection rule to look for `generate_parameters` in the method scope. Indeed, if `generate_parameters` is found, we can obtain more information (like the key size). After the detection of `generate_private_key`, the engine will call the depending detection rule in `onReceivingNewDetection`, in which the engine will use the TraceSymbol `NO_SYMBOL` for the depending detection rule. Indeed, as the depending detection rule is defined on a method, there is no relevant symbol. However, while the depending rule will find `generate_parameters`, this detection has a TraceSymbol `SYMBOL, parameters`. The TraceSymbol filtering will therefore remove this finding, as it was expecting a finding with `NO_SYMBOL`.
This behaviour seems to be a bug more than a limitation of the TraceSymbol mechanism. If a depending detection rule is added to the method of a rule (and not to a parameter), the TraceSymbol filtering should be completely bypassed. For now, this can be done by using `SYMBOL_IGNORED` instead of `NO_SYMBOL` in this case, even though this is not what `SYMBOL_IGNORED` was intended for.
In the future, it may be even better to have a way to link these two function calls using `parameters`, which is the invoked object of the parent rule and the result variable of the child rule. But this may be hard to implement and use in practice.
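To make Problem 4 concrete, here is a minimal Python sketch of the filtering behaviour described above. It is a toy model, not the engine's code: the `Kind`, `TraceSymbol`, `Finding` and `filter_findings` names are hypothetical, and only the three symbol kinds (`SYMBOL`, `NO_SYMBOL`, `SYMBOL_IGNORED`) come from this discussion.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import Optional


class Kind(Enum):
    SYMBOL = auto()          # the finding is tied to a named variable
    NO_SYMBOL = auto()       # the finding is tied to no variable
    SYMBOL_IGNORED = auto()  # the filtering is bypassed


@dataclass(frozen=True)
class TraceSymbol:  # hypothetical model of the engine's TraceSymbol
    kind: Kind
    name: Optional[str] = None


@dataclass(frozen=True)
class Finding:  # hypothetical: one detected call and its symbol
    call: str
    symbol: TraceSymbol


def filter_findings(expected, findings):
    """Toy filter: keep only findings whose symbol equals the expected one."""
    if expected.kind is Kind.SYMBOL_IGNORED:
        return list(findings)
    return [f for f in findings if f.symbol == expected]


# generate_parameters(...) is assigned to `parameters`, so its finding
# carries the symbol (SYMBOL, parameters).
child = Finding("generate_parameters", TraceSymbol(Kind.SYMBOL, "parameters"))

# The parent rule is a method detection rule, so the engine expects NO_SYMBOL.
dropped = filter_findings(TraceSymbol(Kind.NO_SYMBOL), [child])
# The workaround: expect SYMBOL_IGNORED instead.
kept = filter_findings(TraceSymbol(Kind.SYMBOL_IGNORED), [child])

print(len(dropped), len(kept))
```

With `NO_SYMBOL` the only finding is filtered out, while `SYMBOL_IGNORED` lets it through, which matches the behaviour and workaround reported above.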
Another problem where TraceSymbols work, but where we are too limited by the scope of search:
Problem 5: class variables (with `self`) [example inspired by this code]:

    class SHA256WithRSA:
        def __init__(self):
            self.private_rsa = rsa.generate_private_key(public_exponent=65537, key_size=2048)

        def sign(self, data: bytes) -> bytes:
            signature = self.private_rsa.sign(
                data=data,
                padding=padding.PKCS1v15(),
                algorithm=hashes.SHA256()
            )
            return signature

When identifying `rsa.generate_private_key` and looking for the depending `sign` detection rule, we define the TraceSymbol `SYMBOL, private_rsa`. This is indeed the symbol on which the `sign` function from the library will be called. However, the search scope for the `sign` function will be limited to the `__init__` function, so `sign` will not be detected. The same problem should exist in Java with class attributes; how is it handled there?
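The scope limitation in Problem 5 can be reproduced outside the engine with Python's standard `ast` module. This is only an illustration under assumptions: the helper `calls_on_self_attr` is hypothetical, and the engine's actual tree visitor works differently. The sketch searches for `self.private_rsa.sign(...)` first inside `__init__` only, then inside the whole class.

```python
import ast

# The class from Problem 5, kept as text: it is only parsed, never executed.
SOURCE = '''
class SHA256WithRSA:
    def __init__(self):
        self.private_rsa = rsa.generate_private_key(public_exponent=65537, key_size=2048)

    def sign(self, data):
        return self.private_rsa.sign(
            data=data, padding=padding.PKCS1v15(), algorithm=hashes.SHA256()
        )
'''


def calls_on_self_attr(scope, attr, method):
    """Return calls of the form self.<attr>.<method>(...) found inside `scope`."""
    hits = []
    for node in ast.walk(scope):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        if (isinstance(func, ast.Attribute) and func.attr == method
                and isinstance(func.value, ast.Attribute)
                and func.value.attr == attr
                and isinstance(func.value.value, ast.Name)
                and func.value.value.id == "self"):
            hits.append(node)
    return hits


class_def = ast.parse(SOURCE).body[0]
init_def = next(n for n in class_def.body
                if isinstance(n, ast.FunctionDef) and n.name == "__init__")

# Scope limited to the enclosing method of the first detection.
in_method = calls_on_self_attr(init_def, "private_rsa", "sign")
# Scope widened to the enclosing class.
in_class = calls_on_self_attr(class_def, "private_rsa", "sign")

print(len(in_method), len(in_class))
```

The search restricted to `__init__` finds nothing, while the class-wide search finds the `sign` call. This suggests that a class-level scope for symbols accessed through `self` would be one way to cover this case, at the cost of a wider search.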
Context
When writing a rule using the `DetectionRuleBuilder`, one can use the `shouldBeDetectedAs` function to directly resolve a specific parameter, and one can use the `addDependingDetectionRules` function to resolve a parameter with another rule (for example, when this parameter is a method invocation, using a depending detection rule allows resolving a specific parameter of that function call). This issue is about the behaviour of the static analysis when resolving the following case (written here in Python, but the issue is also valid for Java):

    func1(func2("RSA"))

To do so, one should write a rule R1 to detect `func1`, with a depending detection rule R2 on its parameter to detect `func2`. Then R2 resolves the content of the parameter of `func2`, here "RSA", using `shouldBeDetectedAs`. Currently (at main 0cd69ef79984b2c91bd0e06b07fb5488a2ea6473 and feature/python-support f5cdfe445fd4e5ea87ba6d0e886b84f02ad0c7cd), this type of resolution relies on a complex mechanism that is quite limited.
Current mechanism
The detection process starts by visiting all method invocations (and class instantiations). For each invocation, a detection executive is started for each registered "entry" rule: R1 in our case. We will therefore look for all `func1` calls in the current scope. Once we have a match, the function `analyseExpression` is called, and there are several cases:
- If the rule has a `DetectableParameter` (coming from `shouldBeDetectedAs`), meaning that we want to resolve a parameter of the detected function call, we call the subsequent handling logic the case of a detectable parameter.
- If the rule has no `DetectableParameter` despite having parameters, it is probably a rule written for an intermediary function that does not directly carry information, but may have depending detection rules. We call the subsequent handling logic the case of an intermediary parameter.
In our example, `func1` brings no information other than through the content of its parameter `func2("RSA")`. R1 is written as a rule with a depending detection rule R2 on its parameter, but without a `shouldBeDetectedAs` part. We therefore enter the case of an intermediary parameter in the function `analyseExpression`.
Case of an intermediary parameter
In this case, depending on the implementation, we have the choice to call `onDetectedDependingParameter` with two possible scopes: `EXPRESSION` or `ENCLOSED_METHOD`. Basically, it will resolve the depending detection rule(s) by visiting either the parameter tree (`func2("RSA")` in our case) or the enclosing method of this parameter (the function in which this expression appears).
In our example, we want to resolve this particular `func2("RSA")` call, so using the `EXPRESSION` scope makes sense. `func2("RSA")` will be the only match of rule R2, which will correctly resolve the parameter content "RSA". This resolution will go through the case of a detectable parameter explained below. This example worked as expected!
Case of a detectable parameter
Let's look at a new example:
In this example, `func1` still does not bring information other than through the content of its parameter `var = ec.ECDSA("SHA_256")`. R1' (to detect `func1`) is written as a rule with a depending detection rule R2' for its parameter, but now also has a `shouldBeDetectedAs` part aiming at resolving the value "ECDSA" (the name of the function used as parameter of `func1`). R2' (to detect `ec.ECDSA`) resolves the content of the parameter of `ec.ECDSA`, here "SHA_256", using `shouldBeDetectedAs`. Upon detection of `func1` with rule R1', because the parameter of `func1` is a detectable parameter, we enter the case of a detectable parameter.
Let's say that resolution works as expected in this case ("ECDSA" has been detected). Then, `analyseExpression` will call `onReceivingNewDetection`, which will call the next rules (here R2') with `followNextRules`. For now, there is no choice of scope, so it is necessarily the scope of the enclosing method. This means that it will look for a call of `ECDSA` in the entire scope of the method: how can we make sure that this call is linked to our parameter `var` of `func1`? To do so, the current implementation uses a TraceSymbol. `onReceivingNewDetection` will obtain the symbol of the parameter of `func1`, here `SYMBOL, var`, and will pass it as an argument to `followNextRules`.
Then, all function calls in the scope of the enclosing method will be matched against R2', which will detect the two `ECDSA` function calls, one assigned to `var` and the other to `other_var`. Using the previous TraceSymbol, we filter the findings to only keep the finding linked to `var`. In our case, this works and this example will correctly resolve "SHA_256".
A better approach
Could a better and simpler approach be found and replace this complex mechanism (currently based on limited TraceSymbols and varying scopes of resolution)?
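As a baseline for comparing alternatives, the current behaviour in the case of a detectable parameter can be modelled in a few lines with Python's `ast` module. This is a rough sketch under assumptions, not the engine's implementation: the function `resolve`, the wrapper function `enclosing` and the argument `"SHA_1"` of the second call are hypothetical placeholders, and the trace symbol is reduced to a plain variable name.

```python
import ast

# Hypothetical snippet in the spirit of the example above: `enclosing` and
# the "SHA_1" argument are placeholders. It is only parsed, never executed.
SOURCE = '''
def enclosing():
    var = ec.ECDSA("SHA_256")
    other_var = ec.ECDSA("SHA_1")
    func1(var)
'''


def resolve(source, entry="func1", dependent="ECDSA"):
    method = ast.parse(source).body[0]  # scope: the enclosing method

    # Step 1: find the entry call and take the symbol of its first argument.
    entry_call = next(
        n for n in ast.walk(method)
        if isinstance(n, ast.Call)
        and isinstance(n.func, ast.Name)
        and n.func.id == entry
    )
    trace_symbol = entry_call.args[0].id

    # Step 2: match the depending rule against every assignment in the method.
    findings = []
    for node in ast.walk(method):
        if (isinstance(node, ast.Assign)
                and isinstance(node.value, ast.Call)
                and isinstance(node.value.func, ast.Attribute)
                and node.value.func.attr == dependent):
            findings.append((node.targets[0].id, node.value.args[0].value))

    # Step 3: keep only the findings whose assigned variable is the trace symbol.
    kept = [value for name, value in findings if name == trace_symbol]
    return findings, kept


findings, kept = resolve(SOURCE)
print(findings)
print(kept)
```

Both `ECDSA` calls are matched in the method scope, and only the one assigned to `var` survives the filtering. The sketch also makes the limits visible: it only works because the argument of `func1` is a simple variable assigned in the same method.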