github / codeql

CodeQL: the libraries and queries that power security researchers around the world, as well as code scanning in GitHub Advanced Security
https://codeql.github.com
MIT License

Python: Local/Global dataflow analysis not tracing class field? #17021

Open hksdpc255 opened 4 months ago

hksdpc255 commented 4 months ago

Python

class Cls:
    def __init__(self) -> None:
        self.field = 1
    def __init__(self, num) -> None:
        self.field = num
    def print(self) -> None:
        print(self.field)

if __name__ == '__main__':
    var1 = Cls(2)
    var2 = var1
    var2.field = 3
    var1.print()
    var1.field2 = 4
    print(var2.field2)
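For reference, the runtime behaviour of this snippet can be checked with a small sketch like the one below. It is only an illustration: the helper `run_reproduction` and the stdout capture are assumptions added here, not part of the original report. Note that Python keeps only the last `__init__` defined in a class body, so the no-argument version (and its `self.field = 1`) is never called at runtime.

```python
import contextlib
import io


class Cls:
    def __init__(self) -> None:  # replaced by the definition below
        self.field = 1

    def __init__(self, num) -> None:  # the only __init__ left at runtime
        self.field = num

    def print(self) -> None:
        print(self.field)


def run_reproduction() -> list:
    """Run the statements from the __main__ block and return the printed tokens."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        var1 = Cls(2)
        var2 = var1         # alias: both names refer to one object
        var2.field = 3      # write through the alias
        var1.print()        # read through the original name
        var1.field2 = 4     # new attribute set through the original name
        print(var2.field2)  # read through the alias
    return buffer.getvalue().split()


printed = run_reproduction()

# The first __init__ no longer exists, so a no-argument call fails.
try:
    Cls()
    no_arg_init_works = True
except TypeError:
    no_arg_init_works = False
```

At runtime the values that reach `print` are therefore 3 and 4, both of which pass through the `var1`/`var2` alias.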

CodeQL

import python
import semmle.python.ApiGraphs
import semmle.python.dataflow.new.DataFlow
import semmle.python.dataflow.new.TaintTracking

module MyConf implements DataFlow::ConfigSig {
    predicate isSource(DataFlow::Node source) {
        source.asExpr() instanceof IntegerLiteral
    }
    predicate isSink(DataFlow::Node sink) {
        sink = API::builtin("print").getACall().getArg(0)
    }
}

module MyFlow = DataFlow::Global<MyConf>;

from DataFlow::Node source, DataFlow::Node sink
where MyFlow::flow(source, sink)
select source, sink

Output

source sink
1 self.field in line 7
2 self.field in line 7

Expected result

source sink
1 self.field in line 7
2 self.field in line 7
3 self.field in line 7
4 var2.field2 in line 15

aibaars commented 4 months ago

Perhaps the problem is that CodeQL does not "see" that var1 and var2 are references to the same object. What happens if you drop the var2 = var1 assignment and use var1 in all the places where it currently says var2?
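A minimal sketch of that alias-free variant might look as follows. The `main` wrapper and its return value are assumptions added for illustration, and the duplicate `__init__` is dropped for brevity. Whether the query then reports 3 and 4 is not shown in this thread.

```python
class Cls:
    def __init__(self, num) -> None:
        self.field = num

    def print(self) -> None:
        print(self.field)


def main() -> Cls:
    # Same statements as the original reproduction, with every use of
    # var2 replaced by var1 so that no alias is involved.
    var1 = Cls(2)
    var1.field = 3
    var1.print()
    var1.field2 = 4
    print(var1.field2)
    return var1


if __name__ == '__main__':
    main()
```

Running the query against this version and against the original would isolate whether the alias is what stops the flow.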

hksdpc255 commented 4 months ago

That's the problem. CodeQL does not "see" that var1 and var2 are references to the same object.
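To make the aliasing concrete, here is a small sketch of what "references to the same object" means at runtime. The name `var3` and the use of `copy.copy` are introduced here purely for contrast and do not appear in the original code.

```python
import copy


class Cls:
    def __init__(self, num) -> None:
        self.field = num


var1 = Cls(2)
var2 = var1             # alias: a second name for the same object
var3 = copy.copy(var1)  # hypothetical contrast: a separate object

var2.field = 3          # write through the alias

same_object = var1 is var2        # the alias shares identity with var1
distinct_copy = var1 is not var3  # the copy does not
```

After the write, `var1.field` is 3 while `var3.field` is still 2. An analysis that follows the write through `var2` therefore has to treat `var1` and `var2` as the same object.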

hksdpc255 commented 3 months ago

So, is there any plan to fix this bug?