kamahen / pykythe

Generate code Python source cross-reference facts in Kythe format
Other
21 stars 3 forks source link

Maybe add SSA analysis #20

Open kamahen opened 5 years ago

kamahen commented 5 years ago

Currently, a local variable is give a single FQN (fully qualified name) throughout an entire function. This leads to some bad inferences, such as here (reduced from lib2to3.fixer_util.Assign):

def Assign(source):
    if not isinstance(source, list):
        source.prefix = " "
        source = [source]
    return Node(syms.atom, [Leaf(token.EQUAL)] + source)

In this particular case, the analysis would be better if the type annotations were precise:

def Assign(target: Union[Node,Leaf], List[Union[Node,Leaf]]) -> Node:

Alternatively, it would help if pykythe used isinstance in its analysis. But in general, multiple bindings of a local variable are problematic.

SSA uses the Φ (Phi) function to combine the results of multiple branches, but there's nowhere to anchor the Φ assignment:

def Assign(source):
    if not isinstance(source, list):
        source.prefix = " "
        source2 = [source]
    source3 = Φ(source, source2)
    return Node(syms.atom, [Leaf(token.EQUAL)] + source3)

It seems that this requires a change to the Kythe data model, to allow a ref to be to multiple places:

#- @source defines/binding Source_param
def Assign(source):
    #- @source ref Source_param
    if not isinstance(source, list):
        #- @source ref Source_param
        source.prefix = " "
        #- @#0source defines/binding Source2
        #- @#1source ref Source_param
        source = [source]
    #- @source ref Source_param
    #- @source ref Source2
    return Node(syms.atom, [Leaf(token.EQUAL)] + source)

This discussion probably doesn't apply to class/instance variables or to global (module) variables; they can be changed anywhere, so changing their type is usually bad style (the one exception being the use of None for uninitialized; but that's captured by the Optional[...] type annotation).