github / semantic

Parsing, analyzing, and comparing source code across many languages

Extract variables from Python code #674

Closed: raffaelepojer closed this issue 2 years ago

raffaelepojer commented 2 years ago

Hi all! I'm currently trying to extract all variable names and function parameter names from a piece of Python code with semantic.

For example, given the following Python code, I need to extract variable1, variable2, arg1 and arg2:

def function(self, arg1, arg2):
    variable1 = something... 
    variable2 = something... 

Going by some examples that use Python's ast module, the code for doing that could look something like this:

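import ast as AstLib  # assuming AstLib is an alias for the standard ast module

varList = []  # collected names; `code` is the Python source as a string
root = AstLib.parse(code)
for node in AstLib.walk(root):
    # a Name in Store context is a name being assigned to, i.e. a variable
    if isinstance(node, AstLib.Name) and isinstance(node.ctx, AstLib.Store):
        varList.append(node.id)
    # positional parameters are listed on the FunctionDef node
    elif isinstance(node, AstLib.FunctionDef):
        for arg in node.args.args:
            varList.append(arg.arg)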

Does anyone know how to do this using the graph produced by semantic? Thanks
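
For reference, the ast-based approach described in the question can be packaged as a self-contained function and run against the sample code. This is only a sketch using Python's standard ast module, not anything produced by semantic. The helper name extract_names is hypothetical, and the right-hand sides in the sample are placeholders standing in for the elided "something...". Unlike the snippet in the question, it also picks up positional-only, keyword-only, *args and **kwargs parameters, which is an assumption about what "all parameters" should mean here.

```python
import ast


def extract_names(source: str) -> list[str]:
    """Collect assigned variable names and function parameter names.

    Hypothetical helper: a sketch of the ast-based approach, not part of semantic.
    """
    names = []
    for node in ast.walk(ast.parse(source)):
        # A Name node in Store context is a name being bound (assignment,
        # loop target, with-target, and so on).
        if isinstance(node, ast.Name) and isinstance(node.ctx, ast.Store):
            names.append(node.id)
        # Parameters are stored on the function definition itself.
        elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            args = node.args
            params = args.posonlyargs + args.args + args.kwonlyargs
            if args.vararg is not None:
                params.append(args.vararg)
            if args.kwarg is not None:
                params.append(args.kwarg)
            names.extend(p.arg for p in params)
    return names


# Placeholder right-hand sides stand in for the elided "something...".
sample = """
def function(self, arg1, arg2):
    variable1 = arg1 + 1
    variable2 = arg2 * 2
"""

print(extract_names(sample))
# ['self', 'arg1', 'arg2', 'variable1', 'variable2']
```

Note that self is reported as well, since it is an ordinary parameter as far as the AST is concerned; filter it out if it is not wanted. ast.walk visits nodes breadth-first, so the names are not guaranteed to come back in source order for more deeply nested code.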

patrickt commented 2 years ago

Hi there. Right now Semantic’s tree structure is a bit user-unfriendly (since we’re designing an alternate representation, we haven’t sanded all the rough edges off the current one). If possible, I would recommend checking out tree-sitter and its associated query language. Such an operation should be a fairly simple query; you can find the docs here: https://tree-sitter.github.io/tree-sitter/using-parsers#pattern-matching-with-queries

The way to do it in Haskell would be to use uniplate and syb to write a generic top-down function, and use the cast operation from Typeable for your isinstance checks. I think we should keep this open and provide Plated instances for the types, at least until we integrate the higher-kinded approach and (possibly?) recover some way of doing recursion schemes. But for your case, I definitely advise seeing if you can use the tree-sitter query language.

raffaelepojer commented 2 years ago

Thanks @patrickt for the quick response and the detailed answer. I will check out tree-sitter.