Open Oidaho opened 2 weeks ago
Right now, your node is a single, very static class that cannot be transformed.
Drop the list of node types implemented via Enum and create a separate class for each possible node:
class ASTNode:
    """The base class for all AST nodes."""
    pass

class Program(ASTNode):
    """Acts as the root of the AST."""
    def __init__(self, statements):
        self.statements = statements  # List of statements

class Comment(ASTNode):
    def __init__(self, text):
        self.text = text

class Assignment(ASTNode):
    def __init__(self, target, operator, value):
        self.target = target      # Target identifier
        self.operator = operator  # '=', '+=' or something else
        self.value = value        # Assigned value

class Identifier(ASTNode):
    def __init__(self, name):
        self.name = name

class Literal(ASTNode):
    def __init__(self, value):
        self.value = value

# And so on....
It is much easier to build an AST from a ready-made sequence of tokens.
I suggest implementing an add_token() method in the ASTBuilder class that only appends tokens to the class's internal buffer.
After that, implement a build() method that converts the buffered list of tokens into a queue using collections.deque and returns the finished syntax tree.
I see it like this:
ast_builder = ASTBuilder()
for token in ...:
    ast_builder.add_token(token)
ast: ASTNode = ast_builder.build()
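The add_token()/build() split described above can be sketched roughly as follows. This is only a minimal illustration, not the required solution: the token shape (a dict with 'type' and 'value' keys), the private _tokens buffer, and the use of dataclasses for the two node types are all assumptions made for the example, and only comment and newline tokens are handled.

```python
from collections import deque
from dataclasses import dataclass


class ASTNode:
    """Base class for all AST nodes."""


@dataclass
class Program(ASTNode):
    statements: list


@dataclass
class Comment(ASTNode):
    text: str


class ASTBuilder:
    """Hypothetical builder: buffers tokens first, parses them only in build()."""

    def __init__(self):
        self._tokens = []  # internal token buffer (name is an assumption)

    def add_token(self, token):
        # Only store the token; no parsing happens at this point.
        self._tokens.append(token)

    def build(self):
        # Turn the buffer into a queue so tokens can be consumed from the left.
        queue = deque(self._tokens)
        statements = []
        while queue:
            token = queue.popleft()
            if token["type"] == "comment":
                statements.append(Comment(token["value"]))
            elif token["type"] == "newline":
                continue  # newlines carry no meaning in this sketch
            else:
                raise SyntaxError(f"Unexpected token: {token}")
        return Program(statements)


# Usage with a hand-written token list (assumed token format).
ast_builder = ASTBuilder()
for token in [
    {"type": "comment", "value": "hello"},
    {"type": "newline", "value": "\n"},
]:
    ast_builder.add_token(token)
ast = ast_builder.build()
```

Keeping add_token() free of parsing logic means the tokenizer and the parser stay decoupled, and build() can look ahead in the queue as far as it needs to.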
In addition, public sources on the Internet suggest an implementation along these lines:
from collections import deque

def build(self):
    # Method of ASTBuilder; self.tokens is the internal token buffer
    token_deque = deque(self.tokens)
    statements = []
    while token_deque:
        token = token_deque.popleft()
        if token['type'] == 'comment':
            # Comment node
            statements.append(Comment(token['value']))
        elif token['type'] == 'identifier':
            # Assignment node
            target = Identifier(token['value'])
            token_deque.popleft()  # Skipping whitespace
            operator = token_deque.popleft()['value']  # '=', '+='
            token_deque.popleft()  # Skipping whitespace
            value_token = token_deque.popleft()
            if value_token['type'] == 'integer':
                value = Literal(int(value_token['value']))
            else:
                raise SyntaxError("A literal was expected after the operator")
            statements.append(Assignment(target, operator, value))
        elif token['type'] == 'newline':
            continue  # Skipping newline
        else:
            raise SyntaxError(f"Unexpected token: {token}")
    return Program(statements)
This solution looks concise. But this is not exactly what we need. Try experimenting with this code and create something similar based on it.
THIS IS NOT A TASK!!!
In addition to this, there is a similar solution for generating C++ code based on AST nodes:
def generate_cpp_code(ast):
    if isinstance(ast, Program):
        return "\n".join(generate_cpp_code(stmt) for stmt in ast.statements)
    elif isinstance(ast, Comment):
        return f"// {ast.text}"
    elif isinstance(ast, Assignment):
        target = ast.target.name
        operator = ast.operator
        value = generate_cpp_code(ast.value)
        if operator == '=':
            return f"{target} = {value};"
        elif operator == '+=':
            return f"{target} += {value};"
        else:
            raise NotImplementedError(f"Operator {operator} is not supported")
    elif isinstance(ast, Identifier):
        return ast.name
    elif isinstance(ast, Literal):
        return str(ast.value)
    else:
        raise NotImplementedError(f"Node type {type(ast)} is not supported")
This code is not any good, in my opinion. However, it can be used as a reference.
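One possible way to avoid the long isinstance chain, shown here only as a sketch and not as the expected solution, is to dispatch on the node class with functools.singledispatch from the standard library. The function name to_cpp, the SUPPORTED_OPERATORS set, and the dataclass-based node definitions are assumptions made for this example.

```python
from dataclasses import dataclass
from functools import singledispatch


class ASTNode:
    """Base class for all AST nodes."""


@dataclass
class Program(ASTNode):
    statements: list


@dataclass
class Comment(ASTNode):
    text: str


@dataclass
class Identifier(ASTNode):
    name: str


@dataclass
class Literal(ASTNode):
    value: object


@dataclass
class Assignment(ASTNode):
    target: Identifier
    operator: str
    value: ASTNode


SUPPORTED_OPERATORS = {"=", "+="}  # assumed set, mirroring the reference code


@singledispatch
def to_cpp(node):
    # Fallback for node types that have no registered handler.
    raise NotImplementedError(f"Node type {type(node).__name__} is not supported")


@to_cpp.register
def _(node: Program):
    return "\n".join(to_cpp(stmt) for stmt in node.statements)


@to_cpp.register
def _(node: Comment):
    return f"// {node.text}"


@to_cpp.register
def _(node: Assignment):
    if node.operator not in SUPPORTED_OPERATORS:
        raise NotImplementedError(f"Operator {node.operator} is not supported")
    return f"{to_cpp(node.target)} {node.operator} {to_cpp(node.value)};"


@to_cpp.register
def _(node: Identifier):
    return node.name


@to_cpp.register
def _(node: Literal):
    return str(node.value)


# Usage: generate code for a small hand-built tree.
tree = Program([Comment("counter"), Assignment(Identifier("x"), "+=", Literal(5))])
cpp_source = to_cpp(tree)
```

With this layout, supporting a new node type means registering one more handler instead of extending a single large function.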
Recommendations for the task completion process
Added some files (#1)
pytest tests for your task pass
Write Syntax analyzer
Write a code parser that will make up an Abstract Syntax Tree (AST).
List of implementation tasks
test_code.py:
Console output: