Open mwermelinger opened 1 year ago
Is the PEP 8 style guide the ideal we should be aiming for in this project? Should this be the document I turn to when considering all things style related? I noticed that a couple of doc-strings etc have been modified and I would like to get them right the first time, does this document guide your choices for that kind of thing?
PEP 8 is for code and PEP 257 is about docstrings. The best is to use a tool like pylint or flake8 to check the style. Can't remember if either of them automatically uses pydocstyle.
Thanks I will certainly start using pylint
etc. I guess I wanted to try to read about the appropriate style so I can try to get it right the first time and start building good habits! There seems to be two major style guides for Python
: one is PEP and the other seems to be the google style guide.
I have been refactoring check_tree
and will just post the new code here before PR to see if you would be happy with the style (it still needs a bit of work with doc strings and type annotations). Would you be happy with the nested functions? or would you prefer separate functions with more arguments passed? or another option would be to make it into a class? And it does contain lamdas, but IMHO this time they are a bit more readable, they can be taken out but more functions would need to be added. Anyway just thought I would see what you think (if I am on the right path) before doing anymore work on it.
def check_tree(
tree: ast.AST,
constructs: tuple,
source: list,
line_cell_map: list,
errors: list
) -> None:
"""Check if tree only uses allowed constructs. Add violations to errors"""
language, options, imports, functions, methods = constructs
def location(line: int, line_cell_map: list) -> tuple[int, int]:
"""Return (0, line) if not a notebook, otherwise (cell, relative line)."""
return line_cell_map[line] if line_cell_map else (0, line)
def add_error(node: ast.AST, message: str) -> None:
"""add (cell, line, message) to errors"""
cell, line = location(node.lineno, line_cell_map)
errors.append((cell, line, message))
def handle_op(node, ops) -> None:
for op in ops:
if not isinstance(op, language):
message = CONCRETE.get(type(op), "unknown operator")
add_error(node, message)
def handle_import(node) -> None:
for alias in node.names:
if alias.name not in imports:
message = f"import {alias.name}"
add_error(node, message)
def handle_import_from(node) -> None:
if node.module not in imports:
message = f"from {node.module} import ..."
add_error(node, message)
else:
for alias in node.names:
if alias.name not in imports[node.module]:
message = f"from {node.module} import {alias.name}"
add_error(alias, message)
def handle_attribute(node) -> None:
if isinstance(node.value, ast.Name):
name = node.value
attribute = node.attr
if name.id in imports and attribute not in imports[name.id]:
message = f"{name.id}.{attribute}"
add_error(node, message)
elif hasattr(name, "resolved_annotation"):
type_name = re.match(r"[a-zA-Z.]*", name.resolved_annotation).group()
if type_name in methods and attribute not in methods[type_name]:
message = f"method {attribute} called on {type_name.lower()} {name.id}"
add_error(node, message)
def handle_call(node) -> None:
if isinstance(node.func, ast.Name):
function = node.func.id
if function in BUILTINS and function not in functions:
message = f"built-in function {function}()"
add_error(node, message)
def handle_loop_else(node, loop_type: str) -> None:
if node.orelse and loop_type not in options:
message = f"else in {loop_type}-loop"
add_error(node.orelse[0], message)
node_handlers = {
ast.BinOp: lambda node: handle_op(node, [node.op]),
ast.UnaryOp: lambda node: handle_op(node, [node.op]),
ast.BoolOp: lambda node: handle_op(node, node.values),
ast.Compare: lambda node: handle_op(node, node.ops),
ast.Import: handle_import,
ast.ImportFrom: handle_import_from,
ast.Attribute: handle_attribute,
ast.Call: handle_call,
ast.For: lambda node: handle_loop_else(node, "for"),
ast.While: lambda node: handle_loop_else(node, "while"),
}
for node in ast.walk(tree):
if isinstance(node, NO_LINE):
continue
handler = node_handlers.get(type(node))
if handler:
handler(node)
elif not isinstance(node, language):
if hasattr(node, "lineno"):
message = source[node.lineno - 1].strip()
else:
message = f"unknown construct {str(node)} at unknown line"
add_error(node, message)
Thanks for this. Internal functions have many global variables that can trip one up when reading the code. External functions will have more arguments. Maybe a class is the way to go with one public method check
and several private methods _check_op
etc. The location
and add_error
methods can probably be merged into a single private one.
I think your else loop handling needs fixing, because the OPTIONS have "for else" and "while else".
The main loop could become
if isinstance(node, NO_LINE):
pass
elif handler := node_handlers.get(type(node)):
handler(node)
elif not ...
However, I wonder if this will be too many changes for just removing two pylint messages, i.e. low ROI. Maybe better to do all the refactoring after the tool can be installed with pip and it doesn't matter whether the tool is made of 1 or 10 files.
Good spot with handle_loop_else
probably could be fixed with something simple like
def handle_loop_else(node, loop_type: str) -> None:
if node.orelse and loop_type + " else" not in options:
message = f"else in {loop_type}-loop"
add_error(node.orelse[0], message)
Yes, waiting until #11 is completed is probably the best course of action now you mention it. I am assuming that we can also remove a lot of the import conditionals once pip is up and running. In a way it is low ROI, but the returns I am getting in education are quite high :smile:
Even though
pylint
's score is > 9, it reports several issues: reusing identifiers in different scopes andcheck_tree
has too many statements, benches and local variables. This will require breaking up the check into different functions.