Kedro is a toolbox for production-ready data science. It uses software engineering best practices to help you create data engineering and data science pipelines that are reproducible, maintainable, and modular.
Description
I came up with a random idea today and wanted to run it by you.
Would a Kedro-related plugin for flake8 or ruff be a thing in your opinion? And have you heard of something like that existing?
Some ideas for rulesets to check are:
No class can be defined in a file called nodes.py
_private functions cannot be wrapped in the node constructor
Prohibit repeated sequences of, e.g., 7+ nodes; recommend leveraging modular pipelines instead.
No duplicated catalog entries (can this linter combine YAML and Python rules...?)
No catalog entries where, say, 5 long entries differ in just a few characters; encourage Dataset Factories for that.
Watch for credential leakage in parameters YAML files (for this we could probably import a rule from an existing tool; one likely exists already)
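To make the second rule more concrete, here is a rough sketch of how it might be detected with Python's standard `ast` module. The helper name `find_private_node_funcs` and the sample source are made up for illustration. The sketch also assumes the wrapped function is either the first positional argument or the `func` keyword of a call named `node`, and it only matches by name, so it would not see aliased imports.

```python
import ast


def find_private_node_funcs(source: str) -> list[tuple[int, str]]:
    """Return sorted (line, name) pairs for _private functions passed to node(...)."""
    hits = []
    for call in ast.walk(ast.parse(source)):
        if not isinstance(call, ast.Call):
            continue
        # Match both `node(...)` and `something.node(...)` by name only.
        callee = call.func
        if isinstance(callee, ast.Name):
            callee_name = callee.id
        elif isinstance(callee, ast.Attribute):
            callee_name = callee.attr
        else:
            continue
        if callee_name != "node":
            continue
        # Assumption: the wrapped function is the first positional
        # argument or the `func` keyword argument.
        wrapped = list(call.args[:1])
        wrapped += [kw.value for kw in call.keywords if kw.arg == "func"]
        for arg in wrapped:
            if isinstance(arg, ast.Name) and arg.id.startswith("_"):
                hits.append((arg.lineno, arg.id))
    return sorted(hits)


# Hypothetical pipeline source; it is only parsed, never imported or run.
SAMPLE = '''\
from kedro.pipeline import node, pipeline

def create_pipeline():
    return pipeline([
        node(_clean, inputs="raw", outputs="clean"),
        node(func=train, inputs="clean", outputs="model"),
    ])
'''

for line, name in find_private_node_funcs(SAMPLE):
    print(f"line {line}: private function {name!r} wrapped in node()")
```

Because the check works purely on the syntax tree, it does not need Kedro installed, which is probably what a linter rule would want anyway.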
Context
I think this can simplify collaboration in large projects and help encourage proper usage of Kedro concepts.
Possible Implementation
Check how other plugins are made and implement something similar. There are tens of rulesets defined for ruff and hundreds (if not thousands?) of plugins for flake8, so I'm sure there are good reference examples.
For example, this is a flake8 plugin that implements a ton of custom rules.
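As a starting point, a flake8 plugin for the first rule could look roughly like the sketch below. It follows the usual flake8 AST-plugin shape: a class that receives `tree` and `filename` by parameter name and whose `run()` yields `(line, column, message, type)` tuples. The plugin name `flake8-kedro`, the class name `KedroChecker`, and the code `KED001` are all invented here, not anything that exists today.

```python
import ast
from pathlib import Path


class KedroChecker:
    """Hypothetical flake8 plugin flagging classes defined in nodes.py."""

    name = "flake8-kedro"  # made-up plugin name
    version = "0.0.1"

    def __init__(self, tree: ast.AST, filename: str = "(none)") -> None:
        # flake8 fills these in based on the parameter names.
        self.tree = tree
        self.filename = filename

    def run(self):
        # Only files literally named nodes.py are in scope for this rule.
        if Path(self.filename).name != "nodes.py":
            return
        for item in ast.walk(self.tree):
            if isinstance(item, ast.ClassDef):
                yield (
                    item.lineno,
                    item.col_offset,
                    "KED001 class defined in nodes.py",  # made-up code
                    type(self),
                )


# Quick check without flake8: parse a snippet and run the checker by hand.
tree = ast.parse("class Helper:\n    pass\n")
for line, col, message, _ in KedroChecker(tree, "pipelines/dp/nodes.py").run():
    print(f"{line}:{col} {message}")
```

To have flake8 pick it up, the class would presumably be registered under the `flake8.extension` entry point group in the package metadata, with the entry name being the code prefix (here `KED`). The YAML-based rules would likely need a different mechanism, since this interface only sees Python files.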
Possible Alternatives
Do not do it :)