Closed merelcht closed 9 months ago
@AntonyMilneQB Raised a good question about whether the pre-commit hooks actually still work without the file being at the root of the repo.
Having .pre-commit-config.yaml
at the root of the repo is also recognizable by other tools, such as pre-commit.ci (which we could consider using).
There's no great answer for this, but I've seen https://github.com/pre-commit/pre-commit/issues/466#issuecomment-274282684. Keep in mind that the pre-commit author also says that it's not really designed for monorepos (although it can be made to work).
@noklam raised a good point:
The main drawback is if you are working on the smaller repositories like kedro-airflow , you still have to run the linting for the full datasets repo which takes more time
Maybe it makes sense to just use something like tox to manage the way you run tests in something like this?
In order to install the Kedro dependency correctly, we need to add pip install -r requirements.txt
under install-test-requirements
in the case that $plugin
is equal to kedro-datasets
in the kedro-plugins Makefile
.
I would be open to putting together a PoC with tox
, if you (others) are interested @jmholzer.
The below breakdown of challenges and solutions aims to:
kedro-plugins
as a lightweight testbed for tooling for better practices/other improvements, rather than on the more complex main kedro
repo (secondary motivation)pre-commit
configuration is maintained per-plugin
pre-commit
doesn't have first-class monorepo support (e.g. support hierarchical config); see the author's guidance on monorepo supportkedro_plugins
conda environment for every plugin in CI is not reasonably transferrable to local environment[kedro-docker]
on PR/issues (nonstandard solution)tox
per-plugin to allow testing a plugin across build environments using a single command (tox
) -- rough estimate: 2 days, risks: conda-installed packages may need to be handled differently, accessing pre-commit config above plugin root is less cleantox4
for now until more confident about broader plugin support (e.g. tox-conda
)tox
-based checks in CIcommitlint
-- rough estimate: trivial, <0.5 dayspython-semantic-release
, or similarruff
, which does support hierarchical config -- rough estimate: 1-2 days, including set up with pre-commit
flit
, hatch
, poetry
, pdm
) to cover more steps (e.g. requirements management, release, etc.) through a single tool -- rough estimate: 1-2 days for flit
, not sure for the restAgree with the challenges but I would order them in reverse order.
It's not clear which changes affect which plugins
[kedro-docker]
on PR/issues (nonstandard solution)
This creates overhead for maintenance, particularly since the changes are quite sparse (not everyone is familiar with all the plugins, usually one person touch a small part of it). It's quite important to isolate the changes source of error.
Makefile doesn't support local testing of individual plugins in isolated environments
kedro_plugins
conda environment for every plugin in CI is not reasonably transferrable to local environmentAs mentioned, it's not documented and I have a separate conda environment for development. This probably affects us as developers more as most contributors will only touch one of the plugin? Nevertheless, it's great to have good practices
pre-commit
configuration is maintained per-plugin
pre-commit
doesn't have first-class monorepo support (e.g. support hierarchical config); see the author's guidance on monorepo support
I agree this could be an improvement, but I am not sure whether this is the top priority. Most of the configurations are only set once from the beginning and we rarely touch the configuration. It definitely violates the DRY principle, but I don't find the hierarchical config helps a lot.
I would add one more challenge:
kedro
we have frequent changes so any dependencies change will be found out pretty quickly. For plugins, there could be a dependency change but we only have new commits after a few months. In this case, it is unclear whether the new changes break the CI or something in between happens, often it takes a long time to isolate the error.As discussed in technical design on 15/2:
One additional note to keep in mind that I just remembered; we need to address the behavior when multiple plugins are modified (or just make sure this doesn't happen in the future approach):
Potentially relevant for the automated release https://blog.pypi.org/posts/2023-04-20-introducing-trusted-publishers/
I think most of https://github.com/kedro-org/kedro/issues/2333#issuecomment-1434492921 has already been done, except for https://github.com/kedro-org/kedro-plugins/issues/512
Closing this because the CI/CD of kedro-plugins
has been hugely improved already. I created a new milestone for testing the plugins individually: https://github.com/kedro-org/kedro/milestone/60
Description
Following up on https://github.com/kedro-org/kedro-plugins/pull/4
Currently
kedro-plugins
has one top levelMakefile
, each plugin has it's own.pre-commit-config.yaml
andpyproject.toml
. Look at whether it would be better for these configuration files to be unified or for each plugin to have it's own files.@AntonyMilneQB Raised a good question about whether the pre-commit hooks actually still work without the file being at the root of the repo.
For this issue come up with a plan to improve the setup: