kedro-org / kedro

Kedro is a toolbox for production-ready data science. It uses software engineering best practices to help you create data engineering and data science pipelines that are reproducible, maintainable, and modular.
https://kedro.org
Apache License 2.0
9.48k stars, 874 forks

Curate plugins by maintenance activity in Kedro's documentation #2291

Open idanov opened 1 year ago

idanov commented 1 year ago

Description

Kedro currently lists a number of community plugins in its documentation (https://kedro.readthedocs.io/en/stable/extend_kedro/plugins.html), but it's hard to assess how actively maintained those plugins are. It would be really useful for users to have a badge next to each plugin on that list showing its activity, e.g. a commits-per-month or last-commit badge.

Context

Why is this change important to you? How would you use it? How can it benefit other users?

Possible Implementation

We can use any of these badges: https://shields.io/category/activity

| Last commit | Plugin |
| --- | --- |
| GitHub last commit | kedro-pandas-profiling |
| GitHub last commit | kedro-mlflow |
| GitHub last commit | find-kedro |
| GitHub last commit | kedro-kubeflow |
| GitHub last commit | kedro-airflow-k8s |
| GitHub last commit | kedro-vertexai |
| GitHub last commit | kedro-azureml |

And so on...
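A minimal sketch of how such a table could be generated rather than written by hand. It assumes the shields.io `github/last-commit/<owner>/<repo>` and `github/commit-activity/m/<owner>/<repo>` badge paths; the helper names and the `example-org/kedro-example-plugin` repository are hypothetical placeholders, not real plugins.

```python
from urllib.parse import quote

SHIELDS_BASE = "https://img.shields.io/github"


def last_commit_badge(owner: str, repo: str) -> str:
    """URL of the shields.io 'last commit' badge for a GitHub repository."""
    return f"{SHIELDS_BASE}/last-commit/{quote(owner)}/{quote(repo)}"


def commits_per_month_badge(owner: str, repo: str) -> str:
    """URL of the shields.io 'commit activity per month' badge."""
    return f"{SHIELDS_BASE}/commit-activity/m/{quote(owner)}/{quote(repo)}"


def badge_row(owner: str, repo: str) -> str:
    """One Markdown table row: badge image, then a link to the plugin."""
    badge = f"![GitHub last commit]({last_commit_badge(owner, repo)})"
    link = f"[{repo}](https://github.com/{owner}/{repo})"
    return f"| {badge} | {link} |"


def badge_table(plugins: list[tuple[str, str]]) -> str:
    """Markdown table with one row per (owner, repo) pair."""
    lines = ["| Last commit | Plugin |", "| --- | --- |"]
    lines += [badge_row(owner, repo) for owner, repo in plugins]
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical repository, used only to show the output shape.
    print(badge_table([("example-org", "kedro-example-plugin")]))
```

Because shields.io renders the badge at view time, the table itself would only need regenerating when plugins are added or removed.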

Additionally, we could ensure the links open in a new tab, since currently they open in the same tab of the documentation.

stichbury commented 1 year ago

This is a nice idea, but I'm not sure we can open in a new tab that easily in Sphinx, nor how we'd include these as a table (except perhaps by embedding as HTML within the markdown). It's easy enough to experiment though.
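As a rough experiment along those lines, the table could be emitted as raw HTML, which also allows `target="_blank"` on each link. This sketch assumes the docs' Markdown parser passes raw HTML through to the HTML output; the function names and the example URLs are hypothetical.

```python
from html import escape


def new_tab_link(text: str, url: str) -> str:
    """HTML anchor that opens in a new tab; rel=noopener avoids window.opener leaks."""
    return (
        f'<a href="{escape(url, quote=True)}" target="_blank" '
        f'rel="noopener noreferrer">{escape(text)}</a>'
    )


def html_plugin_table(rows: list[tuple[str, str, str]]) -> str:
    """Build an HTML table from (badge_url, plugin_name, plugin_url) tuples."""
    out = ["<table>", "  <tr><th>Last commit</th><th>Plugin</th></tr>"]
    for badge_url, name, url in rows:
        img = f'<img src="{escape(badge_url, quote=True)}" alt="GitHub last commit">'
        out.append(f"  <tr><td>{img}</td><td>{new_tab_link(name, url)}</td></tr>")
    out.append("</table>")
    return "\n".join(out)


if __name__ == "__main__":
    # Placeholder URLs, for illustration only.
    print(html_plugin_table([
        ("https://img.shields.io/github/last-commit/example-org/kedro-example-plugin",
         "kedro-example-plugin",
         "https://github.com/example-org/kedro-example-plugin"),
    ]))
```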

Perhaps it would be preferable to ask active plugin owners to put the badge in their README. Any that do not would be removed from the list of plugins in a known state and moved to a separate list of "Use at your peril" plugins in the docs.

merelcht commented 1 year ago

The idea of badges is nice, but it's not clear how this would actually work inside the Sphinx docs. As an alternative solution, we'll go through all the plugins and divide them into "actively maintained" and "unknown state". Presumably users would create a PR if their plugin is in the wrong category.

It's not ideal that this division needs to be maintained manually, but at least it's better than not having the categories at all.
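If the manual split ever becomes a burden, a first pass could be automated from last-commit dates. The sketch below is only an illustration: the 365-day window is an assumed threshold, not something agreed in this thread, and the plugin names and dates in the usage example are made up.

```python
from datetime import date, timedelta
from typing import Dict, List, Optional

# Assumption: a plugin counts as active if it had a commit in the last year.
ACTIVE_WINDOW = timedelta(days=365)


def categorise(
    last_commits: Dict[str, Optional[date]], today: date
) -> Dict[str, List[str]]:
    """Split plugins into the two categories by date of last commit.

    A plugin whose last commit date is unknown (None) goes to "unknown state".
    """
    groups: Dict[str, List[str]] = {"actively maintained": [], "unknown state": []}
    for name, last in sorted(last_commits.items()):
        recent = last is not None and today - last <= ACTIVE_WINDOW
        groups["actively maintained" if recent else "unknown state"].append(name)
    return groups


if __name__ == "__main__":
    # Made-up plugin names and dates, for illustration only.
    print(categorise(
        {
            "plugin-a": date(2023, 5, 1),
            "plugin-b": date(2020, 1, 15),
            "plugin-c": None,
        },
        today=date(2023, 6, 1),
    ))
```

Maintainers could still override the result by hand, so a PR to move a plugin between categories would remain the correction path.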

astrojuanlu commented 1 year ago

Prior art: https://myst-parser.readthedocs.io/en/latest/intro.html (link to https://sphinx-extensions.readthedocs.io/en/latest/)

Related: https://github.com/kedro-org/kedro-devrel/issues/40

astrojuanlu commented 1 year ago

cc @inigohidalgo

astrojuanlu commented 1 year ago

Other prior art: Astropy affiliated packages https://github.com/astropy/astropy-project/blob/main/affiliated/affiliated_package_review_guidelines.md (see an old example https://github.com/poliastro/poliastro/issues/279).

astrojuanlu commented 1 year ago

In summary: I don't think it's enough to add a colored badge that reflects activity. Users have raised concerns about whether plugins not under kedro-org are worth adopting (see example), and we should draft a strategy to address those concerns.

This is a huge topic that deserves some discussion, but off the top of my head, and just to start scratching the surface, one thing that could be more valuable than "days since last commit" would be: "does this plugin actually work with the latest versions of Kedro?" That points towards providing a compatibility matrix, similar to what the Numba folks started here: https://github.com/numba/numba-integration-testing
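To make the compatibility-matrix idea concrete, here is a small sketch of how test outcomes could be rendered. It assumes some CI job has already recorded a pass/fail result per (plugin, Kedro version) pair; the function name, plugin names, version labels, and results below are placeholders, not real test data.

```python
from typing import Dict, List, Tuple


def compatibility_matrix(
    results: Dict[Tuple[str, str], bool],
    plugins: List[str],
    kedro_versions: List[str],
) -> str:
    """Render a plain-text matrix: one row per plugin, one column per Kedro version.

    Missing results are shown as "?" rather than guessed.
    """
    rows = [["plugin", *kedro_versions]]
    for plugin in plugins:
        row = [plugin]
        for version in kedro_versions:
            outcome = results.get((plugin, version))
            row.append("?" if outcome is None else "pass" if outcome else "fail")
        rows.append(row)
    # Pad every column to its widest cell so the columns line up.
    widths = [max(len(r[i]) for r in rows) for i in range(len(rows[0]))]
    return "\n".join(
        "  ".join(cell.ljust(w) for cell, w in zip(row, widths)).rstrip()
        for row in rows
    )


if __name__ == "__main__":
    # Placeholder results, for illustration only.
    print(compatibility_matrix(
        {
            ("plugin-a", "0.18"): True,
            ("plugin-a", "0.19"): False,
            ("plugin-b", "0.18"): True,
        },
        plugins=["plugin-a", "plugin-b"],
        kedro_versions=["0.18", "0.19"],
    ))
```

Showing untested combinations as "?" keeps the matrix honest about what has actually been run, which is the point of preferring it to a "days since last commit" signal.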

astrojuanlu commented 1 year ago

Prior art: https://intake.readthedocs.io/en/latest/plugin-directory.html

noklam commented 1 year ago

Tangentially related kedro-org/kedro-plugins#535