kedro-org / kedro

Kedro is a toolbox for production-ready data science. It uses software engineering best practices to help you create data engineering and data science pipelines that are reproducible, maintainable, and modular.
https://kedro.org
Apache License 2.0
9.88k stars 895 forks

Update Airflow Astro deployment docs #3792

Closed DimedS closed 5 months ago

DimedS commented 5 months ago

Description

This PR addresses issues 1, 2, 3, 5, and 6 from 605:

  1. The architecture of the Astro deployment manual has been restructured into two stages: preparation of the Kedro project and Astro deployment. Redundancies have been eliminated, and the manual is now updated and fully operational.
  2. The kedro-airflow-k8s plugin has been relocated to the end of the document because it is compatible only with Kedro versions earlier than 0.17.
  3. The astro-airflow-iris starter has been replaced with the standard spaceflights-pandas example pipeline.
  4. Customisation of logging has been introduced.
  5. Instructions for transferring results back from the container have been added.
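
The logging customisation from item 4 is not reproduced in this description. As a rough illustration only, the sketch below shows a console-only logging setup built with Python's standard `logging.config.dictConfig`. The formatter and handler names, the log format, and the choice to send everything to stdout are assumptions made for this example, not the configuration added by the PR.

```python
import logging
import logging.config

# Hypothetical configuration: every record goes to stdout, which suits
# containerised runs where the orchestrator collects console output.
LOGGING_CONFIG = {
    "version": 1,
    "disable_existing_loggers": False,
    "formatters": {
        "simple": {
            "format": "%(asctime)s - %(name)s - %(levelname)s - %(message)s",
        },
    },
    "handlers": {
        "console": {
            "class": "logging.StreamHandler",
            "level": "INFO",
            "formatter": "simple",
            "stream": "ext://sys.stdout",
        },
    },
    # Set the level of the "kedro" logger explicitly; other loggers inherit from root.
    "loggers": {"kedro": {"level": "INFO"}},
    "root": {"handlers": ["console"]},
}

logging.config.dictConfig(LOGGING_CONFIG)
logging.getLogger("kedro").info("Logging is configured")
```

The same dictionary structure can be written as YAML and loaded from a file, since `dictConfig` only needs the resulting mapping.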

Developer Certificate of Origin

We need all contributions to comply with the Developer Certificate of Origin (DCO). All commits must be signed off by including a Signed-off-by line in the commit message. See our wiki for guidance.

If your PR is blocked due to unsigned commits, then you must follow the instructions under "Rebase the branch" on the GitHub Checks page for your PR. This will retroactively add the sign-off to all unsigned commits and allow the DCO check to pass.

DimedS commented 5 months ago

> This is a fantastic start @DimedS ⭐ I have left some comments inline, but in general I think it would be useful to add more context and explanations about why certain steps are needed. Also, a minor note on referencing Kedro class names: when talking about a class in Kedro, e.g. DataCatalog, the best practice is to use either the exact class name or the regular English name for it, e.g. data catalog, but not a hybrid like Data Catalog.

Thank you, @merelcht! I agree and hope I've addressed your comments.

astrojuanlu commented 5 months ago

Woops

sphinx.errors.SphinxWarning: /home/docs/checkouts/readthedocs.org/user_builds/kedro/checkouts/3792/docs/source/deployment/index.md:48:toctree contains reference to nonexisting document 'deployment/airflow_astronomer'
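
This kind of warning appears when a toctree still lists a document that has been renamed or removed. A minimal sketch of a local pre-check follows, assuming the index file uses MyST-style fenced `{toctree}` blocks and that entries are plain document names relative to the index file's directory. The function name `missing_toctree_targets` is hypothetical, and the sketch ignores the `Title <target>` entry syntax, absolute entries, and glob patterns.

```python
from pathlib import Path


def missing_toctree_targets(index_file: Path) -> list[str]:
    """Return toctree entries in index_file that have no matching .md or .rst file."""
    missing = []
    in_toctree = False
    for line in index_file.read_text(encoding="utf-8").splitlines():
        entry = line.strip()
        if entry.startswith("```"):
            # A fence either opens a toctree directive or closes/opens another block.
            in_toctree = entry.startswith("```{toctree}")
            continue
        if not in_toctree or not entry or entry.startswith(":"):
            continue  # skip text outside the directive, blank lines, and options
        if "://" in entry:
            continue  # external links are not local documents
        base = index_file.parent / entry
        # Entries omit the file extension, so try the supported suffixes.
        if not any(base.with_name(base.name + ext).exists() for ext in (".md", ".rst")):
            missing.append(entry)
    return missing
```

Calling `missing_toctree_targets(Path("docs/source/deployment/index.md"))` from the repository root would list any entries whose source file is absent, before the documentation build fails on them.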

DimedS commented 5 months ago

I believe I've addressed most of the comments. Could you please do a final check, @ankatiyar, @merelcht, @astrojuanlu?

astrojuanlu commented 5 months ago

@sbrugman By any chance do you have a moment to give this a look?