kedro-org / kedro

Kedro is a toolbox for production-ready data science. It uses software engineering best practices to help you create data engineering and data science pipelines that are reproducible, maintainable, and modular.
https://kedro.org
Apache License 2.0
9.48k stars 874 forks

Improving transcode dataset docs #3833

Closed ElenaKhaustova closed 2 months ago

ElenaKhaustova commented 2 months ago

Description

Solves #3676

Development notes

Developer Certificate of Origin

We need all contributions to comply with the Developer Certificate of Origin (DCO). All commits must be signed off by including a Signed-off-by line in the commit message. See our wiki for guidance.

If your PR is blocked due to unsigned commits, then you must follow the instructions under "Rebase the branch" on the GitHub Checks page for your PR. This will retroactively add the sign-off to all unsigned commits and allow the DCO check to pass.

Checklist

AhdraMeraliQB commented 2 months ago

See built docs

ElenaKhaustova commented 2 months ago

I left some comments, but I think we should really add validation, if possible, instead of trying too hard to explain this.

Applied suggested changes, thank you!

As for validation, it's already in place. The section "How *not* to use transcoding" aims to clarify the logic behind transcoding and point out potential issues. That said, it would be good to know if the explanations are unclear.
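
For readers unfamiliar with the kind of validation being discussed, here is a minimal sketch of one such check: rejecting a node that reads and writes the same underlying dataset through two transcoded names (for example `my_dataframe@spark` and `my_dataframe@pandas`). The helper names `split_transcoded` and `check_node` are hypothetical and written for illustration only; this is not Kedro's actual implementation, and it assumes only that transcoded names use `@` as the separator.

```python
from __future__ import annotations

# Assumed separator between the dataset name and its transcoded format,
# as in "my_dataframe@spark".
TRANSCODING_SEPARATOR = "@"


def split_transcoded(name: str) -> tuple[str, str]:
    """Split 'base@format' into (base, format); format is '' when absent."""
    base, _, fmt = name.partition(TRANSCODING_SEPARATOR)
    return base, fmt


def check_node(inputs: list[str], outputs: list[str]) -> None:
    """Raise ValueError if a node reads and writes the same base dataset.

    Hypothetical check for illustration; names are compared after the
    transcoding suffix has been stripped.
    """
    input_bases = {split_transcoded(name)[0] for name in inputs}
    output_bases = {split_transcoded(name)[0] for name in outputs}
    clash = input_bases & output_bases
    if clash:
        raise ValueError(
            f"Node reads and writes the same dataset: {sorted(clash)}"
        )
```

Under these assumptions, `check_node(["my_dataframe@spark"], ["my_dataframe@pandas"])` raises a `ValueError`, while a node whose inputs and outputs have different base names passes silently.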