kedro-org / kedro

Kedro is a toolbox for production-ready data science. It uses software engineering best practices to help you create data engineering and data science pipelines that are reproducible, maintainable, and modular.
https://kedro.org
Apache License 2.0
9.48k stars 874 forks source link

Create a docs page about adding code beyond starter files #3852

Open yury-fedotov opened 2 months ago

yury-fedotov commented 2 months ago

Description

Closes #3418

Per this discussion in Slack with @noklam, I took a task of creating a guide on how to add code beyond starter files in Kedro.

P.S. It's my first contribution to the project, so I'd be happy to iterate as much as needed until it satisfies core team's needs. And appreciate any feedback.

Development notes

Developer Certificate of Origin

We need all contributions to comply with the Developer Certificate of Origin (DCO). All commits must be signed off by including a Signed-off-by line in the commit message. See our wiki for guidance.

If your PR is blocked due to unsigned commits, then you must follow the instructions under "Rebase the branch" on the GitHub Checks page for your PR. This will retroactively add the sign-off to all unsigned commits and allow the DCO check to pass.

Checklist

yury-fedotov commented 1 month ago

Rendered version: https://kedro--3852.org.readthedocs.build/en/3852/kedro_project_setup/code_beyond_starter_files.html

yury-fedotov commented 1 month ago

@noklam, @astrojuanlu, could you please have a look?

yury-fedotov commented 1 month ago

Hey @merelcht - thanks for comments again, I just finished implementing them.

QQ also about the vale usage in CI. I noticed that it produces some warnings on this PR, but apparently on many other existing MD files. So it's like a soft CI check?

merelcht commented 1 month ago

QQ also about the vale usage in CI. I noticed that it produces some warnings on this PR, but apparently on many other existing MD files. So it's like a soft CI check?

Yes, it's a soft check. I usually treat it as a guidance on grammar/word usage. It does also check for spelling mistakes, which is helpful. But you definitely don't need to address it all.

stichbury commented 1 month ago

I am happy to review/edit this if you decide to retain as docs, but please let me know what you decide about this:

I like a lot of the recommendation here, but I hesitate to make it an official recommendation without the team discussing a bit more. Should we extract the content and make it a guest blog post instead? @astrojuanlu

LMK if you want to have a blog post instead, which sounds like a great idea, and then you could extract from the post if you wanted a longer-term piece of content that you maintain officially as docs. I wouldn't be able to craft the post without some notice and a ticket, but would definitely support the idea and help with later stages to edit and post, if you need it.

Just ping me again when you decide how to go forward!

merelcht commented 1 month ago

Hi @stichbury, we've discussed this in backlog grooming and decided we'd like to get it in as docs. I really appreciate any help you can give getting this in shape πŸ™

stichbury commented 1 month ago

Hi @stichbury, we've discussed this in backlog grooming and decided we'd like to get it in as docs. I really appreciate any help you can give getting this in shape πŸ™

For sure! I'm not sure if I'll be able to get to this in this sprint, but I'll add to my list; otherwise it'll be next week. To save holding you up, probably best to get it into the state where you want me to do a final review/update and let me know it is ready.

yury-fedotov commented 1 month ago

Thanks for the comments team! I'll get to them next week and come back.

stichbury commented 1 month ago

@yury-fedotov @astrojuanlu Can you ping me when this is ready for my review and I'll get you any final feedback on the candidate draft.

astrojuanlu commented 1 month ago

There are a couple of outstanding comments waiting for response from @noklam and after that I think we should be good for a next round of reviews. Thanks for your patience @yury-fedotov πŸ™πŸΌ

yury-fedotov commented 1 month ago

I'll ping you too @stichbury . It's not ready yet.

datajoely commented 1 month ago

So I'm just seeing this for the fist time. Thank you for the push @yury-fedotov. I feel quite strongly that we're overcomplicating things:

Problem:

What we're talking about here isn't Kedro specific at all, so there's an argument we shouldn't be writing extensive docs and perhaps would be better placed to link to relevant resources.

If we are to provide a solution, this would be my approach:

yury-fedotov commented 1 month ago

Thanks for review @datajoely !

I agree with what you said. I think the current content reflects that I had a different user problem in mind while writing this, and that caused the content to diverge from what you would've expected for the problem definition you mentioned.

Here's the difference.

You definition (copy pasting)

Those problems are undoubtedly relevant IMO, I agree.

What I was trying to cover

It is actually an open question if problems I refer to above are relevant, but as I understood from 3418, they are.

Proposed next steps

I'm totally happy to adjust the content as much as needed, including ground up rewrite if it would make it more useful. But would appreciate a coordinated maintainers opinion on what should this content be.

@datajoely , @merelcht , @astrojuanlu , @noklam do you mind discussing internally (or I am happy to join too) how does the core team see the resolution of 3418 and let me know?