iterative / example-repos-dev

Source code and generator scripts for example DVC projects
https://dvc.org/doc

example-get-started: Update src code #197

Closed: leonardcser closed 1 year ago

leonardcser commented 1 year ago

The goal of this PR is to make the example-get-started source code clearer while keeping it as simple as possible. The actual code was left unchanged.

Here are my changes to make the starter source code more understandable:

shcheklein commented 1 year ago

thanks @leonardcser for the effort!

> Modularise code into functions

I'm not sure how far we want to go with this, tbh. On one hand, this is exactly the right way to do things from a software engineering perspective, but on the other hand even the need for these functions means that code is quite complicated for a DS project (should we try to simplify it?), and it makes it farther away from a quite regular (for good or bad) notebook-style flow ... what do you think @daavoo @dberenbaum ?

daavoo commented 1 year ago

> what do you think @daavoo @dberenbaum ?

I think the src in this PR looks more like the state I would expect for people using a dvc.yaml with isolated scripts per stage.

> and it makes it farther away from a quite regular (for good or bad) notebook-style flow

I don't think that motivation is good enough to justify making scripts artificially "messy".

We could have an actual notebook in the project, like in example-get-started-experiments, and acknowledge that you are expected to clean up and refactor the code when you want to use dvc.yaml, as we already do in https://dvc.org/doc/start/experiments/experiment-pipelines#stepping-up-and-out-of-the-notebook
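
For illustration only, here is a minimal sketch of what a stage script "modularised into functions" might look like once notebook code has been cleaned up. It is not the code from this PR: the function names, the train/test split logic, the ratio, and the seed are all hypothetical, and it uses only the Python standard library.

```python
import random
from pathlib import Path


def load_lines(path):
    """Read the non-empty lines of a text file."""
    text = Path(path).read_text(encoding="utf-8")
    return [line for line in text.splitlines() if line.strip()]


def split_lines(lines, test_ratio, seed):
    """Shuffle deterministically, then return (train, test)."""
    shuffled = list(lines)
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_ratio)
    return shuffled[n_test:], shuffled[:n_test]


def save_lines(lines, path):
    """Write lines to a file, creating parent directories if needed."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text("\n".join(lines) + "\n", encoding="utf-8")


def main(args):
    """Entry point for one stage: args is [input_file, output_dir]."""
    input_path, output_dir = args[0], Path(args[1])
    # Hypothetical parameters; a real project might read them from a params file.
    train, test = split_lines(load_lines(input_path), test_ratio=0.2, seed=42)
    save_lines(train, output_dir / "train.txt")
    save_lines(test, output_dir / "test.txt")

# A real stage script would typically end with an
# `if __name__ == "__main__":` guard that calls main(sys.argv[1:]).
```

Each function does one thing, so the top-level flow in `main` still reads top to bottom much like a notebook would, while the individual steps can be tested or reused on their own.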

daavoo commented 1 year ago

> means that code is quite complicated for a DS project (should we try to simplify it?),

Which parts look complicated? It doesn't look complicated for a DS project to me.