iterative / dvc.org

📖 DVC website and documentation
https://dvc.org
Apache License 2.0
331 stars, 392 forks

get-started: rewrite it (reach "next level") for user interaction #599

Closed jorgeorpinel closed 4 years ago

jorgeorpinel commented 5 years ago

After #487 ✔️

shcheklein commented 4 years ago

@dashohoxha and @dmpetrov and our users brought good points that it's still not clear from the get started that basic data versioning can be used independently from pipelines. I think it's a good time to try to split into segments - Versioning, Accessing (import/get), Pipelines (aka Reproducibility?), Experiments (metrics, etc, etc).

The challenge here is to make them simple and easily reproducible, in the sense that we need to provide commands to run. Ideally it should still be based on one project so that you can go through all these new sections one by one.

One thing that should change, for example, is that Get Old Files should become a part of the first part where basic data management is described. So, we would need to have two versions of the data file at this point.

What options do we have? One single project, or multiple different projects that serve different needs?

shcheklein commented 4 years ago

@jorgeorpinel please, feel free to update and change the ticket (checkboxes) accordingly :)

jorgeorpinel commented 4 years ago

I think it's a good time to try to split into segments - Versioning, Accessing (import/get), Pipelines... The challenge here is to make them simple...

I don't know if I agree with this. That's what we have tutorials and other docs for. What is a Get Started for? I think it should be a basic introduction to some key features so a user can quickly get familiar with DVC. If it's getting too complicated or confusing, we should probably just remove some of the chapters and simplify others (like Import Data perhaps).

  • [ ] Get Older Files (currently the last chapter) should become a part of the first part where basic data management is described

Maybe the name or description could be improved but in the current project at least, the only file that has more than one version is model.pkl, the last output in the whole pipeline, so that's why this chapter has to be at the end.

update and change the ticket (checkboxes) accordingly

Looking at them now.

shcheklein commented 4 years ago

I think it should be a basic introduction to some key features so a user can quickly get familiar with DVC.

💯 on this! So, when we restructure it a little bit, the change should not increase the complexity. On the contrary, it should simplify the perception, since you can focus on the first data management part, for example, and clearly see how it works end-to-end w/o even touching dvc run.

It's definitely a problem that we talk about dvc checkout at the very end, for example.

Pages should stay as simple as possible.

jorgeorpinel commented 4 years ago

If the sub-sections are linear (i.e. you can follow them logically in chronological order) then I like the idea. Agree about dvc checkout. We'll need to redesign that part so there are different versions to check out early on.

shcheklein commented 4 years ago

@jorgeorpinel I think if we base them on a single example (the same get started, a bit redesigned?) then it would be perfect and you will be able to follow them in chronological order.

jorgeorpinel commented 4 years ago

Here's an example of a get-started-like example repo from ZEIT/Next.js: https://github.com/zeit/next-learn-demo (matching https://nextjs.org/learn)

They do a combination of the options we're discussing: It's kind of a monorepo with directories for each lesson, but at the same time each lesson continues from the previous one's result. This implies there's repetition of files in each dir, yes, but it's easier to understand the structure at a glance and doesn't require a bunch of tags or precise commits (generated from another project). Maybe we should follow this pattern?

And if we need different repos for different get-started sub sections even if they're not completely linear, we can also do that e.g. iterative/get-started-basics, iterative/get-started-external-data, etc.

What do you think?

shcheklein commented 4 years ago

"Learn" is a common name for this kind of section with all the guides, tutorials, etc e.g. https://nextjs.org/learn

I think this link is not a good example. First of all, notice how Learn is not even part of Docs, mostly because it's too generic. Second, it looks like the whole Learn section for Next is actually one single interactive tutorial.

jorgeorpinel commented 4 years ago

OK. Addressed in https://github.com/iterative/dvc.org/pull/1051#issuecomment-600828452.

shcheklein commented 4 years ago

Closing this. We have higher priorities now. We can revisit later when we get to the next iteration.