Closed. jorgeorpinel closed this issue 1 year ago
@shcheklein is this not basically the same as #542?
@jorgeorpinel It might play that role indeed. Let's keep it open anyway and see whether a page for the glossary is enough or not.
Perhaps we can call it DVC Principles, similar to:
@jorgeorpinel it seems to me like some parts of https://dvc.org/doc/understanding-dvc. A bit philosophical. There should be a section like this, but we also need to explain concepts like Cache, Remote, Workspace, DVC-file, etc., and their relationships (with diagrams if needed).
From https://github.com/iterative/dvc.org/issues/727#issuecomment-571407120:
- https://dvc.org/doc/command-reference/cache,
- https://dvc.org/doc/command-reference/metrics,
- https://dvc.org/doc/command-reference/pipeline, and
- https://dvc.org/doc/command-reference/remote
have useful, unique content but I'm not sure the cmd ref is the best place for those explanations.
Should we move the explanations of those concepts to this new guide and leave only a very stripped-down index for those commands? That index could link to the Basic Concepts section, of course.
From https://github.com/iterative/dvc.org/pull/1581#discussion_r467630993:
it's way more than a single page like this. Good example of a basic concept is cache, or run cache - they alone can take a proper page
So consider breaking up Basic Concepts or DVC Concepts into a full User Guide sub-section.
I'm going to summarize the goals and plans around this issue:
- [ ] There should be a page or section (maybe called Basic Concepts in UG, see example in this closed PR), properly written in user-level language (closer to UC), explaining some concepts and the DVC philosophy: concepts like Cache, Remote, Workspace, etc., and their relationships (with diagrams if needed)
- [ ] Extract concept explanations from cmd refs (e.g. cache, dag, metrics/plots, remote)
- [ ] Find other issues that would be included in solving this (e.g. #53).
- Consider also a related DVC Principles section (more philosophical, maybe in the Concepts section index)? See https://github.com/iterative/dvc.org/issues/550#issuecomment-544106096 above. UPDATE: Maybe use the section index for this or/and What is DVC?.
- [ ] Figure out a connection to glossary tooltips (currently special md files with Frontmatter in content/docs/user-guide/basic-concepts/, see #1395): Just link to the Concepts? Use a similar engine (with Frontmatter)? Merge completely?
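To make the "similar engine (with Frontmatter)" option a bit more concrete, here is a minimal sketch of how one Markdown file could feed both a tooltip and a concept page. It does not reproduce the site's actual glossary code: the `GlossaryEntry` shape, the `name` and `match` fields, the use of the body as tooltip text, and the sample file contents are all assumptions made for illustration.

```typescript
// Hypothetical shape of a glossary entry; field names are assumptions.
interface GlossaryEntry {
  name: string
  match: string[]
  tooltip: string
}

// Split a Markdown source into simple "key: value" front matter and a body.
// Only flat, single-line keys are handled; this is not a full YAML parser.
function splitFrontMatter(source: string): { meta: Map<string, string>; body: string } {
  const meta = new Map<string, string>()
  const m = /^---\r?\n([\s\S]*?)\r?\n---\r?\n?([\s\S]*)$/.exec(source)
  if (m === null) {
    // No front matter block: treat the whole file as body.
    return { meta, body: source }
  }
  const [, header = '', body = ''] = m
  for (const line of header.split(/\r?\n/)) {
    const idx = line.indexOf(':')
    if (idx === -1) continue
    meta.set(line.slice(0, idx).trim(), line.slice(idx + 1).trim())
  }
  return { meta, body }
}

// Build a glossary entry: front matter supplies the term and its match
// patterns, and the Markdown body doubles as the tooltip text.
function toGlossaryEntry(source: string): GlossaryEntry {
  const { meta, body } = splitFrontMatter(source)
  const match = (meta.get('match') ?? '')
    .replace(/^\[|\]$/g, '') // strip the surrounding [ ] of an inline list
    .split(',')
    .map(s => s.trim())
    .filter(s => s.length > 0)
  return { name: meta.get('name') ?? '', match, tooltip: body.trim() }
}

// Hypothetical file contents, not copied from the repository.
const sample = [
  '---',
  'name: Workspace',
  'match: [workspace, working directory]',
  '---',
  'Placeholder tooltip text.',
  ''
].join('\n')

const entry = toGlossaryEntry(sample)
console.log(entry.name, entry.match, entry.tooltip)
```

With something along these lines, "merge completely" would mean the same file renders as a full page while its front matter drives the tooltip matching; "just link" would keep the two sources separate.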
These items are moved to the description by @iesahin
From discussion in #1915:
... we just need to figure out a good way to add these landing pages. Should it be one guide with a section per concept? A subsection under Use Cases (which are kind of landing pages right now)?
For SEO purposes, one concept per URL is ideal. The use cases show this in practice: the versioning use case and tutorial have the strongest search results and ranking after the home page (as far as docs go).
As someone new to DVC, a Basic Concepts section for the UG makes a lot of sense, though. The UG is mostly problem:solution "how to" material, so standalone concept articles seem out of place.
Applying the above to the "Data Pipelines" concept: I suggest we add a use case and also add a Basic Concepts section with pipelines, among other concepts.
What do you think @jorgeorpinel ?
For SEO purposes, one concept per URL is ideal
What about # anchors though? Like guide/concepts#cache. We just don't want a bunch of very short pages like https://dvc.org/doc/user-guide for example, I think.
BTW I'm not convinced these pages are the best landing pages. They are meant to explain what these concepts mean IN DVC, so while they can definitely point to a bunch of other relevant docs for those who do land there, I don't think their purpose is to drive traffic. Kind of like the Get Started that probably gets lots of landings, but not very intentionally. Use Cases do have that purpose though, as well as the other sections of dvc.org (not /doc).
That said, if each section requires long explanations then we can definitely make them individual pages.
The UG is mostly problem:solution "how to" material
Hmmm not really, only the How To section (currently in UG due to a lack of a better structure) and certain guides (which should probably be in How To) have that intention. But anyway, Basic Concepts can definitely fit in the UG, I think (also not how-to material).
I suggest we add a use case and also add a Basic Concepts section with pipelines
Let's just focus on the Basic Concepts section for now because Use Cases are very very tricky and yes, we already have a Pipelines one planned 🙂
What about # anchors though? Like guide/concepts#cache.
The anchors will show up as sub-links on a listing, so they are effective, but the overall page needs to rank for that to happen. Agree that short pages aren't a good idea, and that these types of UG pages aren't necessarily good for landing pages.
But anyway, Basic Concepts can definitely fit in the UG, I think (also not how-to material).
Is there an existing doc in UG that is a good example of what "fits"? I understand my initial assessment doesn't match because certain items don't belong and/or aren't intended as how-to material. Seems like Managing External Data or Large Dataset Optimization are in the appropriate style... is that about right for the Basic Concepts doc?
Let's just focus on the Basic Concepts section for now because Use Cases are very very tricky and yes, we already have a Pipelines one planned.
Sounds good. If it's in the works then things are on the right track SEO-wise. I have some ideas that could make a single-URL Basic Concepts work within your original plan and also address some of the SEO patterns. Will take that up and lay out ideas/outline in a new PR.
Is there an existing doc in UG that is a good example of what "fits"?
I think this is our main guide: https://dvc.org/doc/user-guide/dvc-files-and-directories. Another proper guide would be https://dvc.org/doc/user-guide/large-dataset-optimization. What is DVC and DVCignore are also OK. These explain how things work.
Thinking about it, most of the other ones do at least contain how-to material. But as long as that's not their main purpose, they can fit in the UG.
UPDATE: An important consolidation of the "cache" mechanism explanation happened in #2075
Currently working on this https://github.com/iterative/dvc.org/tree/ug-basic-concepts-feb-21 branch.
Please link this issue to the PR for that branch when ready.
I'll update the issue description with all items from other issues. I think this can be the main BC issue.
@jorgeorpinel @shcheklein
Howdy @casperdcl. Other than the tooltips you've added recently (in #2359) are you still planning to work on this? Please un-assign otherwise 🙂
I'm not altogether sure what it really means to be assigned to an epic - assigned to push others to solve the individual sub-issues? Given that the title is "new DVC Concepts section" there may be others better placed to do that. Unassigning for now.
Assigned in general would mean someone is working on it or plans to do so soonish. Agree that it can be misleading to be assigned to an epic but it happens e.g. #1400.
Lowering priority. Is this still planned at all or should we close it?
Per https://github.com/iterative/dvc.org/pull/431#pullrequestreview-253571711
There should be a section or a few sections called Basic Concepts in the user guide that is/are properly written in user-level language explaining some concepts. The glossary tooltips could point to these pages too.
Figure out a connection to glossary tooltips (currently special md files with Frontmatter in content/docs/user-guide/basic-concepts/, see #1395): Just link to the Concepts? Use a similar engine (with Frontmatter)? Merge completely?
UPDATE: Skip to https://github.com/iterative/dvc.org/issues/550#issuecomment-673585050 for summary
UPDATE: Specific plan
Use contents from #1655 as needed.
Do we need the changes in package.json, src/components/Documentation/Markdown/Tooltip/index.tsx, src/gatsby/models/glossary/index.js, and src/utils/front/glossary.ts from iterative/dvc.org/pull/1944 ?
[x] Workspace (#2197)
[x] DVC Project (vs. Git/DVC Repository) [#2754]
[ ] (from cmd ref) (DVC) Metafiles
[ ] (from cmd ref) DVC/Project Cache
[ ] (from cmd ref) Data Pipeline(s)
[ ] (from cmd ref) Metrics & Plots
[ ] (from cmd ref) Remote (Storage)
[ ] (from cmd ref) Stage & update some of the links around this term that currently go to run/stage. (Some should link to the future pipelining guide, see #2883)
[ ] (from cmd ref) (Stage) Output
[ ] (from cmd ref) (Stage) Dependency
[ ] Run-Cache?
[ ] Garbage Collection clarifications (in cache or remote pages, see #2169)
[ ] Workspace clarifications? Discussed in #1127
Add Further Reading sections to all Basic Concepts documents? (#2127)
[x] Look into #1395 after or along with this one.
Consider also a related DVC Principles section (more philosophical, maybe in the Concepts section index)? See https://github.com/iterative/dvc.org/issues/550#issuecomment-544106096 below. UPDATE: Maybe use the section index for this or/and What is DVC?.