TuringLang / docs

Documentation and tutorials for the Turing language
https://turinglang.org/docs/
MIT License

Suggestion: Skip the Data Cleaning by using Artifacts #271

Open ParadaCarleton opened 3 years ago

ParadaCarleton commented 3 years ago

The big block of data-cleaning operations at the start of the tutorials kind of breaks up the flow, and IMO makes the tutorials seem more confusing. Maybe we should pre-clean these datasets, then include these cleaned datasets as (lazily installed, since most users won't want them) artifacts in Turing, which would let us skip the cleaning steps?

cpfiffer commented 3 years ago

I kind of like including them just because they show the whole workflow -- and in some cases the cleaning matters a great deal. Plus, adding artifacts complicates an already complex workflow.

ParadaCarleton commented 3 years ago

> I kind of like including them just because they show the whole workflow -- and in some cases the cleaning matters a great deal. Plus, adding artifacts complicates an already complex workflow.

I don't disagree that it's an important part of the workflow. I just think it's best to keep tutorials on cleaning data separate from tutorials on things like, say, Gaussian processes. Ideally, every tutorial should focus on one topic and do it well, so that users can find tutorials that quickly cover what they don't know instead of mixing it with subjects they've already learned. For instance, the Stan manual rarely includes data cleaning; it usually stays narrowly focused on a single specific topic. We can include links at the top of each tutorial's introduction to tutorials on things like MLDataUtils and DrWatson.

As for artifacts, I don't believe loading them should be especially difficult. From the user's end, the code should just look something like:

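using Pkg.Artifacts  # provides the artifact"..." string macro
# Resolve the artifact named "dataset" (declared in Artifacts.toml) to its local directory
dataset_path = artifact"dataset"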
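
For the macro above to resolve, the package would need an `Artifacts.toml` entry for the dataset. A minimal sketch of a lazily installed entry follows, assuming the cleaned data is hosted as a tarball somewhere; the artifact name `dataset` comes from the snippet above, and every value in angle brackets is a placeholder, not a real hash or URL.

```toml
# Artifacts.toml, placed next to Project.toml.
# All values in angle brackets are placeholders.
[dataset]
git-tree-sha1 = "<git tree hash of the unpacked dataset directory>"
lazy = true  # download on first use instead of at package install time

    [[dataset.download]]
    url = "<URL of a tarball containing the cleaned dataset>"
    sha256 = "<SHA-256 of that tarball>"
```

On Julia 1.6 and later, lazy artifacts are, as far as I know, meant to be loaded through the `LazyArtifacts` standard library, so the loading side would likely use `using LazyArtifacts` rather than `using Pkg.Artifacts`.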

JasonPekos commented 5 months ago

A good compromise could be putting the setup code in collapsible code chunks (maybe collapsed by default)? The same goes for, e.g., the full manifest at the bottom of the tutorial pages.
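
A sketch of what a collapsed chunk could look like, assuming the tutorials are Quarto documents with executable Julia cells (the thread does not say which tool renders them). `code-fold` and `code-summary` are Quarto cell options; the summary text and the cell body are illustrative only.

````markdown
```{julia}
#| code-fold: true
#| code-summary: "Show data-cleaning code"

# Data loading and cleaning steps would go here; the cell still runs,
# but its source is collapsed until the reader expands it.
```
````

With `code-fold: true` the code starts collapsed; `code-fold: show` would keep it foldable but expanded by default.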

yebai commented 4 months ago

> A good compromise could be putting the setup code in collapsible code chunks (maybe collapsed by default)? The same goes for, e.g., the full manifest at the bottom of the tutorial pages.

@shravanngoswamii, can you give this suggestion a try, too?