Documentation plan

philippjfr commented 7 years ago

A major part of improving our docs is to add various examples. The way many libraries structure their examples is in a top-level examples directory (see bokeh, and matplotlib). These examples then usually get built into a gallery (see bokeh, matplotlib, and cartopy).

Adding an examples directory also encourages adding examples when new features are added. I'd even suggest that in future new features should be accompanied by small example notebooks or scripts. We are going to be splitting out the different plotting backends but I'd still strongly argue examples for officially supported backends should live on the core repo, where they are all in one place.

We've also had various definitions of "examples" in the past. What I think it should mean in this context is small, self-contained notebooks with at most one or two examples, focused on the code rather than on telling a story or explaining deeper concepts. That contrasts with quickstart guides, tutorials and the "examples" that are linked to from holoviews.org/Examples, which are really case studies. My suggestion for the different types of notebooks:

Tutorials - Long, detailed notebooks explaining particular concepts in detail, living in doc/Tutorials. New tutorials should be added to holoviews-contrib and can move to the main repo once polished.
Quickstart guides - Shorter notebooks getting the user started on using a particular feature without providing extensive background. Again should start out in holoviews-contrib but once we have a few I'd suggest creating a User Guide (see bokeh) that provides a quick introduction to holoviews.
Examples - These are what this issue is about, they are short and self-contained and generally should just go straight into the main repo since they don't need detailed explanation.
Case studies - These are what's currently on holoviews.org/Examples, and basically show how to apply holoviews to a particular domain or problem. I believe these should all live in holoviews-contrib providing a wide-ranging collection of user examples. Keeping them all in one place this way will encourage us to test and update them for each release.

If we agree on these different formats and where they live, we should settle the structure of the examples. My suggestion is that each example should be implemented for all backends that support the features it uses, and that each example links to the equivalent versions for other backends. Each example should contain the following information:

Table with links to the example implemented using other backends, using unicode tickmarks to show supported and unsupported backends
List of requirements, e.g. if the example uses bokeh sample data, list that as a requirement
A link to the original source of the example if any
(Optional) A list of tags to make the examples more searchable
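
The backend table described above could be generated rather than written by hand. A minimal sketch, assuming the table lives in a Markdown cell at the top of each notebook; the `BACKENDS` list, the `backend_table` helper and the notebook paths are hypothetical and only illustrate the tickmark idea:

```python
# Hypothetical list of officially supported backends, in display order.
BACKENDS = ["matplotlib", "bokeh", "plotly"]


def backend_table(links):
    """Build a Markdown table row with a tick or cross per backend.

    `links` maps a backend name to the path of the equivalent example;
    backends missing from the mapping are shown as unsupported.
    """
    header = "| " + " | ".join(BACKENDS) + " |"
    rule = "|" + "|".join(" --- " for _ in BACKENDS) + "|"
    cells = []
    for backend in BACKENDS:
        url = links.get(backend)
        # Supported backends get a linked tick, unsupported ones a cross.
        cells.append("[✓]({})".format(url) if url else "✗")
    row = "| " + " | ".join(cells) + " |"
    return "\n".join([header, rule, row])


print(backend_table({"bokeh": "bokeh/area.ipynb"}))
```

An example that only exists for bokeh would then show a linked tick in the bokeh column and a cross everywhere else.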

We also want to structure the examples into sensible subfolders. Here's some subfolders I can currently imagine:

apps - bokeh apps and in future maybe matplotlib webagg based apps
elements - All the supported elements split out into individual notebooks
plotting - Basic examples showing off specific plotting features
streams - Various examples using regular streams and linked streams

jlstevens commented 7 years ago

> I'd even suggest that in future new features should be accompanied by small example notebooks or scripts.

I agree this is a good idea.

> We are going to be splitting out the different plotting backends but I'd still strongly argue examples for officially supported backends should live on the core repo, where they are all in one place.

Yes, though I would argue that things like bokeh server support would then live in the bokeh backend repo.

> I'd suggest creating a User Guide (see bokeh) that provides a quick introduction to holoviews...

I also agree with this.

All your suggestions seem sensible. I do disagree with this one though:

> apps - bokeh apps and in future maybe matplotlib webagg based apps

Until we actually have another backend that does support apps, I wouldn't include this in the examples but with the backend. Of course, if we do find ourselves with apps using different backends, we could unify them again.

philippjfr commented 7 years ago

> Until we actually have another backend that does support apps, I wouldn't include this in the examples but with the backend. Of course, if we do find ourselves with apps using different backends, we could unify them again.

I don't really get that argument. To me, fragmentation of examples across multiple repos, where most users will never discover them, is a far bigger concern than having an extra folder that, at least to begin with, is focused entirely on one backend.

jlstevens commented 7 years ago

I get that point but I would like the core examples to be backend agnostic, i.e. to work across all backends. I'm happy to be convinced otherwise, but I would hold off on apps examples for now.

philippjfr commented 7 years ago

> I get that point but I would like the core examples to be backend agnostic - i.e. work across all backends. I'm happy to be convinced otherwise, but I would hold off on apps examples for now.

There will be plenty of examples in the plotting folder that won't work across backends. Not every backend will support every feature, which is exactly what makes holoviews so powerful: it lets you leverage the unique features of different plotting backends without being locked in.

jlstevens commented 7 years ago

Ok, in that case I don't mind having examples that aren't cross backend, but can we at least prioritize the examples that are cross backend first?

philippjfr commented 7 years ago

> Ok, in that case I don't mind having examples that aren't cross backend, but can we at least prioritize the examples that are cross backend first?

Yes, definitely, wherever possible an example should be supported at least by two backends. Usually that's mpl and bokeh, but for 3d examples it'll be mpl and plotly.

thoth291 commented 7 years ago

It's not my business - but since this discussion is public - I dare to be ignored...

I'm sure you are aware of this site: http://www.datavizcatalogue.com/ It's the best example of a gallery in terms of showcase. It's also a great idea to group examples by categories. So when you click on radial-charts you get a fast intro for all possible backends and links to all (!) known examples using this one.

P.S. Keeping in mind the notebook JSON format, I believe the process of creating such a gallery can be automated and even split out into a separate package, so that other holoviews users can organize their in-house examples into a gallery...

jlstevens commented 7 years ago

@thoth291 Don't worry, feedback is always welcome!

I do believe I had seen datavizcatalogue.com before but had forgotten about it. It is a very slick website and does seem to be well organized. This is something I believe I'll be discussing with @jbednar and @philippjfr later so thanks for the link!

jlstevens commented 7 years ago

After a long meeting with Philipp, we've come up with the following plan on how to structure user-runnable examples:

holoviews
 |_ examples           # *Everything* must be Python2/3 compatible
  |_ index.py
  |_ Index.rst         # Generated by index.py from everything in examples
  |_ README.rst        # Info on template and how to contribute

  |_ getting_started   # Our core introduction to HoloViews
    |_ ...

  |_ user_guide        # Our detailed docs
    |_ ...             # Core cross-backend guides
    |_ bokeh
    |_ matplotlib

  |_ elements          # Website: Elements gallery page, PNGs on S3 assets.holoviews.org
    |_ bokeh
    |_ matplotlib

  |_ tutorials         # Now means something different, see the plan below
    |_ ...             # Cross-backend tutorials
    |_ bokeh
    |_ matplotlib

  |_ demos             # Website: 'Gallery' page, subsection 'demos'. PNGs on S3 as well.
    |_ ...             # Few cross-backend demos
    |_ matplotlib
    |_ bokeh

                       # Feature-focused examples (e.g. plotting hooks)
  |_ features          # Website: Linked to from the future release page(s). Also a subsection of 'Gallery'.
    |_ ...
    |_ matplotlib
    |_ bokeh

  |_ streams           # Website: 'Gallery', subsection 'streams' # Ideally live, definite .gif (.gifv, press play)
    |_ ...             # Cross-backend examples
    |_ bokeh           # Linked streams
    |_ matplotlib      # 3D example
    |_ paramnb

  |_ apps              # Website: 'Gallery', subsection 'apps'. # Ideally live, definite .gif (.gifv, press play)
    |_ bokeh
    |_ matplotlib      # flask app based on webagg

                       # Website: Nothing planned (potentially in the Gallery)
  |_ scripts           # Pure Python
    |_ ...             # Cross-backend examples
    |_ bokeh
    |_ matplotlib

                       # Website: Replaces current Examples page
  |_ topics
    |_ general           # Fractals
    |_ simulation        # Game of Life, Boids, Hipster model
    |_ cartographic      # Geoviews
    |_ machine_learning
    |_ neuroscience      # e.g. Imagen

  # Topics get subdirectories as they get populated (min. 2 examples needed)
  # These topics will be curated so *just* the notebooks we are happy to keep supporting.

Note that this involves moving our current tutorials out of doc/Tutorials. There should be no .rst files in examples except README.rst and Index.rst (for GitHub rendering, not sphinx).
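
To make the role of index.py more concrete, here is a minimal sketch of how Index.rst might be generated by walking the examples directory. It assumes Python 3 (it uses `pathlib`, so it is not Python 2 compatible as written), groups notebooks by their top-level folder, and the `build_index_rst` name is hypothetical; the real script may well work differently:

```python
from pathlib import Path


def build_index_rst(root):
    """Return reStructuredText listing every notebook under `root`.

    Notebooks are grouped by their top-level subdirectory (elements,
    demos, ...). Notebooks sitting directly in `root` are skipped.
    """
    root = Path(root)
    sections = {}
    for notebook in sorted(root.rglob("*.ipynb")):
        rel = notebook.relative_to(root)
        if len(rel.parts) < 2:
            continue
        sections.setdefault(rel.parts[0], []).append(rel.as_posix())

    lines = ["Examples", "========", ""]
    for section in sorted(sections):
        # RST section title with an underline of matching length.
        lines += [section, "-" * len(section), ""]
        # RST external hyperlink syntax: `text <target>`_
        lines += ["* `{0} <{0}>`_".format(path) for path in sections[section]]
        lines.append("")
    return "\n".join(lines)
```

Calling `build_index_rst("examples")` and writing the result to `examples/Index.rst` would give GitHub something to render. A real version would probably also need to skip `.ipynb_checkpoints` folders.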

The plan for holoviews-contrib is to mirror the structure above. Users can contribute there, and we move over to this repo any content that is good and that we want to maintain.

We also agreed that all examples should have a table at the top (excluding tutorials) with this structure:

Metadata for everything in examples (except tutorials)

title:          My example
dependencies:   bokeh, datashader
references:     [original model](...) [code inspiration](...) [paper](...)   # Crediting links
topics:         machine-learning   # Need to decide if we want to keep this field
backends:       [matplotlib](...) [matplotlib-nbagg](...) [bokeh](...) [bokeh-server](...)

Note that the backends list includes the renderer modes and all backends should be listed, including circular links to the same page. E.g. an example in examples/demos/bokeh would be expected to have a single entry for 'bokeh' that just links back to that page.

Lots of work to be done!

jbednar commented 7 years ago

Why are README.rst and Index.rst rst, if they are only for GitHub? Why not .md?

How will holoviews-contrib handle outdated examples? Seems like it will need to store the hv version they were created for, so that they are runnable by some version even if not the current one.

Will examples/index.py be able to read a category tag (potentially multiple ones, for a hierarchy?) stored with each example, so that they can be grouped automatically? That way we can have some structure without requiring a master list.

I can't tell what you mean for the structure of the "tutorials" directory, from the items in the skeleton above.

Do we really need demos, features, and quickstarts to be separate categories? Seems unwieldy.

"# Website: Nothing planned" -- not sure what that's referring to. Seems like everything needs a website or it doesn't exist.

If you list "dependencies: bokeh, datashader" I think it should be in the form of something that can be automated, e.g. a couple of items that can be added to a master environment.yml that is then sufficient to run the examples. Maybe environment.yml isn't needed, depending on what gets installed already with hv.

What about GeoViews? Can those examples be part of this same structure, but with a "geoviews" dependency?

philippjfr commented 7 years ago

No strong preference for rst over md, it's just what we've been using elsewhere including the main README.rst, CHANGELOG.rst and everywhere on the website.

> Will examples/index.py be able to read a category tag (potentially multiple ones, for a hierarchy?) stored with each example, so that they can be grouped automatically? That way we can have some structure without requiring a master list.

I suggested tags of some kind, but we'd need some kind of master list of tags so we don't end up with a bunch of tiny categories. Otherwise I'd be happy to add it to break the Gallery down into smaller sections and potentially create something like the visualization catalogue suggested above.
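
One way to picture tags checked against a master list is sketched below. The tag names, the `MASTER_TAGS` set and the `group_by_tag` helper are all hypothetical; the point is only that unknown tags get reported instead of silently creating tiny categories:

```python
from collections import defaultdict

# Hypothetical master list of allowed tags.
MASTER_TAGS = {"geographic", "simulation", "machine-learning"}


def group_by_tag(examples):
    """Group example names by tag, reporting tags not in the master list.

    `examples` maps an example name to its list of tags. Returns a pair
    (groups, unknown): `groups` maps each allowed tag to the examples
    carrying it, `unknown` maps an example to its rejected tags.
    """
    groups = defaultdict(list)
    unknown = {}
    for name, tags in sorted(examples.items()):
        rejected = [tag for tag in tags if tag not in MASTER_TAGS]
        if rejected:
            unknown[name] = rejected
        for tag in tags:
            if tag in MASTER_TAGS:
                groups[tag].append(name)
    return dict(groups), unknown


groups, unknown = group_by_tag({
    "boids.ipynb": ["simulation"],
    "life.ipynb": ["simulation", "cellular"],
})
print(groups)
print(unknown)
```

The `unknown` mapping could be surfaced as a warning when the gallery is built, which keeps the decision about adding a new tag an explicit one.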

> Do we really need demos, features, and quickstarts to be separate categories? Seems unwieldy.

Either that or consistent tagging, I do think quickstarts should probably be separate from the other two though.

"# Website: Nothing planned" -- not sure what that's referring to. Seems like everything needs a website or it doesn't exist.

That's for script examples; I suppose those could get built somehow and included on the website.

> If you list "dependencies: bokeh, datashader" I think it should be in the form of something that can be automated, e.g. a couple of items that can be added to a master environment.yml that is then sufficient to run the examples. Maybe environment.yml isn't needed, depending on what gets installed already with hv.

Not sure what you're suggesting, do you mean that they should also include version numbers or something? Or do you want to generate an environment.yml that covers all the example dependencies somehow?

> What about GeoViews? Can those examples be part of this same structure, but with a "geoviews" dependency?

Those should probably go in the topics/cartographic directory.

jbednar commented 7 years ago

> do you mean that they should also include version numbers or something? Or do you want to generate an environment.yml that covers all the example dependencies somehow?

It doesn't have to be generated, just generatable, i.e. a user (or a computer) should be able to start with a reference environment.yml, extract the list of extra dependencies for this example, run a conda command (or patch the end of the .yml file) to get those dependencies, and run the example. If that's automated then so much the better! If they are automated and continuously tested then they don't need version numbers, since we'll know when things break and we can see the version numbers in the logs for the last working one. If it's not automated it probably needs version numbers.
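
As a rough sketch of the "patch the end of the .yml file" route: the snippet below appends an example's extra dependencies to a reference environment file. It assumes the `dependencies:` list is the last section of the file and that dependencies are plain, unversioned names; the `hv-examples` environment name and the `patch_environment` helper are hypothetical:

```python
# Hypothetical reference environment; the name and contents are made up.
REFERENCE_ENV = """\
name: hv-examples
channels:
  - conda-forge
dependencies:
  - holoviews
  - matplotlib
"""


def patch_environment(reference, extras):
    """Append missing `extras` to the trailing dependencies list.

    Assumes `dependencies:` is the final top-level key, so new entries
    can simply be added at the end of the file.
    """
    lines = reference.rstrip("\n").splitlines()
    start = lines.index("dependencies:")
    # Names already listed after the `dependencies:` key.
    present = {
        line.strip()[2:].strip()
        for line in lines[start + 1:]
        if line.strip().startswith("- ")
    }
    for dep in extras:
        if dep not in present:
            lines.append("  - " + dep)
            present.add(dep)
    return "\n".join(lines) + "\n"


print(patch_environment(REFERENCE_ENV, ["bokeh", "datashader"]))
```

The patched text can then be written to a file and handed to conda. A sturdier version would parse the YAML properly rather than editing it as text.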

> topics/cartographic

Sounds reasonable.

jlstevens commented 7 years ago

@jbednar What you proposed sounds desirable but I'm not sure what could support that at a technical level right now.

jbednar commented 7 years ago

If you don't support that, how will you get all the various examples to run, with dependencies that may conflict? Just have one master environment file that covers all, and reject any example that doesn't fit that?

jlstevens commented 7 years ago

That is the only suggestion that I think would work, at least initially. Testing is a whole new can of worms!

philippjfr commented 7 years ago

All examples included in the main repo should be kept up to date and should work with the most recent versions of the libraries they depend on. So there should be no conflicts.

jbednar commented 7 years ago

If that's the policy for maintenance, that should be fine. But for users, it would be far nicer if they could have a minimal, not a maximal, environment.yml, plus a few that are very specific to this particular example. Eventually the hv examples could cover all Python libraries ever. :-)

jlstevens commented 7 years ago

I don't think we are proposing adding any dependencies to what is installed. As far as I'm concerned we only need to expand what is installed on travis when it comes to testing.

Users will be able to consult each example to see any additional dependencies and install them as required. These example dependencies shouldn't be forced on users in any way!

jlstevens commented 7 years ago

Note that each example will list any additional dependencies explicitly, and we know a large number of examples have no special dependencies. I.e. they may need the matplotlib or bokeh backend, but that should be about it.

jbednar commented 7 years ago

Sure, sounds good. I'm just suggesting that when we list such extra dependencies, we do them in a machine-automatable way even if we don't actually automate that at present.

jlstevens commented 7 years ago

Part of the plan is a script to extract the titles from the metadata tables and generate an index. It could also collect all the specified dependencies and list them too.
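
A minimal sketch of what such a script might do, assuming each example's metadata is available as plain `key: value` text in the format shown earlier in this thread. Only the `title` and `dependencies` fields are read, and both function names are hypothetical:

```python
def parse_metadata(text):
    """Extract the title and dependency list from 'key: value' lines."""
    meta = {"title": None, "dependencies": []}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # not a 'key: value' line
        key = key.strip().lower()
        if key == "title":
            meta["title"] = value.strip()
        elif key == "dependencies":
            meta["dependencies"] = [
                dep.strip() for dep in value.split(",") if dep.strip()
            ]
    return meta


def build_index(blocks):
    """Return (titles, dependencies) collected over all metadata blocks."""
    parsed = [parse_metadata(block) for block in blocks]
    titles = [meta["title"] for meta in parsed]
    deps = sorted({dep for meta in parsed for dep in meta["dependencies"]})
    return titles, deps


titles, deps = build_index([
    "title: My example\ndependencies: bokeh, datashader",
    "title: Other example\ndependencies: bokeh",
])
print(titles)
print(deps)
```

The sorted, de-duplicated dependency list is the kind of output that could feed a test environment or be listed alongside the index.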

jbednar commented 7 years ago

Sounds like a good idea to me. I'm bringing this up now because we're facing similar issues for datashader, which has some examples with very heavy dependencies that do not play nice with each other.

jlstevens commented 7 years ago

Part of this should happen in time for 1.7.1 even if everything isn't added. We'll reassign this issue to 1.8 as soon as 1.7.1 is out.

jlstevens commented 7 years ago

@jbednar @philippjfr

I've renamed this issue as it now contains the overall documentation plan we came up with for HoloViews.

Note that I also edited the suggested directory structure of examples/ above to reflect the proposed structure below (I decided to remove the proposed 'quickstarts' as the notebooks I have can move to topics/simulation)

Documentation sections

Each section below is marked with its priority, (1), (2) etc., which reflects both its priority as a task and its general importance.

Gallery

|_ Element (priority) (1)
|_ Linked Streams
|_ Demos

Getting Started (2)

These docs are about HoloViews, starting from zero, quickly introducing the core terms and concepts with a focus on how everything fits together. They should be succinct, with the goal that a person could reasonably get through all 5 sections in 2-3 hours without feeling overloaded. They will continually link to the user guide for comprehensive detail.

There will be 5 core sections with one optional section:

Elements + Composition + Containers, introducing Dimensions
Backends, Options and Magics (%output/%opts)
Dataset + methods (links off to the user guide)
DynamicMap (introducing Streams, mentioning dynamic operations)
Operations (introducing dynamic operations)
(Optional): Design principles (short version of rich display)

User Guide

A set of polished notebooks about various holoviews features (can be core or side-features) that go into as much detail as possible to discuss particular aspects of HoloViews at length. Each notebook is more granular and focused than in 'Getting Started' and these notebooks will be mostly adapted from our existing tutorials. Core user guides should always recommend people read 'Getting Started' first, linking to the appropriate section.

Core User Guides (3)

These will be linked prominently from the 'Getting Started' sections. They will mirror the 'Getting Started' topics, but there will be more core user guide pages than 'Getting Started' sections (more granular). They will also link to the supplementary user guide pages.

Dimensions
Composition (i.e overlays and layouts)
Nesting data (replacing Composing data, hierarchy of structure, traversal, map)
Containers (casting, faceting, .overlay, .layout)
Gridded data
Columnar data
HoloMaps
Operations
Options
Magics
DynamicMap
Streams

Supplementary User Guides (4/5)

Detailed information of features that aren't considered core.

Plotting with Matplotlib
Plotting with Bokeh
Plotting with Plotly
Linked Streams
Dynamic Pipelines
Exporting
Bokeh apps
Deployment
Dashboards
Continuous Coordinates
Renderer and plots (user-centric)

Glossary / Cheat sheets

Maybe this is part of the User Guide? For 'Cheat sheets' one example could be a list of possible element constructors by type (columnar/gridded)

Tutorials (4/5)

Longer, extended examples of working with some dataset, i.e. this is data-centric documentation using HoloViews as a tool. Will recommend that users have read the 'Getting Started' guide first and will link to the User Guide to ensure readers pick up the relevant HoloViews concepts.

Working with image data (numpy)
Working with columnar data (pandas)
Working with gridded data (numpy, xarray, iris)
Working with large data (datashader)

Topics (6)

Domain specific tutorials. Just like Tutorials in that they are about a task that the user wants to perform, using HoloViews.

Working with geographic data
Working with network data
Working with simulations

Note that we now plan to remove the existing tutorials entirely, but when we do, we should have a page (either in an issue or PR) with a list of links pointing into the git history with the final version of the notebook before it was deleted.
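
The page of history links could be produced with something like the sketch below. It relies on GitHub's `blob/<commit>/<path>` URL form; the commit SHA shown is a placeholder, the notebook file name is assumed, and both helper names are hypothetical:

```python
def history_link(repo, sha, path):
    """Permalink to `path` as it existed at commit `sha` on GitHub."""
    return "https://github.com/{}/blob/{}/{}".format(repo, sha, path)


def history_page(repo, last_versions):
    """One link per removed notebook.

    `last_versions` maps a file path to the SHA of the last commit that
    still contained the file (to be looked up from the git history).
    """
    return "\n".join(
        "- [{}]({})".format(path, history_link(repo, sha, path))
        for path, sha in sorted(last_versions.items())
    )


# Placeholder SHA and assumed file name, for illustration only.
print(history_page("holoviz/holoviews",
                   {"doc/Tutorials/Showcase.ipynb": "abc123"}))
```

Linking to a commit rather than a branch means the links keep working after the files are deleted from master.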

It is a big plan but at least we also have a clear set of priorities and a framework to fit everything in!

jlstevens commented 7 years ago

I forgot to include the glossary so I made an edit above. It should be fairly easy to add.

jbednar commented 7 years ago

> Core User Guides

These look good, but if the titles in the user guide mention HV types, I think those should be in parentheses -- they should be titled using the underlying concepts, not about HoloMap or DynamicMap as objects. E.g. maybe "High-dimensional data as animations or sliders (HoloMaps)" or "Dynamically generating HoloViews objects as needed (DynamicMaps)". It's a user guide, not a reference guide.

jlstevens commented 7 years ago

Sure, though those suggestions are also a tad long! ;-p

jbednar commented 7 years ago

True -- hopefully we can come up with something simpler and more direct. As for the rest of the plan, it looks good overall, but I'd need to have some more time than I have now to really study it and think about it.

thoth291 commented 7 years ago

I discovered that there was a PLOTCON conference recently, and this talk made me think about some of the things I'm doing with plots: https://www.youtube.com/watch?v=_KEl-Spdaz0

Coming back to my recent post in this thread - I think it's definitely the right thing to keep some learning sections in the documentation, in the spirit of Plotting Pitfalls by @jbednar, the datavizcatalogue sections on when to use each chart, or the rules Elijah is talking about in his post.

The first thing that made me confident in datashader and holoviews was the clear feeling that you guys deeply understand what you are doing. It's not just yet another plotting library; it's a conceptually new way of dealing with data, abstracting away from data layout in favor of data annotations to do appropriate and reproducible data manipulation and plotting.

Please don't forget about documentation of that kind: teaching scientists the right way to tell the story of their data using holoviews, and pointing out what can go wrong without it, is a crucial part of your commitment to this project for people like me.

Thank you very much for your hard work!

P.S. How come you were not presenting at PLOTCON?

jbednar commented 7 years ago

We can't be everywhere! :-) We'll be talking about HoloViews, Datashader, and Bokeh at SciPy in Austin, JupyterCon in NYC, and FOSS4G in Boston, two of which are in-depth tutorials.

Yes, we're working on better overview/howto guides. We'll soon be making a new site that covers multiple projects (HoloViews, Bokeh, and Datashader at least) that shows how everything fits together to solve real problems. Stay tuned!

philippjfr commented 7 years ago

Summary from our meeting today:

Getting started guide is agnostic to background (starting with python and numpy knowledge)
User guide might have sections for users to "unlearn" matplotlib/bokeh, imperative plotting
Add Getting Started section 0 for core concepts

Getting started guide

Section 0: Core concepts
Section 1: Elements/Composition/Spaces
  Two simple element examples (pandas timeseries and numpy image & point forward to Section 3)
  Show basic composition
  Show HoloMap (bring in concept of Dimensions)
  Redim HoloMap/Element dimensions
Section 2: Separation of content/presentation: Backends/Options
  Start with a bokeh plot, customize it
  Switch to matplotlib and then explain that you need to set new options (starting with bokeh)
  Present magics as convenient way of setting options
Section 3: Dataset
  Different types of accepted formats (pandas, numpy, xarray...)
  Explain indexing (select, slice, index)
  Explain faceting (.overlay, .grid, .layout)
  Explain other methods
Section 4: DynamicMap
Section 5: Operations
Section 6: Design Principles

philippjfr commented 7 years ago

This afternoon we also discussed our current plan for the user guide, here's what we're doing with existing tutorials:

Keep as tutorials

Showcase
Introduction (Rewrite or delete later)
Exploring Data

Ready to move to user guide now and adapt later

Sampling data
Columnar Data (top section as user guide, bottom as "Working with Columnar Data" tutorial)
DynamicMap
Options
Exporting
Continuous Coordinates
Composing Data
Bokeh Backend (rename to "Plotting with bokeh" and make mpl equivalent)
Streams
Linked Streams
Gridded Data (new and uncommitted)
Bokeh Apps (new and uncommitted)

Examples/Gallery

Containers: Split into individual gallery notebooks

Before moving we need to come up with good names for all the user guides.

jlstevens commented 7 years ago

It has been a ton of work but now this plan has been implemented. Very happy to close this one!

holoviz / holoviews

Documentation plan #1379
