mne-tools / mne-python

MNE: Magnetoencephalography (MEG) and Electroencephalography (EEG) in Python
https://mne.tools
BSD 3-Clause "New" or "Revised" License

DOC: single entry point and linear path through tutorials #5937

Closed drammock closed 5 years ago

drammock commented 5 years ago

This issue is for planning the next round of doc improvements. The goals for this round:

  1. a single entry point for new users for a "getting started" or "60 minute blitz" tutorial or sequence of tutorials.
  2. a linear path through tutorials (even if not part of the "getting started" series)
  3. general improvements to all tutorials:
    • more verbose explanations
    • more links to methods literature
    • more info about sensible ranges for function/method parameters

cc @agramfort @dengemann @larsoner @massich @choldgraf

larsoner commented 5 years ago

All sounds good in principle. In practice, (2) might be tough given that people can take different routes. For example, Lauri's and MNE's diagrams look like this, and they only cover source localization:

source_modelling_workflow

cookbook

Neither is complete, MNE has many more things you can do (sensor space analyses, ML, stats, TF, connectivity, ...). Maybe first we should:

  1. Complete a diagram. Probably the MNE one, since it's programmable. First make it complete, then make it pretty like the first one.
  2. Make the tutorial flow like the diagram, somehow (?).

Since there is no single path, we could consider trying to make something like the sklearn cheat sheet.

drammock commented 5 years ago

Yes, people can go different routes, and not all routes lead to SourceEstimate. What I'm envisioning is:

  1. the tutorials gallery page should get some structure (in the form of subheadings) similar to the "auto examples" gallery
  2. we pick one workflow to use in the "getting started" tutorial sequence, and put those tutorials under the first subheading of the new tutorials gallery
  3. other "workflows" could be constituted from other sequences of tutorials, and could likewise have their own subheading on the tutorials gallery page.
  4. any tutorials not part of a "workflow" sequence can go under subheadings that are thematic (for example, a collection of a few different tutorials related to connectivity --- may not form a natural progression, but instead may be a "pick whichever one of these fits your use case" kind of collection)
  5. as a last resort, there can be a "miscellaneous tutorials" subheading at the end, for things like "the info data structure" tutorial maybe.

Regarding the diagram: what do you think about each "workflow" tutorial series starting out with its own diagram outlining that particular workflow? So there would be multiple different diagrams (rather than one-diagram-to-rule-them-all)? I'm worried that trying to fit all of what MNE-Python does into a single diagram may be a bit of a rabbit hole.

I'll also add that it might not make sense for the tutorials to match 1-to-1 with the nodes in those existing diagrams --- for example, we might decide that the structural MRI -> conductor model and structural MRI -> source space steps in Lauri's diagram should be combined into a single tutorial page. If so, that might influence how the diagram gets drawn, so I think it makes sense to firm up what workflow the diagram is supposed to represent before deciding what nodes should be in it.

larsoner commented 5 years ago

Regarding the diagram: what do you think about each "workflow" tutorial series starting out with its own diagram outlining that particular workflow? So there would be multiple different diagrams (rather than one-diagram-to-rule-them-all)? I'm worried that trying to fit all of what MNE-Python does into a single diagram may be a bit of a rabbit hole.

I think we should try to make a monolithic diagram first, and if it becomes unwieldy, separate it. So far we have mostly avoided MNE objects feeding back onto themselves recursively or anything challenging like that, so I'm optimistic it's possible (hopefully without looking terrible!).

One motivation is that most workflows will involve the same steps (reading data, preprocessing, epoching), and it's obvious this is the case in a single diagram, and not so obvious if diagrams are separated. And if you want to start from raw data and get to some node (or multiple nodes), it's clear which steps are needed.

If we tried this method, then the "Getting started" could just cover step-by-step the most common workflow, probably raw->epochs->evoked->bem+src->fwd+cov->inv->source estimate or so (this is more or less what's outlined in both diagrams above, actually). In the end this could just be a subset of the global workflow / diagram.

If it would help, I could take some time to try to add all the main MNE-capable nodes to the existing diagram, and we could see how it looks. Or if it's already clear to you what would be added where, feel free to do it. In theory there are just 40 lines to modify/add to (especially if you delete the custom blocking/hierarchy code that follows the primary construction of the diagram):

https://github.com/mne-tools/mne-python/blob/master/doc/sphinxext/flow_diagram.py#L25-L66

agramfort commented 5 years ago

Yes, people can go different routes, and not all routes lead to SourceEstimate. What I'm envisioning is:

the tutorials gallery page should get some structure (in the form of subheadings) similar to the "auto examples" gallery

we pick one workflow to use in the "getting started" tutorial sequence, and put those tutorials under the first subheading of the new tutorials gallery

The tutorials are flat now because they were meant to be linked from documentation.rst (just to tell you the historical decision). I don't have any strong feelings about it, but I would start from documentation.rst again to put in some highlights, and make flow diagrams to help people get the big picture.

drammock commented 5 years ago

I spent some time iterating the flowchart today. (I did it with mermaid as it's faster to iterate because it regenerates automatically; I can translate back to pygraphviz when it's finalized.) Even with minimal additions (more thorough annotation of arrows and addition of the pipeline for getting anatomical labels) it starts to get hairy:

orig-flowchart

From there, even adding just four more operations (stc.in_label(), stc.to_label(), stc.extract_label_time_course(), and mne.spectral_connectivity()) makes it so overwhelming as to be nearly unusable.

expanded-flowchart

Granted, there is some untangling that could be done with something less automated than mermaid, but this exercise has only strengthened my conviction that a "one master diagram" approach is a bad idea. @larsoner suggested we add:

sensor space analyses, ML, stats, TF, connectivity, ...

and I've only really added one of those (plus the label stuff) and it's already too much, I think. Anyone willing to reconsider the "separate diagram for each workflow" idea?

drammock commented 5 years ago

here's a gist for the mermaid source, if anyone wants to play around with it: https://gist.github.com/drammock/17508eed83485c470d93036fdd5bfd9c

Also FYI there is a mermaid plugin for sphinx... probably only worth trying out if we opt for the "many simpler diagrams" approach, since it does poorly with big charts and its layout can't really be tweaked.
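For anyone who hasn't used mermaid, a minimal sketch in its flowchart syntax, roughly in the style of the gist (the nodes and edge labels here are illustrative picks from the workflows discussed above, not the actual gist contents):

```mermaid
graph LR
  raw[Raw] -->|"mne.Epochs()"| epochs[Epochs]
  epochs -->|"average()"| evoked[Evoked]
  epochs -->|"mne.compute_covariance()"| cov[Covariance]
  fwd[Forward] -->|"make_inverse_operator()"| inv[InverseOperator]
  cov --> inv
  evoked -->|"apply_inverse()"| stc[SourceEstimate]
```

The whole graph is plain text, which is what makes the iterate-and-regenerate loop so fast compared to hand-drawn diagrams.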

massich commented 5 years ago

@drammock I did not know mermaid; I was using plantuml (I guess I'm an old fella)

agramfort commented 5 years ago

That's amazing!

Having graphs that are editable in a text editor, and that we can easily generate and update for the doc, is awesome.

larsoner commented 5 years ago

One question is whether the graphs should be conceptual or input-output mappings.

What you built is what I'd consider an input-output map: which functions do I use, and which output types do I need to feed to the next operation. I was thinking more at the conceptual level, at least for the landing page / introduction somewhere. (We can't cover both in a single map for all of MNE because, as you can see, it would become unwieldy: MNE has hundreds of public functions, classes, and methods.)

Instead of the input-output approach where nodes are functions or classes, and edges are computations, to stay at a more conceptual level, we could think of nodes as roughly objects / files, and edges as functions / operations to get from one place to another. Something like this (from gist):

screenshot_2019-02-15_09-51-40

This sort of framework could:

  1. capture the vast majority (90%+?) of what MNE does for most users
  2. make clear how to get from one place to another conceptually
  3. leave out how to do it mechanically

Showing this type of diagram early in the doc lends itself to pointing people toward the next subsection of documentation (not unlike the subsections we have already, perhaps) where they can get more information on each topic. We could potentially make it a clickable image map.

Maybe then subsets of the documentation could have detailed input-output type diagrams like the one you made @drammock? Or maybe after this first diagram, we could show a few different "workflow" diagrams for common things (source localization, connectivity, stats...?).

I'm still not sure how things should be organized, but it seems valuable to me to have both this sort of conceptual overview (early in doc), and the more mechanical diagrams (later in doc).

agramfort commented 5 years ago

To make dreams come true: is it possible to render such a graph at different levels of detail, e.g. to click/zoom and be able to see it on a phone display? Also, can we make the boxes clickable to refer to functions on the API page? Yes, I know I am dreaming out loud...

drammock commented 5 years ago

@agramfort it is possible to bind Mermaid node boxes into hyperlinks or javascript callbacks. Not sure how easy it would be for that to trigger a zoom-in action. But the default in-page rendering of a mermaid diagram is SVG, so a normal phone-screen pinch-zoom should at least yield close-ups of sections of the diagram that are more legible than the PNGs I generated locally with the CLI tool.

massich commented 5 years ago

I tried to find a way to show your green boxes when hovering over the lines, because they represent how you go from one box to the next. But I couldn't find how to do that.


drammock commented 5 years ago

@larsoner OK, now I understand. I was misled by the arrow labels on the original diagram into thinking that you wanted all the arrows labeled, plus new arrows/nodes added. I am fine with the idea of a conceptual diagram for the main documentation landing page, and maybe a few more detailed input-output diagrams for particular sequences of tutorials (AKA "workflows").

larsoner commented 5 years ago

So it sounds like people like the diagrams (conceptual and detailed), as well as image-map-like functionality. Want to try adding the main conceptual + one specific one to #5944? (Or whatever seems like the best way to do it from your perspective?) That way we can see and test an alpha version.

drammock commented 5 years ago

@massich there's supposed to be a way to specify "tooltips" when hovering over nodes, but it didn't work when I tried it yesterday (admittedly, I didn't try very hard to figure out why not). But even if it did work, as far as I know the tooltip only works for nodes, not for connections.

drammock commented 5 years ago

@larsoner I'll add a conceptual one to #5944 (today), and add a specific one to the first "tutorial sequence" in that same PR (probably mid-next-week). Thoughts on mermaid vs. pygraphviz for the final product?

larsoner commented 5 years ago

Mermaid seems better and more easily integrated during doc build

massich commented 5 years ago

Can we make a PR that only adds the mind map? Kind of swapping out the one we have in the cookbook, and iterating from there.


larsoner commented 5 years ago

I'm not sure that's the best way to go. The contents and layout will depend on how we want to structure the rest of the doc. If the monolithic PR (which #5944 might end up being) seems too unwieldy, we can always separate it out.

jasmainak commented 5 years ago

I'm not sure a super-complex diagram is a good idea. Instead of focusing on fancy new diagrams, I think a clear tutorial giving an overview of what is going on in the cookbook might be more useful. I did recently see Matti using an animated version of the cookbook diagram, which seemed super useful. But you could also do something similar using proper subsections in text.

One needs to be upfront about why so many different components are needed for solving the inverse problem. For a newcomer this can be intimidating. All we are trying to do is solve for X in the equation M = GX + E. Everything from raw to evoked is for getting M. Everything from T1 to forward solution is for getting G, and the noise covariance is for getting the right E (with diagonal Gaussian noise). Now, in principle you can compute X hat directly, but it's costly to invert matrices, so you precompute the inverse operator once and then do a plain dot product with M to get X hat.

So really there are three different components to the whole thing, and then you can go into the details of each. Not volunteering to write this, though :)
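To make that correspondence concrete, here is a toy numpy sketch of the three components (this is not MNE's actual inverse computation: real minimum-norm estimates also involve source covariance, whitening, and SNR-based regularization; all names and dimensions here are invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
n_sensors, n_sources, n_times = 10, 20, 5

# G: gain/forward matrix (the T1 -> forward solution pipeline)
G = rng.standard_normal((n_sensors, n_sources))
# X: true source activity (what we want to recover)
X = rng.standard_normal((n_sources, n_times))
# E: diagonal Gaussian sensor noise, with covariance C
noise_var = 0.01
C = noise_var * np.eye(n_sensors)
E = np.sqrt(noise_var) * rng.standard_normal((n_sensors, n_times))

# M: measurements (the raw -> evoked pipeline produces these)
M = G @ X + E

# Precompute the inverse operator once (the costly matrix inversion) ...
K = G.T @ np.linalg.inv(G @ G.T + C)
# ... then the estimate is just a matrix product with M
X_hat = K @ M

print(X_hat.shape)  # (20, 5)
```

The point is the shape of the computation: the expensive inversion happens once, after which applying the operator to any new M is cheap.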

drammock commented 5 years ago

@jasmainak I like the idea of adding in the correspondence between the equation and the flowchart. And I agree that a "super-complex" diagram is a bad idea. Are you suggesting that the WIP diagram on CircleCI from #5944 here is still too complex?

agramfort commented 5 years ago

Just a quick remark: this flowchart shouldn't be too minimum_norm-centered. It should support the beamformer and sparse solver use cases, so I would not detail the two-step minimum_norm approach. To get source estimates (not just surface-based ones) you need sensor data (evoked, epochs, raw) and an inverse method, and you get source estimates or dipole locations. Maybe we should simplify a bit and be less precise here.

my 2c

jasmainak commented 5 years ago

@drammock I had not seen this before. The diagram right now is a great summary for those already familiar with EEG/MEG processing, but not good as an entry point, in my opinion. It might help to get feedback from a newcomer to see how comprehensible it is.

We don't necessarily need to use equations, but maybe it could help. As @agramfort says, I agree it would help to be a little less precise here. My main suggestion is to break it down into 3 flowcharts / workflows:

It should go with accompanying text (no code) and be explained in a single tutorial with links to existing examples for each of the subparts.

jona-sassenhagen commented 5 years ago

My usual EEG-user comment is that all of the source stuff is very uninviting to EEG people. If the entry point and linear path contain a bunch of references to the inside baseball of MEG source reconstruction, that will scare people off.

jasmainak commented 5 years ago

I cannot agree more with @jona-sassenhagen :)

drammock commented 5 years ago

@jona-sassenhagen I think you'll be happy; the linear path so far looks like this (items in parentheses are not yet pushed to #5944 because they're still sketches, and may get reordered or subdivided):

  1. loading raw > querying raw > subselecting data from raw > cropping and concatenating raw > built-in plotting methods of raw > exporting and saving data from raw > (annotations) > (interpolating bads)

  2. background on projections > loading, saving, applying projectors > computing projectors > plotting projectors > working with sensor locations > (setting EEG reference) > (maxwell filtering) > (ICA) > (filtering) > (resampling)

  3. (events) > (epochs from events) > (continuous epoching) > (epoch rejection) > (plotting epochs)

etc

So in other words, we won't get to source imaging until very late in the progression (which might mean that the flowchart ends up getting moved off of the docs landing page... TBD, and opinions welcome).

drammock commented 5 years ago

closing because the tutorials are now in a fixed order and there is a new docs landing page.