Improve generated figures

emdupre commented 5 years ago

Suggestions from @handwerkerd for improving generated figures. Transferred from comment on #208.

Colormap options for the components. The red/blue color maps makes the extremes pop out while artifacts in ICA components are sometimes clearer with grayscale. Even if it's out a parameter, having a way to play with the colorscale without making deep edits to the code would be useful.
Each component screen can also include numerical information besides variance. I suggest adding at least Kappa & Rho values.
Whether a component was accepted/rejected/ignored should be visible with each component output. Ideally the part of the decision tree that resulted in that classification can also be included
I'm not sure a bar graph of accepted/rejected/ignored variance has added value over just displaying the numbers. If anything, this might be one of the few cases where a pie chart is visually informative.

dowdlelt commented 5 years ago

I did knock out point 2 as one of the last commits - the kappa and rho values are now included in the title, along with variance.

For point 3 - the timeseries are color coded by accepted/rejected status - but no decision tree information is provided. Not sure of the best way to do it - also no clear to me what those numbers mean, but that is likely because I haven't looked around enough.

For the bar plot to pie chart transfer - I just didn't get to this. I think a version that includes a wedge for % of variance not explained (perhaps in light grey?), in addition to the accepted, etc wedges would be good. Feel that those wedges should display both the percentage, as well as the # of comps.

Looking forward to making some more improvements.

dowdlelt commented 5 years ago

As I am dealing with different data I am noticing more issues with my slice viewer - first it will display a series of empty slices, particularly if the bounding box is large, and extends beyond the mask.

The solution would be to remove all the zero filled planes around the mask, such that the data that goes into the viewer is just limited to the masked region.

One issue if someone collects just a few slices (perhaps they are doing a visual cortex experiment) then the slicer would fail, so there would need to be a check for less than N_cuts following this.

The second big issue is that, depending on how the data were collected and processed, the slices could be sideways/upside down - ideally the program would check orientation, and have a nice rule based system for spinning the 3D matrix, such that slices always appear in a reasonable orientation.

An example of both issues can be seen here - in that the top row shows a nice superior saggital sinus artifact...sideways, and the middle row has completely empty and therefor useless slices.

dowdlelt commented 5 years ago

Solved the first issue by adapting function from https://stackoverflow.com/a/54571830/6003148, as

def trim_edge_zeros(arr):
    mask = arr!=0
    bounding_box = tuple(
                         slice(np.min(indexes), np.max(indexes) + 1)
                         for indexes in np.where(mask))
    return arr[bounding_box]

I am going to test more, before opening pull request - it is also such a small change, kind of want to add a few other features first.

dowdlelt commented 5 years ago

Idea for variance explained pie plot goes something like this:

Where each outer wedge shows the total variance explained, and then breaks down on the inner circle more wedges by each individual components var_expl. In this way you can see the total variance, and then observe that only one or two components (like low freq drift) is contributing it. In addition, the unexplained variance will be quickly visible via the size of the white space. Less pac man like it preferable.

I have to test it against real data rather than this made up business, but if anyone has a strong opinion about it I'm all ears (@handwerkerd ,perhaps)

dowdlelt commented 5 years ago

Update:

With real data, there is a complication in that white spacing between inner wedge ends up becoming just white space if there are too many comps, ex:

This can be fixed by removing the edgecolor = 'w' argument, but that makes the borders somewhat indistinct between different components. So, I randomly shuffle the colormap lists and that highlights boundaries.

I love this because you can see right away that one rejected components is ~50% of all the variance in the data!

Note that the components explained 100% of the variance in this case, hence the note in the bottom. If unexplained variance is larger, for example 16.7%, the graph looks like so:

Fun Saturday distraction. I'll likely replace the bar graph with this. Last problem I would like to focus on is provenance for rejection, and colormap. I figure color map can be passed as text after --png , as in --png bone in the tedana call but I just have to sort that out.

Rejection reason can just be text plopped on, say, the upper right of the current timeseries figures. Perhaps a Sunday Funday for that.

tsalo commented 5 years ago

The figures look beautiful.

A few thoughts:

Indexing columns in the component table by number is not working. You should index by the column names.
The functions in viz need an output directory argument, I think. The current approach relies on the workflow to cd into the output directory, but I ultimately want to remove that step.
Related to the above point, I think we should move the figure subdirectory generation step to the main workflow and remove it from one of the functions (or add it to all of them).
I think we need to revisit how the figures and orthogonalization interact. Orthogonalization overwrites the rejected components in the mixing matrix, and figures are generated from the mixing matrix after orthogonalization. The orthogonalized components are probably not useful for component selection, so we shouldn't plot them. See the figures below for an example.
What do you think about adding vertical bands like this to the FFT plots in the commonly accepted frequencies (0.01 - 0.1)? It would make it easier for people who aren't familiar with these plots to identify odd patterns.
~~Maybe normalize the FFT plots based on area under the curve.~~ Remove y-ticks for FFT and time series plots, since the amplitude value doesn't matter for either, and these values can slightly alter the tightened bounding box.
The figure generation is opening a Python window, which I don't love. If you add import matplotlib and matplotlib.use('Agg') to the top of the file, that will prevent this.
Set the x-axis limits for the time course line plot to 0 and the last volume by replacing set_xbounds with set_xlim.
You can disable annoying matplotlib debug logging by adding MPL_LGR = logging.getLogger('matplotlib') and MPL_LGR.setLevel(logging.WARNING) to the top of viz.py (after the LGR line).
Maybe move get_spectrum from viz to utils.

Original thermal noise component: comp_029

Same component, orthogonalized: comp_029

dowdlelt commented 5 years ago

@tsalo thanks for the feedback! Thats heaps of information!

Regarding adding bands to the fft plot - I'm of two minds. I like the idea of highlighting areas of typical resting state freqs, but I'm not sure how much value it would add. Perhaps that should just be mentioned in the documentation, that way if someone goes through the effort to use a non default step (--png), they will encounter information there?

Regarding orth - that is a concern. I hadn't messed with that setting, so I had completely missed that. Thanks for bringing it up. I think it may be worth saving the old table in general for purposes of troubleshooting. orig_meicamix.txt or somesuch. Thoughts? I haven't looked through that code yet - could the visualization step just be moved up in the processing stream?

Thanks for all the other information, I would have never sorted out most of that on my on, or even thought to try. I'll start working on this in a bit.

dowdlelt commented 5 years ago

Changes all made, excluding the lingering orth issue - opening pull request

ME-ICA / tedana

Improve generated figures #232

Update: