Kimth/eval_fit - Githubissues

@ahwillia cc @forea Please review but do not merge.

As we've discussed in the past, I am interested in inspecting and evaluating the quality of dimensionality reduction results at the single neuron level. This PR introduces a new script -- called eval_fit -- that can be used for this purpose. (You will have to add the top-level directory eval_fit to the Matlab path.) While the immediate interest is in evaluating the quality of CPD factorization models, I designed the tool to be generic. Using this tool, we could for instance also investigate the fidelity of other dimensionality reduction methods such as PCA.

Here is a walkthrough on its use, using CPD factorization of c11m1d15 as an example. I used rank=15, non-negative CPD fitting. I uploaded some pre-computed MAT files, so that you can follow along with this tutorial without running the trace export from MultiDay and CPD fitting from scratch:

export-align.mat (60 MB): This MAT file contains the raw data tensor X i.e. the fluorescence traces organized as [num_neurons x intra-trial samples x num_trials]. Also includes info and trial_meta (i.e. the auxiliary outputs of MultiDay.export).
export-align_cpd_rank15.mat (55 MB): This MAT file contains the "fitted" / reconstructed data tensor Xest (with same dimensions as X). Also includes the cpd struct from the fit.

Please download both files and load to the Matlab workspace to proceed with the walkthrough.

The basic idea of eval_fit is to perform comparisons between the tensors X and Xest. Any dimensionality reduction method (e.g. CPD, PCA) can be expanded into the full neural space (i.e. expand to Xest) so eval_fit can generically be used to inspect the fidelity of different dimensionality reduction methods.

To use eval_fit, we begin by pre-computing "fit quality scores" between X and Xest:

>> scores = compute_all_scores(X, Xest)

scores = 
3x1 struct array with fields:
    name
    sort_order
    vals
    vals_n

Note that scores is a struct array of length 3. Initially, I used the R squared as the measure of fit between the traces of X and Xest. However, it was apparent to me that other metrics (such as residuals) were also interesting, and could capture different aspects of the comparison between X and Xest. Hence, the scores array currently implements 3 different measures of fit quality.

For example, scores(1) is the R-squared measure between the traces of X and Xest:

>> score = scores(1)

score = 
          name: 'Rsq'
    sort_order: 'descend'
          vals: [487x150 double]
        vals_n: [487x1 double]

The fields of score are as follows:

score.name is a string that indicates the name of the fit measure.
score.vals is a [num_neurons x num_trials] matrix, where score.vals(i,j) is the R squared value between the real and fitted traces of neuron i, trial j.
score.vals_n is a [num_neurons x 1] matrix, where score.vals_n(i) is the R squared value between the real and fitted traces of neuron i over all trials.
score.sort_order is one of 'descend' or 'ascend', indicating the numerical sort direction to order the fit measure from best to worst. (For R squared, it is 'descend' since sort(..., 'descend') would order the R squared values from best to worst.)

Similarly, scores(2) is the residual between traces of X and Xest:

>> score = scores(2)

score = 
          name: 'Residual'
    sort_order: 'ascend'
          vals: [487x150 double]
        vals_n: [487x1 double]

Note that the sort_order for residuals is 'ascend' since smaller values of the residual indicate a better fit.

Finally, scores(3) is simply the peak-to-peak amplitude of the raw traces (i.e. in X):

>> score = scores(3)

score = 
          name: 'Peak-to-peak'
    sort_order: 'descend'
          vals: [487x150 double]
        vals_n: [487x1 double]

The "Peak-to-peak score" isn't a measure of fit between X and Xest at all, but I still found it convenient to have this "score" during the use of eval_fit.

After pre-computing the scores array, you can invoke the eval_fit script as follows:

>> eval_fit(X, Xest, scores);
Evaluator (Rsq, Cell 1 of 487) >>

which shows the following plot: c11m1d15_evalfit_cell1

The interpretation of the plot is as follows:

Top left panel shows the raw (blue) and fitted (red) trace over all trials.
Top right panel shows a scatterplot between fitted (x-axis) and raw (y-axis) fluorescence values. Dashed black line is the identity diagonal. The title indicates the currently active score (Rsq) for the neuron over all trials.
The next 10 subplots show individual trials with the best fit score (Trials 40, 29, ..., 142). Raw data is shown in blue, the fitted trace is shown as a green line.
The final 10 subplots show individual trials with the worst fit score (Trials 57, 32, ..., 125). Raw data is shown in blue, and the fitted trace is shown as a red line.

You can jump to another cell by typing in the cell number:

Evaluator (Rsq, Cell 1 of 487) >> 4
Evaluator (Rsq, Cell 4 of 487) >>

which updates the plot accordingly: c11m1d15_evalfit_cell4

You can also increment the cell index by 1, by just pressing Enter in the prompt:

Evaluator (Rsq, Cell 4 of 487) >> 
Evaluator (Rsq, Cell 5 of 487) >> 
...
Evaluator (Rsq, Cell 9 of 487) >> 
Evaluator (Rsq, Cell 10 of 487) >>

which shows: c11m1d15_evalfit_cell10

Note that the currently active measure of fit is 'Rsq', as indicated in the command prompt. Type 's' to see the list of all available scores:

Evaluator (Rsq, Cell 10 of 487) >> s
  Available scores:
  1: Rsq
  2: Residual
  3: Peak-to-peak

You can switch into a different score type (for example, the residual) as follows:

Evaluator (Rsq, Cell 10 of 487) >> s2
Evaluator (Residual, Cell 10 of 487) >>

c11m1d15_evalfit_cell10_residual

Note that the trials that are best or worst under one measure is not necessarily the best or worst in another measure. In particular, the residual measure is effective at highlighting trials that have a large absolute discrepancy between raw and fitted trials.

The Peak-to-peak "score" allows me to look at the quality of the fit for trials where there is a large / small trace amplitude:

Evaluator (Residual, Cell 10 of 487) >> s3
Evaluator (Peak-to-peak, Cell 10 of 487) >>

c11m1d15_evalfit_cell10_amplitude

While "Peak-to-peak" isn't really a measure of fit between X and Xest, it allows me to quickly inspect that the fits are valid for trials where the actual trace is particularly active or silent. (For instance, it could be the case that a fit predicts a constant pattern for all trials, even when the raw data is silent. I would consider this to be a fairly offensive case of the fitting model "making up features" not present in the data.)

You can exit the interaction loop by typing 'q':

Evaluator (Peak-to-peak, Cell 10 of 487) >> q
>>

The sort_scores function is a convenience method that sorts the neuron indices in order of fit quality (best to worst).

For example, to sort neurons by their R squared score:

>> scores(1)

ans = 
          name: 'Rsq'
    sort_order: 'descend'
          vals: [487x150 double]
        vals_n: [487x1 double]

>> sorted_scores = sort_score(scores(1));

yields a [num_neurons x 2] matrix sorted_scores that indicates the neuron index and the corresponding R squared values, ordered from best to worst:

sorted_scores =

   69.0000    0.9030
  361.0000    0.9024
  360.0000    0.9017
  486.0000    0.8942
  441.0000    0.8848
...
  371.0000    0.0686
   58.0000    0.0678
   71.0000    0.0572
   11.0000    0.0389
   36.0000    0.0318

I found it convenient to have the sorted_scores displayed in a Matlab Variables window on the side, while I inspect cells in eval_fit.

@ahwillia I created a helper function group_fits which groups cells based on their fit score to one of: "good", "okay", "bad".

With X and Xest loaded into the workspace:

>> scores = compute_all_scores(X, Xest);
>> [good, okay, bad] = group_fits(scores(1));

yields:

good: IDs of cells with R squared value in [0.7, 1];
okay: IDs of cells with R squared value in [0.4, 0.7);
bad: IDs of cells with R squared value in [0, 0.4).

The thresholds of 0.4 and 0.7 are hard-coded in group_fits. These values are just a starting point, and certainly aren't set in stone.

Accordingly, the eval_fit script can be invoked with optional varargin'cells' to browse a subset of the cells. For example, to evaluate the "good" cells only:

>> eval_fit(X, Xest, scores, 'cells', good);
Evaluator (Cell 4, Rsq=0.7082) >>

Aside: You may not be well are of the browse_rasters function for inspecting single-cell rasters of a given DaySummary, but it also has a similar 'cells' optional input:

>> browse_rasters(m1d15, 'cells', good);
Raster browser (page 1 of 10) >>

c11m1d15_good-cells

Added a new score, called "Signed diff", which is essentially sum(X-Xest,2). (Recall that 2nd dimension is the intra-trial time.) Note that unlike the residual, this score is signed. In particular:

If the signed diff is positive, this means that the fit (Xest) underestimates the data (X).
If the signed diff is negative, this means that the fit overestimates the data (or, in other words, predicts "spurious activity").

The "Signed diff" score allows me to look for "spurious activity" in the fitting:

>> scores = compute_all_scores(X, Xest);
>> eval_fit(X, Xest, scores);
Evaluator (Cell 1, Rsq=0.6333) >> 209
Evaluator (Cell 209, Rsq=0.5398) >> s4
Evaluator (Cell 209, Signed diff=-2783.6319) >>

yields the following plot: c11m1d15_cell209_signed-diff where the green traces are the trials for which the fit underestimates the data, and red traces are those where the fit "overestimates" the data.

@ahwillia I added a small function fit_nnmf.m which:

Unwraps the data tensor X into a 2D matrix (call it Xu),
Factor Xu via nnmf into rank-r terms,
Reconstructs the data tensor Xest based on the NMF factorization.

I highlight the function here, so that you can evaluate its correctness. I'm writing an initial comparison to the CPD approach is coming as a separate issue.

I find the neural unfolding implemented here to be the most intuitive to me personally (i.e. just "unwrapping" and "rewrapping" in time). I made this visualization to help me get a feel for the data layout for all six possible unwrappings:

unfoldings

@ahwillia Updated fit_nnmf.m so that it supports unfoldings along neurons (unfold_dim=1) or trials (unfold_dim=3).

Note: I use the term "fast" and "slow" dimensions to describe the ordering of the data points along the rows of the unfolded matrix Xu. In my nomenclature, the "fast" dimension is the dimension that changes "more quickly" as you read the entries of Xu(k,:) from left-to-right.

For example, in both of the unfolded matrices below, the fast dimension is the intra-trial time:

Another metric that we may consider comparing between NMF and CPD is the similarity of the factorization results for multiple instantiations of the fit. I just ran NMF factorization on the neuron unfolding (to make sure I didn't break the code), and it seemed that the set of factors seemed somewhat "different" than the first set that I analyzed in #235.

Here is a quick visualization of the trial loadings obtained by NMF on trial-unfolded X. I used rank=15 as usual, and the green/red color code on the far right corresponds to trial_meta.correct: c11m1d15_trial_unfold_raster

Based on this raster, it seemed to me that the k=7 factor may correspond to trial correctness. I then looked at the trial loading values for this factor, where the markers are color coded by correctness: c11m1d15_trial_unfold_k7

Some observations:

Confirmed that the k=7 factor does correspond to trial correctness, as obtained via NMF of the trial-unfolded data tensor.
Also observe "integration" type of phenomenon for this factor dynamics. Similar to what we have seen in some corresponding CPD factors.
I find it interesting that, for the first two error trials (Trials 5 and 8), there is no "activation" of this factor despite nominally getting the trial wrong. I've previously pointed out that in these trials the mouse may be "knowingly making a mistake", i.e. this type of error factor might distinguishes between expected and unexpected errors. (Additionally, there is evidence that some single cells show this type of distinction as well.)

kimtonyhyun / analysis

Kimth/eval_fit #231