ahwillia closed this pull request 7 years ago.
Since this is a behind-the-scenes code update, just let me know when I should check it for face validity (runs without error; produces CPD output), and we can merge.
Btw, does fit_cpd work on tensors with an arbitrary number of dimensions (not just 3)? Specifically, will it compute the PCA factorization if I provide a two-dimensional matrix X?
It won't give you orthogonal components like PCA, and it would also be slower to fit. We could choose to support this, but I don't see the benefit given that MATLAB already has pca(...) and svd(...) functions. This wrapper function mostly exists as a glorified for loop to run multiple optimizations from different initial parameters (not necessary for PCA).
But to be clean and complete we should probably make sure that it works for 4th-order and higher-order tensors.
> We could choose to support this, but I don't see the benefit given that matlab already has pca(...) and svd(...) functions.
One benefit might be to run a non-negative factorization of a 2-dimensional X. (I don't think that's possible with Matlab pca or svd.)
> But to be clean and complete we should probably make sure that it works for 4th-order and higher-order tensors.
You could just have a few test cases where you build an n-th order tensor from factor products, sprinkle in a bit of noise, and see if you more-or-less recover the original factors.
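A recovery test along those lines might look like this (a sketch, not the project's actual test code; it assumes the Sandia Tensor Toolbox is on the path, with ktensor, tensor, full, and cp_als from that Toolbox, and the rank, sizes, noise level, and tolerance are arbitrary placeholders):

```matlab
% Build a known low-rank 4th-order tensor, add a sprinkle of noise, and refit.
rng(0);
R  = 3;                % true rank
sz = [20 30 40 10];    % 4th-order test case
U  = cellfun(@(n) rand(n, R), num2cell(sz), 'UniformOutput', false);
Xtrue = full(ktensor(U));                 % build tensor from factor products
noise = tensor(randn(sz));
X = Xtrue + 0.01 * (norm(Xtrue) / norm(noise)) * noise;
est = cp_als(X, R);                       % fit at the true rank
rel_err = norm(full(est) - Xtrue) / norm(Xtrue);
assert(rel_err < 0.05, 'original factors not recovered');
```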
Good idea. But there already happens to be a MATLAB function for that as well: https://www.mathworks.com/help/stats/nnmf.html
Thanks for the tip on nnmf -- didn't know about that one.
By the way, what do you think about altering the visualize_neuron_ktensor visualization so that (for display purposes) the neuron factors are sorted by weight on a per-factor basis?
This would make neuron axes not comparable across factors, but it's hard to do this anyway given the current visualization. By sorting each factor independently, I could at least get a sense of how many neurons make up a significant membership in a particular factor (e.g. 30 neurons? 80 neurons?).
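A minimal sketch of that per-factor sorting idea (hypothetical: A here stands for the neuron factor matrix, n_neurons x n_factors, pulled out of the ktensor; the actual internals of visualize_neuron_ktensor may differ):

```matlab
% Sort each neuron factor independently, for display purposes only.
[n_neurons, n_factors] = size(A);
for f = 1:n_factors
    [w, order] = sort(A(:, f), 'descend');  % per-factor neuron ordering
    subplot(n_factors, 1, f);
    bar(w);                                 % neuron axis differs per factor
    title(sprintf('factor %d (top neuron: %d)', f, order(1)));
end
```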
Happy to clean this up before merging - just let me know in review!
> By the way, what do you think about altering the visualize_neuron_ktensor visualization so that (for display purposes) the neuron factors are sorted by weight on a per-factor basis?
Hmm yeah I hadn't thought about that. Have to think a bit more about it, but it seems like a reasonable idea - we should at least give the user the option to do that. I have a couple other changes I'd like to make to visualize_neuron_ktensor - I'll open a separate PR in the next few days?
Ok, let's punt the visualize_neuron_ktensor changes to a separate PR.
Thanks for the detailed feedback @kimtonyhyun - what is your opinion of changing cp_fit so that its output is a matrix of structs of size n_replicates x n_rank? This way you can select, say, all the rank-5 models by:
models = cp_fit(data, ... )
models(:,5)
[models(:,5).error] % list of normalized error
I can add that to this PR or the next one if you like it.
The matrix organization of outputs looks good to me. If I run models = cp_fit(..., 'min_rank', 5, 'max_rank', 15), would the first 4 columns of models then be empty?
Sure, that sounds like a good convention.
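Under that convention, usage might look like the following (a hypothetical sketch; it assumes cp_fit leaves the columns below min_rank as empty placeholders, as discussed above, and that each struct carries the error and decomp fields):

```matlab
models = cp_fit(data, 'min_rank', 5, 'max_rank', 15);
% models is n_replicates x 15; columns 1-4 were never fit.
assert(isempty(models(1, 4).decomp));
errs5 = [models(:, 5).error];    % normalized errors of all rank-5 replicates
[~, best] = min(errs5);
best_rank5 = models(best, 5);    % best rank-5 model across replicates
```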
Roger that - I did some work on those fixes earlier, but not done yet.
Done?
I think so - could you give it a quick look over and test? Apologies if there are still some rough edges.
When I run fit_cpd now, I find a long and unexplained console dump of s and best_models, even though I terminated the fit_cpd command with a semicolon:
>> [models, best_models] = fit_cpd(Xn, 'min_rank', 1, 'max_rank', 15, 'num_starts', 10);
OPTIMIZATION RUN 1
rank: 1.2.3.4.5.6.7.8.9.10.11.12.13.14.15.
OPTIMIZATION RUN 2
rank: 1.2.3.4.5.6.7.8.9.10.11.12.13.14.15.
...
OPTIMIZATION RUN 10
rank: 1.2.3.4.5.6.7.8.9.10.11.12.13.14.15.
s =
4
best_models =
1x15 struct array with fields:
error
decomp
(...the s and best_models dump repeats 14 more times...)
I also find that cpd_scree_plot errors out:
>> cpd_scree_plot(models);
Undefined function or variable 'ln'.
Error in cpd_scree_plot (line 22)
uistack(ln, 'bottom')
Visualization also fails:
>> visualize_neuron_ktensor(best_models(15).decomp, trial_meta, 'start');
Improper index matrix reference.
Error in visualize_ktensor>format_axes (line 165)
set(Ax(1,f).Title,'String',ttl{f})
Error in visualize_ktensor (line 66)
format_axes(Ax, res.title, res.xlabel, res.XTick, res.XTickLabel);
Error in visualize_neuron_ktensor (line 31)
[Ax, G, BigAx] = visualize_ktensor(decomp, 'c', c, 'plots', plt, ...
Let me know when to re-review. Btw, I'm mostly following along with the instructions on the tensor doc.
Sorry about that - I should have known better than to skip testing it. Semicolons are now added and the scree plot is fixed.
I can't reproduce your visualize_neuron_ktensor bug though... Currently scratching my head on that.
> I can't reproduce your visualize_neuron_ktensor bug though... Currently scratching my head on that.
What version of Matlab are you running?
All right, the basic walkthrough works now:
>> Xn = normalize_tensor(X, trial_meta);
>> [models, best_models] = fit_cpd(Xn, 'min_rank', 1, 'max_rank', 15, 'num_starts', 10, 'method', 'cp_nnals');
OPTIMIZATION RUN 1
fitting models, rank: 1.2.3.4.5.6.7.8.9.10.11.12.13.14.15.
OPTIMIZATION RUN 2
fitting models, rank: 1.2.3.4.5.6.7.8.9.10.11.12.13.14.15.
...
OPTIMIZATION RUN 10
fitting models, rank: 1.2.3.4.5.6.7.8.9.10.11.12.13.14.15.
Scree plot works:
>> cpd_scree_plot(models);
Visualization works on Matlab 2015b (previously I was testing on 2013b):
>> visualize_neuron_ktensor(best_models(15).decomp, trial_meta, 'correct');
The PR looks good to me.
Here is how the normalized error varies for CPD models of rank up to 50:
This code was provided by my friend Casey - it is a faster method based on randomized least-squares for fitting CP decompositions. I haven't had the chance to test it thoroughly yet, but feel free to give it a go @kimtonyhyun.
It should operate the same as cp_als; just replace cp_als(data, n_components) with cprand(data, n_components).
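A quick smoke test of that drop-in swap might look like this (a sketch; it assumes the Tensor Toolbox and the cprand code are both on the path, and X and the rank are placeholders):

```matlab
% Compare standard ALS against the randomized variant on the same tensor.
R = 15;
T = tensor(X);
est_als  = cp_als(T, R);       % standard alternating least squares
est_rand = cprand(T, R);       % randomized least-squares, same call signature
err = @(est) norm(full(est) - T) / norm(T);
fprintf('cp_als err: %.4f, cprand err: %.4f\n', err(est_als), err(est_rand));
```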