Fit trial factors using existing cell/temporal factors

neurostatslab / tensortools

A very simple and barebones tensor decomposition library for CP decomposition a.k.a. PARAFAC a.k.a. TCA

MIT License


Fit trial factors using existing cell/temporal factors #20

Status: Closed (klmcguir closed this issue 5 years ago)

klmcguir commented 5 years ago

It would be great to be able to fit one dimension of a tensor decomposition by fixing two other dimensions of an existing TCA model. I'm not even sure if that still qualifies as tensor decomposition or just a regression problem, but it would be very useful as cross-validation for biological effects observed from TCA.

For example, you see a cool effect across time in a large dataset of neuron calcium imaging data. You want to know how perturbation experiments interleaved across time affect your TCA components, but you don't want to use those perturbation data points to train the TCA model. So you fit the model to all of the data (minus the perturbation trials) and then use that model to refit trial factors for your perturbation data.
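On the question of whether this is still a tensor decomposition: with the cell and temporal factors held fixed, fitting the trial factors of a 3-mode CP model is an ordinary linear least-squares problem, one regression per trial. The NumPy sketch below illustrates only that idea. It is not tensortools code, and the names `khatri_rao` and `fit_trial_factors` are made up for this example.

```python
import numpy as np


def khatri_rao(A, B):
    """Column-wise Kronecker product: row (i * J + j) holds A[i, :] * B[j, :]."""
    I, R = A.shape
    J, _ = B.shape
    return (A[:, None, :] * B[None, :, :]).reshape(I * J, R)


def fit_trial_factors(X, A, B):
    """Least-squares C such that X[i, j, k] ~ sum_r A[i, r] * B[j, r] * C[k, r].

    A (cells x rank) and B (time x rank) are treated as fixed regressors.
    """
    I, J, K = X.shape
    design = khatri_rao(A, B)        # shape (I * J, R)
    targets = X.reshape(I * J, K)    # column k is trial k, flattened
    C_T, *_ = np.linalg.lstsq(design, targets, rcond=None)
    return C_T.T                     # shape (K, R)


# Noiseless synthetic check: the trial factors are recovered exactly.
rng = np.random.default_rng(0)
I, J, K, R = 6, 5, 4, 3
A = rng.standard_normal((I, R))
B = rng.standard_normal((J, R))
C_true = rng.standard_normal((K, R))
X = np.einsum("ir,jr,kr->ijk", A, B, C_true)

C_hat = fit_trial_factors(X, A, B)
```

Because each trial is an independent regression on the same design matrix, held-out perturbation trials can be fit in this way without touching the factors learned from the training trials.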

ahwillia commented 5 years ago

Should be easy to incorporate this! Just to make sure, are you asking for something like the following?

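# Sketch of the requested behavior.
data = ...  # ndarray with shape, e.g., units x time x trials
first_model = tt.cp_als(data, rank=R)

# Append the held-out perturbation trials along the trial axis.
full_data = np.concatenate((data, perturb_trials), axis=-1)
# Refit with the cell and temporal factors (modes 0 and 1) held fixed.
second_model = tt.cp_als(full_data, rank=R, init=first_model, skip_modes=[0, 1])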

klmcguir commented 5 years ago

Yes! Exactly. If 'skip_modes' refers to the first two dimensions (i.e., cell factors and temporal factors), then this is perfect!

I mostly use the 'mncp_hals' algorithm so I guess my preference would be to start there!

As always, thanks so much for your help!

ahwillia commented 5 years ago

Done in https://github.com/ahwillia/tensortools/commit/215f1e8ec4c8f6395515657a8a7a1d3c08662d2b

Here is a minimal example. Note that you need to do a bit of work to define your initial model (Uext) explicitly. It should work for any fitting method:

import tensortools as tt
import numpy as np

# Make synthetic dataset.
I, J, K, R = 25, 25, 25, 4  # dimensions and rank
X = tt.randn_ktensor((I, J, K), rank=R).full()
X += np.random.randn(I, J, K)

# Fit CP tensor decomposition to first 20 trials.
U = tt.cp_als(X[:, :, :20], rank=R, verbose=True)

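# Extend and re-initialize the factors along the final (trial) mode.
# The copy keeps the fitted cell and temporal factors from U; only the
# trial factors are replaced, with one row per trial in the full dataset.
Uext = U.factors.copy()
Uext.factors[-1] = np.random.randn(K, R)
Uext.shape = (I, J, K)  # keep the stored shape consistent with the new trial factors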

# Fit model to the full dataset, only fitting the final set of factors.
V = tt.cp_als(X, rank=R, init=Uext, skip_modes=[0, 1], verbose=True)
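If the trial factors must stay nonnegative, as in the nonnegative fits mentioned earlier in the thread, the same fixed-factor idea can be sketched with HALS-style coordinate updates on the trial mode alone. This is a rough NumPy illustration under assumptions, not the library's implementation: `fit_nonneg_trial_factors` is a hypothetical helper, it does not handle missing data, and the iteration count is arbitrary.

```python
import numpy as np


def fit_nonneg_trial_factors(X, A, B, n_iter=200, seed=0):
    """Nonnegative trial factors C for fixed A (cells x rank), B (time x rank).

    Minimizes the squared error of X[i, j, k] ~ sum_r A[i, r] B[j, r] C[k, r]
    subject to C >= 0, updating one column of C at a time.
    """
    rng = np.random.default_rng(seed)
    K, R = X.shape[2], A.shape[1]
    # Gram matrix of the fixed regressors (Hadamard product of the two Grams).
    G = (A.T @ A) * (B.T @ B)
    # Projection of the data onto each fixed component, one row per trial.
    P = np.einsum("ijk,ir,jr->kr", X, A, B)
    C = rng.random((K, R))
    for _ in range(n_iter):
        for r in range(R):
            # Exact minimizer for column r, then clipped at zero.
            step = (P[:, r] - C @ G[:, r]) / G[r, r]
            C[:, r] = np.maximum(0.0, C[:, r] + step)
    return C


# Noiseless synthetic check with nonnegative ground-truth factors.
rng = np.random.default_rng(1)
I, J, K, R = 20, 15, 8, 3
A = rng.random((I, R))
B = rng.random((J, R))
C_true = rng.random((K, R))
X = np.einsum("ir,jr,kr->ijk", A, B, C_true)

C_hat = fit_nonneg_trial_factors(X, A, B)
```

Only C is updated in the loop, so A and B play the same role as the skipped modes in the example above.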