machenslab / dPCA

An implementation of demixed Principal Component Analysis (a supervised linear dimensionality reduction technique)
MIT License

Problems with array shape when running dPCA in Python #31

Closed franjmt closed 4 years ago

franjmt commented 4 years ago

Hi, I constructed the initial neural activity matrix (dF_dff) with shape (N, S, D, T, E):
N = number of neurons
S = number of stimuli
D = number of decisions
T = number of time points in each trial
E = maximum number of trials per condition

In my case dF_dff has shape (1307, 2, 2, 83, 43), and I take the average over the first dimension (neurons), obtaining dF_dff_average with shape (2, 2, 83, 43). To run dPCA, I do the following:

label = 'tsd'
join = [{'s': ['s', 'ts']}, {'d': ['d', 'td']}, {'t'}, {'sd': ['tsd']}]
dpca = dPCA.dPCA(labels=label, join=join, n_components=2, regularizer='auto')
dpca.protect = ['t']
Z = dpca.fit_transform(dF_dff_average, dF_dff)

But this doesn't work. I get an index error:

IndexError: index 43 is out of bounds for axis 3 with size 43

Maybe I am doing something wrong. Hopefully you can help me figure out what the problem is. Thank you.

dkobak commented 4 years ago

The average over neurons does not make any sense to me. Shouldn't it be the average over trials, i.e. the last dimension?

franjmt commented 4 years ago

Oh yes, I did the average over trials first, but I am still having the same issue. I think I got confused by the Python demo, because the trial is on axis 0 there, whereas the Matlab code has it on axis 4. Regardless of that error, if I use:

dF_dff_average = np.nanmean(dF_dff, axis=4)

dF_dff_average.shape
(1307, 2, 2, 83)

dF_dff.shape
(1307, 2, 2, 83, 43)

dkobak commented 4 years ago

Well, if the Python code expects the trials to be on axis 0 then you should reorder your dimensions to make it such.
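A minimal sketch of that reordering, assuming the layout described earlier in the thread (neurons, stimuli, decisions, time, trials). The array sizes and the names `trial_data` and `mean_data` are made up for illustration; only NumPy is used, and nothing here calls dPCA itself.

```python
import numpy as np

# Small stand-in for the array in this thread, laid out as
# (neurons, stimuli, decisions, time, trials). Sizes are illustrative only.
N, S, D, T, E = 5, 2, 2, 7, 4
rng = np.random.default_rng(0)
dF_dff = rng.normal(size=(N, S, D, T, E))

# Move the trial axis from the last position to the front:
# (N, S, D, T, E) -> (E, N, S, D, T)
trial_data = np.moveaxis(dF_dff, -1, 0)

# Average over trials, which are now on axis 0. nanmean ignores NaN entries,
# which matters if conditions with fewer trials are padded with NaN.
mean_data = np.nanmean(trial_data, axis=0)

print(trial_data.shape)  # (4, 5, 2, 2, 7)
print(mean_data.shape)   # (5, 2, 2, 7)
```

`np.moveaxis` returns a view with the axes permuted, so the relative order of the remaining axes (neurons, stimuli, decisions, time) is preserved.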

franjmt commented 4 years ago

Even if I move the trials to axis 0 and average over trials, it still doesn't run.

dkobak commented 4 years ago

Please post the new shapes of your arrays and the error message you get.

franjmt commented 4 years ago

dF_dff.shape
(43, 1307, 2, 2, 83)

dF_dff_average.shape
(1307, 2, 2, 83)

label = 'tsd'
join = [{'s': ['s', 'ts']}, {'d': ['d', 'td']}, {'t'}, {'sd': ['tsd']}]

dpca = dPCA.dPCA(labels=label, join=join, n_components=2, regularizer='auto')
dpca.protect = ['t']

Z = dpca.fit_transform(dF_dff_average, dF_dff)

Error: ValueError: operands could not be broadcast together with shapes (1307,2,83,2) (1307,2,2,1)

dkobak commented 4 years ago

Okay, the neuron dimension should come first (after trials but before time/stim/decision). Please look at the demo!

franjmt commented 4 years ago

In the demo, time is on axis 2 of the R data:

# number of neurons, time-points and stimuli
N,T,S = 100,250,6

R.shape
(100, 6, 250)

dkobak commented 4 years ago

Right, yes, the dimensions seem fine now...

If time is last, label = 'tsd' should probably be 'sdt'. I am not sure this can explain the error, though.
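One way to see the mismatch is to pair each label character with the axis it would describe. The sketch below assumes, as discussed in this thread, that the averaged array is laid out as (neurons, then one axis per label character) and the trial array as (trials, neurons, then the same axes). `describe_axes` is a hypothetical helper written for this example, not part of the dPCA package, and the neuron and trial counts are shrunk to keep the arrays small.

```python
import numpy as np

def describe_axes(mean_data, trial_data, labels):
    """Hypothetical helper: map each label character to its axis length.

    Assumes mean_data is (neurons, *label_axes) and
    trial_data is (trials, neurons, *label_axes).
    """
    if mean_data.ndim != len(labels) + 1:
        raise ValueError("expected one axis per label plus a leading neuron axis")
    if trial_data.shape[1:] != mean_data.shape:
        raise ValueError("trial_data should be mean_data with a leading trial axis")
    # Skip the neuron axis, then pair labels with the remaining axis lengths
    return dict(zip(labels, mean_data.shape[1:]))

# Same stimulus/decision/time sizes as in the thread, fewer neurons and trials
mean_data = np.zeros((10, 2, 2, 83))
trial_data = np.zeros((3, 10, 2, 2, 83))

print(describe_axes(mean_data, trial_data, "sdt"))  # {'s': 2, 'd': 2, 't': 83}
print(describe_axes(mean_data, trial_data, "tsd"))  # {'t': 2, 's': 2, 'd': 83}
```

With 'tsd', the time label lands on an axis of length 2 and the decision label on the axis of length 83, which is the kind of mislabeling that 'sdt' avoids.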

franjmt commented 4 years ago

It worked. Thank you very much!

dkobak commented 4 years ago

Great!