shaistamadad closed 3 years ago
Here `X_init` is a dimensionality reduction to use as initialization for the GPLVM training. In the iPSC dataset we used the PCA dimensions, so you could try rerunning PCA with the same number of factors as you give as input to the model.
sc.pp.pca(adata, n_comps=d)
adata.obsm['X_init'] = adata.obsm['X_pca'].copy()
It would also be very interesting to assess how important this initialization is in these datasets: if you train with `X_init=None`, do you get the same/similar latent factors?
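One rough way to answer the "same/similar latent factors" question is to correlate the latent coordinates from the two runs. The sketch below is only an illustration: `best_match_correlation` and the two toy arrays are hypothetical names, and how you pull the latent matrix out of a trained model depends on the GPLVM implementation. Note also that a periodic (cell cycle) latent is an angle, so a plain correlation may understate agreement for that factor.

```python
import numpy as np

def best_match_correlation(X_a, X_b):
    """For each column of X_a, return its largest absolute Pearson
    correlation with any column of X_b (rows are cells in both)."""
    X_a = np.asarray(X_a, dtype=float)
    X_b = np.asarray(X_b, dtype=float)
    if X_a.shape[0] != X_b.shape[0]:
        raise ValueError("both latent matrices need one row per cell")
    # Standardize the columns so the cross-product is a correlation matrix.
    Z_a = (X_a - X_a.mean(axis=0)) / X_a.std(axis=0)
    Z_b = (X_b - X_b.mean(axis=0)) / X_b.std(axis=0)
    corr = Z_a.T @ Z_b / X_a.shape[0]
    # Absolute value: latent factors are only defined up to sign.
    return np.abs(corr).max(axis=1)

# Toy stand-ins for the latents of two training runs (hypothetical data).
rng = np.random.default_rng(0)
latent_with_init = rng.normal(size=(200, 3))
# Second run: same factors, permuted and sign-flipped, plus a little noise.
latent_without_init = (
    -latent_with_init[:, [2, 0, 1]] + 0.01 * rng.normal(size=(200, 3))
)
scores = best_match_correlation(latent_with_init, latent_without_init)
```

Values close to 1 for every factor would suggest the initialization matters little; low values for some factors would point to the runs landing in different solutions.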
`ValueError: k must be between 1 and min(A.shape), k=4999`
Running PCA with `n_comps=d` gives this error for all datasets. I think that's because `d` is the number of columns in the `Y` object, which is larger than the number of rows: `(n, d), q = Y.shape, 6`. I tried using `n-1`, but that takes a really long time. I also tried smaller values such as 50 and get the same error as before: `Sizes of tensors must match except in dimension 1. Expected size 7 but got size 50 for tensor number 1 in the list.`
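The first error is a plain shape constraint: the number of components requested has to fit inside the smaller dimension of the matrix. A minimal sketch of that check follows; `check_n_comps` is a hypothetical helper, the numbers are made up, and the strict upper bound is an assumption based on the error message and on `n-1` being accepted.

```python
def check_n_comps(n_obs, n_vars, n_comps):
    """Return n_comps if it fits the data matrix, else raise ValueError.

    Assumption: the solver needs 1 <= n_comps < min(n_obs, n_vars).
    """
    limit = min(n_obs, n_vars)
    if not 1 <= n_comps < limit:
        raise ValueError(
            f"n_comps={n_comps} must be at least 1 and smaller than "
            f"min(n_obs, n_vars)={limit}"
        )
    return n_comps

# Hypothetical shapes: fewer cells (rows) than genes (columns).
n, d = 3000, 4999
ok = check_n_comps(n, d, 7)   # fits
# check_n_comps(n, d, d)      # would raise: d is larger than n
```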
Also, what's the difference between using `sc.pp.pca()` and `sc.tl.pca()`?
Aah my bad, sorry: the number of latent dimensions for the model is `q`, not `d`! In the iPSC dataset `X_init` has 22188 rows (= number of cells, `adata.n_obs`) and 7 columns. I suspect you need to give as input the number of dimensions specified in the model (i.e. `q` in `model = GPLVM(n, d, q, n_inducing=64, period_scale=np.pi, X_init=adata.obsm["X_init"])`) + 1 for the periodic kernel/cell cycle latent variable. So use `sc.pp.pca(adata, n_comps=q+1)`.
This should work alright:
import scvelo as scv
import scanpy as sc
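q = 6  # number of latent dimensions passed to the model
adata = scv.datasets.pancreas()
sc.pp.pca(adata, n_comps=q+1)  # +1 for the periodic (cell cycle) latent variable
adata.obsm['X_init'] = adata.obsm['X_pca'].copy()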
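To make the expected shape explicit without needing scanpy, here is a rough NumPy-only sketch of the same idea. `pca_init` is a hypothetical helper that stands in for the `sc.pp.pca` call (plain SVD on centered data, so the values will not match scanpy's exactly), and `Y` is random toy data, not a real dataset.

```python
import numpy as np

def pca_init(Y, q):
    """Return the first q + 1 principal-component scores of Y (cells x genes)."""
    Y = np.asarray(Y, dtype=float)
    n_comps = q + 1  # q latent dimensions + 1 for the periodic latent variable
    if n_comps > min(Y.shape):
        raise ValueError("q + 1 cannot exceed the smaller dimension of Y")
    Y_centered = Y - Y.mean(axis=0)
    # Thin SVD: the scores are the left singular vectors scaled by the singular values.
    U, S, _ = np.linalg.svd(Y_centered, full_matrices=False)
    return U[:, :n_comps] * S[:n_comps]

rng = np.random.default_rng(0)
Y = rng.normal(size=(100, 500))  # toy expression matrix, more genes than cells
(n, d), q = Y.shape, 6
X_init = pca_init(Y, q)
# The initialization needs one row per cell and q + 1 columns.
assert X_init.shape == (n, q + 1)
```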
As far as I know there is no difference between `sc.pp.pca` and `sc.tl.pca`.
worked! actually my bad too, I was running the PCA after running the model!
X_init=adata.obsm["X_init"])
The pancreas, gastrulation and most other scvelo datasets don't have the `X_init` entry in the `obsm` slot. They contain `obsm: 'X_pca', 'X_umap', 'X_tsne'` etc.
I have been trying to use `X_umap` instead of `X_init`, but get errors like this: `Sizes of tensors must match except in dimension 2. Got 50 and 7 (The offending index is 0)`
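The "Got 50 and 7" mismatch suggests the array passed as the initialization had 50 columns while the model expected 7. A small guard like the one below catches that before training. It is only a sketch: `init_from_obsm` is a hypothetical helper, a plain dict of arrays stands in for `adata.obsm`, and reusing the leading columns of an existing `X_pca` is an assumption that they are an acceptable substitute for rerunning PCA with `q + 1` components.

```python
import numpy as np

def init_from_obsm(obsm, q, key="X_pca"):
    """Take the first q + 1 columns of obsm[key] to use as the initialization."""
    X = np.asarray(obsm[key])
    needed = q + 1  # q latent dimensions + 1 for the periodic latent variable
    if X.shape[1] < needed:
        raise ValueError(
            f"{key} has {X.shape[1]} columns, but the model expects {needed}"
        )
    return X[:, :needed].copy()

# Toy stand-in for adata.obsm (hypothetical data with typical column counts).
rng = np.random.default_rng(0)
obsm = {
    "X_pca": rng.normal(size=(100, 50)),
    "X_umap": rng.normal(size=(100, 2)),
}
X_init = init_from_obsm(obsm, q=6)       # shape (100, 7)
# init_from_obsm(obsm, q=6, key="X_umap")  # would raise: only 2 columns
```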