Open thecatsmilk opened 1 week ago
You need to convert numpy array to scipy sparse matrix. You can do that by adata.X = csr_matrix(adata.X[:])
.
@kaizhang
Thanks for the reply! I did discover that you can extract the matrix how you show above (admittedly by accident). I just tried this:
# convert to scipy sparse matrix
atac.X = sp.csr_matrix(atac.X[:])
# perform joint embedding
embedding = snap.tl.multi_spectral(
adatas=[rna, atac],
features=None
)[1]
Unfortunately I am still met with the same error:
2024-10-10 15:36:24 - INFO - Compute normalized views...
thread '<unnamed>' panicked at src/embedding.rs:411:13:
called `Result::unwrap()` on an `Err` value: Cannot convert Array(F64) to f64 CsrMatrix
---------------------------------------------------------------------------
PanicException Traceback (most recent call last)
Cell In[32], line 1
----> 1 embedding = snap.tl.multi_spectral(
2 adatas=[rna, atac],
3 features=None
4 )[1]
File /home/.conda/envs/scSnapATAC2_v2024.08/lib/python3.12/site-packages/snapatac2/tools/_embedding.py:533, in multi_spectral(adatas, n_comps, features, weights, random_state, weighted_by_sd)
530 if weights is None:
531 weights = [1.0 for _ in adatas]
--> 533 evals, evecs = internal.multi_spectral_embedding(adatas, features, weights, n_comps, random_state)
535 if weighted_by_sd:
536 idx = [i for i in range(evals.shape[0]) if evals[i] > 0]
PanicException: called `Result::unwrap()` on an `Err` value: Cannot convert Array(F64) to f64 CsrMatrix
What I did next was basically build a new AnnData object after extracting all the elements, like so:
# rebuild AnnData
obs_df = pd.DataFrame({
'sample': atac.obs['sample'],
'leiden_0.05': atac.obs['leiden_0.05'],
'leiden_0.1': atac.obs['leiden_0.1'],
'leiden_0.2': atac.obs['leiden_0.2'],
'leiden_0.35': atac.obs['leiden_0.35'],
'leiden_0.4': atac.obs['leiden_0.4'],
'leiden_0.6': atac.obs['leiden_0.6'],
'leiden_0.8': atac.obs['leiden_0.8'],
'leiden_1.0': atac.obs['leiden_1.0'],
'leiden_1.2': atac.obs['leiden_1.2'],
'leiden_1.4': atac.obs['leiden_1.4'],
'ATAC': atac.obs['ATAC'],
'cellType': atac.obs['cellType'],
'barcodes_clean': atac.obs['barcodes_clean'],
'barcodes_orig': atac.obs['barcodes_orig'],
'RNA': atac.obs['RNA']
})
obs_df.index = atac.obs_names
var_df = pd.DataFrame({
'count': atac.var['count'],
'selected': atac.var['selected']
})
var_df.index = atac.var_names
# build AnnData
new_atac = ad.AnnData(X=X_f64, obs=obs_df, var=var_df)
new_atac
Where X_f64
was created like this:
from scipy import sparse as sp
X = atac.X[:]
X = sp.csr_matrix(X)
X_f64 = X.astype('float64')
With new_atac
I tried to co-embedding again:
embedding = snap.tl.multi_spectral(
adatas = [rna, new_atac],
features = rna.var['selected']
)[1]
It worked! I move to re-running UMAP:
new_atac.obsm['X_joint'] = embedding
snap.tl.umap(new_atac, use_rep='X_joint')
Unfortunately I am met with this error:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
Cell In[136], line 2
1 new_atac.obsm['X_joint'] = embedding
----> 2 snap.tl.umap(new_atac, use_rep='X_joint')
File /home/.conda/envs/scSnapATAC2_v2024.08/lib/python3.12/site-packages/snapatac2/tools/_embedding.py:113, in umap(adata, n_comps, use_dims, use_rep, key_added, random_state, inplace, **kwargs)
108 if use_dims is not None:
109 data = data[:, :use_dims] if isinstance(use_dims, int) else data[:, use_dims]
111 umap = UMAP(random_state=random_state,
112 n_components=n_comps,
--> 113 **kwargs).fit_transform(data)
114 if inplace:
115 adata.obsm["X_" + key_added] = umap
File /home/.conda/envs/scSnapATAC2_v2024.08/lib/python3.12/site-packages/umap/umap_.py:2891, in UMAP.fit_transform(self, X, y, force_all_finite)
2855 def fit_transform(self, X, y=None, force_all_finite=True):
2856 """Fit X into an embedded space and return that transformed
2857 output.
2858
(...)
2889 Local radii of data points in the embedding (log-transformed).
2890 """
-> 2891 self.fit(X, y, force_all_finite)
2892 if self.transform_mode == "embedding":
2893 if self.output_dens:
File /home/.conda/envs/scSnapATAC2_v2024.08/lib/python3.12/site-packages/umap/umap_.py:2358, in UMAP.fit(self, X, y, force_all_finite)
2356 X = check_array(X, dtype=np.uint8, order="C", force_all_finite=force_all_finite)
2357 else:
-> 2358 X = check_array(X, dtype=np.float32, accept_sparse="csr", order="C", force_all_finite=force_all_finite)
2359 self._raw_data = X
2361 # Handle all the optional arguments, setting default
File /home/.conda/envs/scSnapATAC2_v2024.08/lib/python3.12/site-packages/sklearn/utils/validation.py:1096, in check_array(array, accept_sparse, accept_large_sparse, dtype, order, copy, force_writeable, force_all_finite, ensure_2d, allow_nd, ensure_min_samples, ensure_min_features, estimator, input_name)
1094 n_features = array.shape[1]
1095 if n_features < ensure_min_features:
-> 1096 raise ValueError(
1097 "Found array with %d feature(s) (shape=%s) while"
1098 " a minimum of %d is required%s."
1099 % (n_features, array.shape, ensure_min_features, context)
1100 )
1102 if force_writeable:
1103 # By default, array.copy() creates a C-ordered copy. We set order=K to
1104 # preserve the order of the array.
1105 copy_params = {"order": "K"} if not sp.issparse(array) else {}
ValueError: Found array with 0 feature(s) (shape=(6333, 0)) while a minimum of 1 is required.
Then—under closer inspection—I realized embedding
has no features:
embedding
array([], shape=(6333, 0), dtype=float64)
Any ideas on how to proceed from here?
I think it is the 'rna' matrix that causes the problem.
I am following this tutorial to jointly embed my single-nucelus multiome capture from 10X. I have already filtered both RNA and ATAC objects to the same cells, and made sure that the barcodes are identical, then I process with the embedding:
Once running the above cell I am met with this error:
It seems like the issue is that
atac.X
is aPyArray Element
. Keep in mindatac
is an AnnDataSet object that I processed with SnapATAC2 and then converted to an AnnData object in order to co-embed these ATAC data with the RNA data from the same capture.I tried many ways to convert the PyArray to a Numpy array in order to have it be compatible, but I was unsuccessful. It seems as if I could do this if I went into Rust to modulate this PyArray, but I have no experience in Rust. Any help with this would be greatly appreciate it. Thank you!