Open archana433 opened 4 months ago
I've used it and it worked well! You have to do some manual preprocessing because of duplicated features.
I've discussed it here: https://github.com/broadinstitute/CellBender/issues/234
My most recent code for summing the duplicated features without using much memory is:
import sys

import anndata
import numpy as np
import scanpy as sc
from scipy.sparse import csr_matrix

# Directory holding the Cell Ranger outputs, passed as the first argument.
path = sys.argv[1]
adata = sc.read_10x_h5(path + "/sample_raw_feature_bc_matrix.h5")

# One var row per feature name that occurs more than once.
dup_var = adata[:, adata.var_names.duplicated()].var
var = dup_var[~dup_var.index.duplicated(keep='first')]

# Sum the columns that share each duplicated name, one name at a time.
adata_x = csr_matrix(np.concatenate([adata[:, n].X.sum(axis=1) for n in var.index], axis=1))
double_probes = anndata.AnnData(X=adata_x, obs=adata.obs, var=var)

# Replace the duplicated columns with their summed versions.
final = anndata.concat([adata[:, ~(adata.var.index.isin(double_probes.var.index))], double_probes], axis=1)
del adata

# Keep only the features present in the filtered matrix, then save.
adata_filtered = sc.read_10x_h5(path + "/sample_filtered_feature_bc_matrix.h5")
adata_filtered_feature = final[:, final.var.gene_ids.isin(adata_filtered.var.gene_ids)].copy()
adata_filtered_feature.write(path + "/my_feature_filtered_file.h5ad")
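If the per-feature loop above is still too slow or builds too large a dense intermediate, one possible alternative is to sum the duplicated columns with a single sparse matrix product. The sketch below is only an illustration under my own assumptions: `sum_duplicate_columns` is a hypothetical helper name, not part of scanpy, anndata, or CellBender, and it works on a plain matrix plus a list of column names rather than on an AnnData object.

```python
import numpy as np
from scipy.sparse import csr_matrix


def sum_duplicate_columns(X, names):
    """Sum columns of X that share a name (hypothetical helper, not a library API).

    Returns the summed sparse matrix and the unique names in first-seen order.
    """
    names = np.asarray(names)
    # uniq is sorted; first_idx gives each unique name's first position in `names`,
    # and inverse maps every input column to its index in uniq.
    uniq, first_idx, inverse = np.unique(names, return_index=True, return_inverse=True)
    # Re-rank the unique names by first appearance so output order follows the input.
    order = np.argsort(first_idx)
    rank = np.empty_like(order)
    rank[order] = np.arange(order.size)
    col_to_group = rank[inverse]
    # Sparse indicator matrix: row j has a single 1 in the output column
    # that input column j is summed into.
    n_cols, n_groups = names.size, uniq.size
    indicator = csr_matrix(
        (np.ones(n_cols, dtype=X.dtype), (np.arange(n_cols), col_to_group)),
        shape=(n_cols, n_groups),
    )
    return csr_matrix(X) @ indicator, uniq[order]


# Toy data: two cells, four features, with "A" appearing twice.
X = csr_matrix(np.array([[1, 2, 3, 4], [0, 5, 0, 6]], dtype=np.float32))
summed, unique_names = sum_duplicate_columns(X, ["A", "B", "A", "C"])
```

With an AnnData object you would presumably pass `adata.X` and `adata.var_names`, then rebuild `var` yourself from the first row of each name. I have not tested this on real Flex output.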
Hi, I just want to ask: can we use CellBender on Single Cell Gene Expression Flex (Fixed RNA Profiling, FRP) samples to remove background noise / ambient RNA / empty droplets, given that this is probe-based sequencing?
Thank you