Closed RM-SCB closed 4 years ago
Hi and welcome,
that looks alright so far. In fact, quite profound for a python beginner. You can run the counting pipeline on all data and then subset to the cells of interest. Precomputed clusters and embeddings can be passed as follows.
adata.obs['clusters'] = array_of_clusters
adata.obsm['X_umap'] = np.stack([UMAP1, UMAP2]).T
scvelo does not do any cell filtering per default. Only filtering of genes that pass a minimum number of counts and are highly variable.
Let me know if other questions arise.
Thanks for your prompt response! I think I have managed to make some progress and get started with the actual scvelo analysis. So far here is what I have done:
velocyto
(>7000 cells)as.loom
by seurat (filtered object of ~2000 cells which contains umap, pca, cluster infos, metadata...)sceMerge = scv.utils.merge(adata, sce_data)
Now, one question that arise is whether I should be performing the following pre-processing steps on the merged object:
scv.pp.filter_and_normalize(adata, min_shared_counts=30, n_top_genes=2000)
scv.pp.moments(adata, n_pcs=30, n_neighbors=30)
Or should I go straight to this one for example?
scv.tl.velocity(sceMerge, mode='stochastic')
Many thanks for your help!
Hi @Mevelo , I somehow figure out the problem you mentioned and I have successfully merged two loom files using the method you proposed on this post. I went straight to running scv.tl.velocity(sceMerge, mode='stochastic') and it turns out that the program will normalize count data automatically. So my suggestion here is that after merging two loom files, you need to run scv.pp.filter_and_normalize(adata, min_shared_counts=30, n_top_genes=2000) but the scv.pp.moments(adata, n_pcs=30, n_neighbors=30) is optional. Later you can use the precomputed embedding coordinates to visualize data
Hi @Mevelo, as @ziyangliu93 said you'd definitely need to normalize your counts. These preprocessing steps are always needed
scv.pp.filter_and_normalize(adata, min_shared_counts=30, n_top_genes=2000)
scv.pp.moments(adata, n_pcs=30, n_neighbors=30)
In fact, if not done by the user already, norm. and moments are computed automatically before velocity estimation. So even if the user forgets about it, it will be processed correctly.
Just as a side note: umap, clustering etc., all of these can also be computed directly within scvelo, e.g. scv.tl.umap
, scv.tl.louvain
.
Fantastic, thank you both for your replies!
Hi Mevelo and Volker
When you state that you "Import loom file generated using as.loom by seurat (filtered object of ~2000 cells which contains umap, pca, cluster infos, metadata...)", did you import the seurat object a rds? or you converted the seurat object loom first. I am getting an error whenever I attempt to import the Seurat rds object I generated. Any advice will be deeply appreciated. thank you
scv.read
understands the following formats:
h5ad
, loom
, xlsx
, csv
, tsv
, tab
, data
, txt
, mtx
, mtx.gz
, soft.gz
.
rds
is R-specific and needs to be converted to one of these, e.g. loom.
Also see here if you run into Seurat/Loom issues.
Hello,
Apologies if my questions sound trivial but I am completely new to python, and I am a biologist before being an occasional R user (so far... but hopefully will be able to play with python soon as well!). I would be interested in running
scVelo
on my data but I am not quite sure that I am doing the right things, and I am now feeling a little bit stuck.velocyto
on my output from10x
(all cells classified as such by cellranger), which has successfully generated aloom
file.Seurat
, and I now have a filtered object, with only the group of cells I am interested in, which also contains clustering information, PCA and UMAP coordinates.My first question is: is this the correct approach, or should
velocyto
be run on filtered data based on the cell barcodes that I am interested in only (ie filtered cells obtained at the end of my seurat analysis), or perhaps it doesn't matter and this will be filtered withinscVelo
?My second question is: how should I import the dimension reductions (PCA and UMAP), clusters information, and additional metadata (like sample name), into my scVelo object (
adata
)?Any help would be much appreciated!