theislab / scvelo

RNA Velocity generalized through dynamical modeling
https://scvelo.org
BSD 3-Clause "New" or "Revised" License

How to subset the scVelo object based on barcodes or genes? #215

Closed KoichiHashikawa closed 4 years ago

KoichiHashikawa commented 4 years ago

Hello Theis lab,

I am new to scVelo. I have just generated loom files that can be read successfully with scVelo.

We would like to subset the objects based on either cellular barcodes or genes. We have already done an analysis in Seurat and know which cells we would like to study with RNA velocity. Could you guide us on how to subset the object?

In addition, we already have metadata, such as the cluster type of each cell, and are wondering how to add that metadata to the object.

Thanks so much in advance. Koichi

VolkerBergen commented 4 years ago

Hi and welcome,

You can subset the object via adata = adata[list_of_barcodes].copy() or adata = adata[adata.obs['clusters'].isin(['cluster1', 'cluster2'])].copy().

The cluster annotations can be read with clusters = scv.load('filename') and then stored under adata.obs['clusters'].

Let me know if you have any questions.

KoichiHashikawa commented 4 years ago

Thanks so much for the advice! I really appreciate it.

ruohuchengxhe commented 3 years ago

Hi @VolkerBergen, thanks for developing such a nice tool! May I ask whether subsetting the way described above gives the same results as scv.utils.merge(), as mentioned in #205 and #161?

This might be a naive question. I have a dataset on which the calculation was already run. I then generated a loom file for a subset of cells using as.loom() from Seurat, following #161, read it in as adata_sub, and merged the two to get a smaller dataset. I ran the calculation again on adata_sub, but if I keep the original UMAP coordinates for the subsetted cells, it shows exactly the same plot as the big dataset. (It changes when I change the UMAP, of course.)

So my other question is: will calculating on the small dataset give a different result from simply zooming in on the whole dataset? I was expecting a new trajectory flow since I recalculated, and I assume each cell's neighbors are different now that some cells are no longer present in the subset.

Thanks in advance!

denvercal1234GitHub commented 1 year ago

@ruohuchengxhe, did you figure out whether doing what VolkerBergen suggested is the same as using utils.merge?