Closed GrigoriiNos closed 3 years ago
@GrigoriiNos, are you running this as the very first step of your pipeline, i.e. on the raw data immediately after reading the AnnData object?
@WeilerP Thanks for the fast response I'm running this right after AnnData.concatenatee all loom files into one AnnData object
Per batch ratios are the same
@GrigoriiNos could you please provide a code snippet leading to the plots? I presume it is something along the lines of
import scanpy as sc
import scvelo as scv
adata_1 = sc.read(PATH_TO_DATA)
# Some concatenation
scv.pl.proportions(adata)
And these counts are raw, i.e. the output of e.g. velocyto s.t. your AnnData objects contain the layers 'unspliced'
, 'spliced'
, 'ambiguous'
?
There is a typo here s.t. the ambiguous layer is not picked up should it exist. If it exists and is called "ambiguous"
, could you please also provide the plots for scv.pl.proportions(adata=adata, layers=['unspliced', 'spliced', 'ambiguous'])
?
I've done basic scRNA-seq analysis in Seurat and transferred metadata to AnnData manually
### some metadata from Seurat analysis
meta = pd.read_csv('metadata.tsv', sep='\t')
### read looms
ldata_batch1 = scv.read('batch1_loom', cache=False)
ldata_batch1.var_names_make_unique()
ldata_batch2 = scv.read('batch2_loom', cache=False)
ldata_batch2.var_names_make_unique()
### (the same for all other batches)
### replace prefix with batch names in barcodes
cells_batch1 = pd.Series(ldata_batch1.obs.index).str.replace('possorted_genome_bam_.*\\:', 'batch1_')
ldata_batch1.obs.index = cells_batch1
ldata_batch1=ldata_batch1[ldata_batch1.obs.index.isin(meta.index)]
cells_batch2 = pd.Series(ldata_batch2.obs.index).str.replace('possorted_genome_bam_.*\\:', 'batch2_')
ldata_batch2.obs.index = cells_batch2
ldata_batch2=ldata_batch1[ldata_batch2.obs.index.isin(meta.index)]
### (the same for all other batches)
ldata = ldata_batch1.concatenate(ldata_batch2, ....# other batches
)
#### and now plot it
scv.pl.proportions(ldata)
################
###This is what I get with
scv.pl.proportions(adata=ldata, layers=['unspliced', 'spliced', 'ambiguous'])
I've done basic scRNA-seq analysis in Seurat and transferred metadata to AnnData manually
@GrigoriiNos, just to clarify: does this mean the counts are not raw (e.g. already normalized and filtered) or that you simply transferred e.g. cluster labels to the AnnData object? Also, did you verify manually (i.e. w/o using scVelo) that the proportions are as you'd expect?
@WeilerP I didn't even transfer count matrix, I only assigned my Seurat @meta.data to anndata.obs and umap coordinates obsm['X_umap'], I was only going to use velocity in python
Yes all the cells are subset according to what I filtered analysing the data in R
How can I check this proportions manually?
Yes all the cells are subset according to what I filtered analysing the data in R
If you already filtered cells/genes in R, the proportions will/can change, right?
How can I check this proportions manually?
To check proportions manually, you'll need to calculate them yourself in Python or R.
I don't filter genes, I don't think I need to
I'd be shocked if filtering out bad QC cells would change this proportion that drastically,
Here are raw proportions, without any filtering
@GrigoriiNos, you need to check the proportions yourself to verify that this is indeed a problem caused by scVelo (i.e. we are calculating them wrong) and not already present in your data. I cannot possible infer the problem by simply looking at plots.
@GrigoriiNos, I am closing this issue for now. Feel free to reopen once you checked that proportions are as expected in your AnnData object.
Dear scVelo developers,
First off, thanks a lot for your amazing work and this cool software!
I encountered an issue while working with scVelo. I'm trying to do velocity analysis for CD4 t cells. In all my samples, according to web_summaries from cell ranger, an average percent of exonic reads is around 70% and intronic is 8%, which would suggest that I should have way more spliced transcripts than unspliced,
whereas when running scv.pl.proportions() function, I get these statistics.
Could you please tell me what could possibly go wrong?
Thank you!
Grigorii