wguo-research / scCancer

A package for automated processing of single cell RNA-seq data in cancer

New feature request: multimodal data? #7

Closed lagzxadr closed 4 years ago

lagzxadr commented 4 years ago

Dear Wenbo,

Thanks for sharing this wonderful work. It is really convenient to use for scRNA-seq data. Sometimes, however, my data is multimodal, containing both gene expression and antibody capture results. In that case, I get this error:

Error in Matrix::colSums(expr.data > 0) : (list) object cannot be coerced to type 'double'

from

stat.results <- runScStatistics(
    dataPath = dataPath,
    savePath = savePath,
    sampleName = sampleName,
    authorName = authorName,
    genReport = TRUE
)

Is there an argument I can use to specify that only the gene expression data should be used when the data is loaded into the Seurat object? Or would it be possible to change your code a bit so that it accepts both expr.data and expr.data[["Gene Expression"]]?

Thanks and happy Chinese New Year! Xiaojing

wguo-research commented 4 years ago

Thanks for your attention and suggestion! Currently, scCancer is not compatible with multimodal data. I am willing to update the code to support it, but I am not familiar with the data structure of multimodal data. Could you please describe how the data is stored in more detail? Even better, could you give me an example dataset? Just showing the format is enough; it does not need to be real data. Thank you! Happy Chinese New Year!

lagzxadr commented 4 years ago

I guess we can use the 10x Genomics example data to show the difference. pbmc.pro (https://support.10xgenomics.com/single-cell-gene-expression/datasets/3.0.2/5k_pbmc_protein_v3) is a bimodal dataset with both cell surface protein and RNA. pbmc.v3 (https://support.10xgenomics.com/single-cell-gene-expression/datasets/3.0.2/5k_pbmc_v3) is scRNA-seq only. Their Cell Ranger output files share the same format (barcodes.tsv.gz, features.tsv.gz and matrix.mtx.gz), but pbmc.pro contains two layers of data in its matrix: RNA and protein.

Thus, the code I use to create the Seurat object for pbmc.pro (bimodal) is a bit different from the code for pbmc.v3.


library(Seurat)
library(here)

# For multimodal 10x data, Read10X reports that the data contains more than one
# type and returns a list containing one matrix per type.
pbmc.pro <- Read10X(here("/5k_protein/filtered_feature_bc_matrix"))

# Use only the first list element (the RNA matrix) as counts.
pbmc.pro <- CreateSeuratObject(counts = pbmc.pro[[1]], project = "pbmc_protein.multimodal", min.cells = 3, min.features = 200)

pbmc.pro

I need pbmc.pro[[1]] to select the RNA matrix only.

Here is the code for pbmc.v3 (the single-modal data we usually see).


pbmc.v3 <- Read10X(here("/5k_v3/filtered_feature_bc_matrix"))

pbmc.v3 <- CreateSeuratObject(counts = pbmc.v3, project = "pbmc_v3.RNAonly", min.cells = 3, min.features = 200)

pbmc.v3
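The difference between the two cases could be handled in one place, roughly as in the sketch below. This is only an illustration, not the change that was made in scCancer: getRnaCounts is a hypothetical helper name, the list names "Gene Expression" and "Antibody Capture" are assumed to match what Read10X returns for 10x feature types, and the small matrices are toy stand-ins for real Read10X output.

```r
library(Matrix)

# Hypothetical helper (not part of scCancer or Seurat): return the gene
# expression matrix whether the input is a single matrix (RNA only) or a
# named list of matrices (multimodal).
getRnaCounts <- function(expr.data, modality = "Gene Expression") {
  if (is.list(expr.data)) {
    if (!modality %in% names(expr.data)) {
      stop("Modality '", modality, "' not found; available: ",
           paste(names(expr.data), collapse = ", "))
    }
    expr.data <- expr.data[[modality]]
  }
  expr.data
}

# Toy stand-ins for Read10X output (assumed shapes, not real data).
cells <- c("cell1", "cell2", "cell3")
rna <- Matrix(c(1, 0, 2, 0, 0, 3), nrow = 2, sparse = TRUE,
              dimnames = list(c("GeneA", "GeneB"), cells))
adt <- Matrix(c(5, 6, 7), nrow = 1, sparse = TRUE,
              dimnames = list("CD3", cells))
multi <- list("Gene Expression" = rna, "Antibody Capture" = adt)

# The same downstream call now works for both input forms.
nGenesSingle <- Matrix::colSums(getRnaCounts(rna) > 0)
nGenesMulti <- Matrix::colSums(getRnaCounts(multi) > 0)
```

Selecting by name rather than by position ([[1]]) avoids depending on the order in which the feature types appear in the list.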

I hope I have made it clear.

wguo-research commented 4 years ago

I have updated the code. You can reinstall the package and try it on your data.

lagzxadr commented 4 years ago

Thank you very much! It works perfectly for my data now.