Single-Cell-Genomics-Group-CNAG-CRG / Tumor-Immune-Cell-Atlas

Code repository for the Tumor Immune Cell Atlas (TICA) project

counts are not integers. #10

bjstewart1 closed this issue 3 years ago

bjstewart1 commented 3 years ago

The counts in adata.raw.X are not all integers:

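library(reticulate)
sc <- import("scanpy")
adata <- sc$read_h5ad("data/TICAtlas.h5ad")
# Work on the raw matrix directly (rows are cells) rather than overwriting adata$X
counts <- adata$raw$X
rs <- Matrix::rowSums(counts)
# Whole-number test that does not rely on integer coercion
all(rs %% 1 == 0)  # returns FALSE

aurelieGabriel commented 3 years ago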

Hello, I was also wondering about the processing of the data available on Zenodo, since the counts slot in the Seurat objects does not contain only integers. Would it be possible to get more information on the processing steps applied to the raw counts? Thank you for your help and for the great resources made available! Aurélie

PaulaNietoG commented 3 years ago

Hello! This is because for one dataset (breast) the "raw" data was provided as TPMs rather than raw counts, so those values are not true counts, but this is the best we could get. @aurelieGabriel, regarding the processing of the raw counts: all we did was filter out non-immune cells (although we did more filtering after integration); the rest is described in more detail in the integration folder of this repository. Hope this was helpful!
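
For anyone who wants to see which sources are affected, here is a minimal sketch in base R. It runs on a toy matrix rather than the atlas itself, and the names `counts`, `source_label` and `non_integer_fraction` are illustrative assumptions; the real metadata column holding the source may be named differently. It reports, per source, the fraction of non-zero entries that are not whole numbers.

```r
# Toy stand-in for the atlas matrix: rows are cells, columns are genes.
# In practice this would be the matrix taken from adata$raw$X.
counts <- rbind(
  c(3, 0, 12, 1),          # integer counts
  c(0, 7, 2, 0),           # integer counts
  c(10.52, 0, 3.1, 88.4),  # TPM-like values
  c(0, 250.75, 0, 4.25)    # TPM-like values
)

# Hypothetical per-cell source labels (assumption, not the atlas metadata).
source_label <- c("colorectal", "colorectal", "breast", "breast")

# Fraction of non-zero entries per source that are not whole numbers.
non_integer_fraction <- function(mat, labels) {
  sapply(split(seq_len(nrow(mat)), labels), function(rows) {
    vals <- as.numeric(as.matrix(mat[rows, , drop = FALSE]))
    vals <- vals[vals != 0]
    if (length(vals) == 0) return(NA_real_)
    mean(vals %% 1 != 0)
  })
}

frac <- non_integer_fraction(counts, source_label)
print(frac)
```

A fraction of 0 suggests true counts, while a fraction close to 1 points to normalized values such as TPMs. The check is exact, so values that are integers only up to floating-point error would also be flagged.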

aurelieGabriel commented 3 years ago

Hello, I apologize for the delay. Thank you for your answer; it does help. However, I noticed that non-integer values are also found in the following "source" datasets: liver2, lung1 and melanoma1. Were those samples handled differently during the integration step and the generation of the atlas? Thank you for your help. Best wishes, Aurélie
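
One rough way to narrow down what kind of values a source contains is to look at per-cell totals. This is a heuristic sketch on toy data, not a description of how the atlas was built. TPM values sum to one million per cell by definition, but only when all genes are retained, so a filtered matrix may fall short of that total. The function name `classify_cell` and the 1e4 check are assumptions added for illustration.

```r
# Toy expression matrix: rows are cells, columns are genes (all genes retained).
expr <- rbind(
  c(5, 0, 20, 3),                    # raw counts
  c(250000, 500000.5, 0, 249999.5),  # sums to 1e6
  c(2500.5, 0, 7499.5, 0)            # sums to 1e4
)
rownames(expr) <- c("cell_counts", "cell_tpm", "cell_cp10k")

# Label one cell by whether its values are whole numbers and by its total.
classify_cell <- function(v, tol = 1e-6) {
  total <- sum(v)
  if (all(v %% 1 == 0)) {
    "integer counts"
  } else if (abs(total - 1e6) / 1e6 < tol) {
    "sums to 1e6 (TPM/CPM-like)"
  } else if (abs(total - 1e4) / 1e4 < tol) {
    "sums to 1e4"
  } else {
    "non-integer, other scaling"
  }
}

cell_types <- apply(expr, 1, classify_cell)
print(table(cell_types))
```

Tabulating these labels per source dataset would show whether liver2, lung1 and melanoma1 look like the breast TPM data or like something else, though only the maintainers can confirm how each source was actually processed.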