dviraran / SingleR

SingleR: Single-cell RNA-seq cell types Recognition (legacy version)
GNU General Public License v3.0

combine different datasets with SingleR.Combine? #32

Closed hfberg closed 5 years ago

hfberg commented 5 years ago

I wanted to combine a control and a stimulated dataset to look at DE genes in the SingleRbrowseR. My plan was to give "ctrl" or "stim" as original identities for the two sets, create a SinglerSeurat object for each set, and combine them. I managed to view both datasets, ctrl.rds and stim.rds, in the browser, so that bit of code works fine. I get an error at SingleR.Combine:

> comb_ctrl.stim=SingleR.Combine(c(singler_ctrl, singler_ctrl))
Error in if (singler.list[[i]]$singler[[j]]$about$RefData != singler.list[[1]]$singler[[j]]$about$RefData) { : 
  argument is of length zero
> traceback()
1: SingleR.Combine(c(singler_ctrl, singler_ctrl))

The RefData is created automatically as Immgen for both ctrl and stim. In the example below I've also tried to define xy, with the same result. (Please note that I'm creating a Seurat object from an object that is already a Seurat object; I will remove this soon.) Is my idea doable at all? If so, what is the problem with RefData?

#creating original identity text documents for CTRL and STIM
write_ctrl=rep("ctrl", length(colnames(ctrl@data)))
write.table(write_ctrl, file="ctrl_orig.ident.txt", sep="\t", eol ="\n",row.names=colnames(ctrl@data))

write_stim=rep("stim", length(colnames(stim@data)))
write.table(write_stim, file="stim_orig.ident.txt", sep="\t", eol ="\n",row.names=colnames(stim@data))

#creating SinglerSeurat objects for CTRL and STIM
ctrl_raw <- as.matrix(x=ctrl@data)
singler_ctrl <- CreateSinglerSeuratObject(counts = ctrl_raw, annot="ctrl_orig.ident.txt", project.name = 'CP1 small', min.genes = 500, min.cells = 1, technology = "Microwell-seq", species = "Mouse", npca = 10, fine.tune = T)
singler_ctrl.new = convertSingleR2Browser(singler_ctrl)
saveRDS(singler_ctrl.new, 'ctrl.rds')

stim_raw <- as.matrix(x=stim@data)
singler_stim <- CreateSinglerSeuratObject(counts = stim_raw, annot= "stim_orig.ident.txt", project.name = 'CS1 small', min.genes = 500, min.cells = 1, technology = "Microwell-seq", species = "Mouse", npca = 10, fine.tune = T)
singler_stim.new = convertSingleR2Browser(singler_stim)
saveRDS(singler_stim.new, 'stim.rds')

#Combining CTRL and STIM
comb_ctrl.stim=SingleR.Combine(c(singler_ctrl, singler_stim), xy = c(singler_ctrl[["seurat"]]@dr[["tsne"]]@cell.embeddings,singler_stim[["seurat"]]@dr[["tsne"]]@cell.embeddings))
singler_ctrl.stim.new = convertSingleR2Browser(comb_ctrl.stim)
saveRDS(singler_ctrl.stim.new, 'ctrl_stim.rds')

dviraran commented 5 years ago

Can you try replacing the input to a list? I mean - list(singler_ctrl,singler_stim)
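
For context on why the container matters, here is a minimal base R sketch. The `ctrl` and `stim` lists below are hypothetical stand-ins that only mimic the nesting named in the error message (`$singler[[j]]$about$RefData`); they are not real SingleR objects. `c()` splices the elements of both lists into one flat list, so `[[i]]` no longer returns a whole sample, the `RefData` lookup yields `NULL`, and comparing against `NULL` gives a zero-length logical, which `if()` rejects with "argument is of length zero".

```r
# Hypothetical stand-ins mimicking only the nesting named in the error message.
make_obj <- function(name) {
  list(
    singler   = list(list(about = list(RefData = "Immgen"))),
    meta.data = list(project.name = name)
  )
}
ctrl <- make_obj("ctrl")
stim <- make_obj("stim")

flat   <- c(ctrl, stim)     # splices elements: singler, meta.data, singler, meta.data
nested <- list(ctrl, stim)  # keeps two whole objects

length(flat)    # 4
length(nested)  # 2

# In the flat version, [[1]] is ctrl$singler, which has no $singler element,
# so the whole chain of lookups collapses to NULL.
ref_flat   <- flat[[1]]$singler[[1]]$about$RefData
ref_nested <- nested[[1]]$singler[[1]]$about$RefData

is.null(ref_flat)             # TRUE
length(ref_flat != "Immgen")  # 0, the zero-length condition that if() rejects

# The same applies to xy: c() on matrices drops their dimensions.
xy_flat <- c(matrix(1:4, 2), matrix(5:8, 2))     # plain vector of length 8
xy_list <- list(matrix(1:4, 2), matrix(5:8, 2))  # two separate 2x2 matrices
```

This is only an illustration of base R behaviour; it does not show what `SingleR.Combine` does internally beyond the line quoted in the error.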

hfberg commented 5 years ago

list_samples<- c(singler_ctrl, singler_stim)
list_xy <- c(singler_ctrl[["seurat"]]@dr[["tsne"]]@cell.embeddings,singler_stim[["seurat"]]@dr[["tsne"]]@cell.embeddings)

comb_ctrl.stim=SingleR.Combine(list_samples, xy = list_xy)

Gives the same error:

Error in if (singler.list[[i]]$singler[[j]]$about$RefData != singler.list[[1]]$singler[[j]]$about$RefData) { : 
  argument is of length zero

But I don't want to create a list that way, since the meta.data seems to merge for some reason: both samples get orig.ident 'ctrl', xy has the same values for both singler objects, and so on (see pic for reference).

screenshot from 2019-02-18 15-07-41

one sample should be called CP1_dge_small and the other CS1_dge_small.

hfberg commented 5 years ago

OK, wait, now I got something else. Gonna have a look at it:

> list_samples<- list(singler_ctrl, singler_stim)
> list_xy <- list(singler_ctrl[["seurat"]]@dr[["tsne"]]@cell.embeddings,singler_stim[["seurat"]]@dr[["tsne"]]@cell.embeddings)
> 
> comb_ctrl.stim=SingleR.Combine(list_samples, xy = list_xy)
> singler_ctrl.stim.new = convertSingleR2Browser(comb_ctrl.stim)
Error in `row.names<-.data.frame`(`*tmp*`, value = value) : 
  duplicate 'row.names' are not allowed
In addition: Warning message:
non-unique values when setting 'row.names': ‘AACCTAAACCTATTAACT’, ‘AACCTAGTGGTACCGACG’, ‘AACCTATAGCATTTTAGG’, ‘ACACCCGAATTATACTTC’, ‘ACACCCGTAATGGATCTT’,  [... truncated] 
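
The warning suggests that the same cell barcodes occur in both samples, so the combined table cannot use them as row names. Below is a minimal sketch, with made-up barcodes rather than the data from this thread, of checking for that overlap before combining. With the objects above, the inputs would presumably be `colnames(ctrl_raw)` and `colnames(stim_raw)`; that mapping is an assumption.

```r
# Made-up barcodes standing in for colnames(ctrl_raw) and colnames(stim_raw).
ctrl_cells <- c("AAAC", "AAAG", "AAAT")
stim_cells <- c("AAAG", "AAAT", "CCCA")

# Barcodes present in both samples.
shared <- intersect(ctrl_cells, stim_cells)
shared
length(shared)

# A data frame rejects duplicated row names, which is the kind of error shown above.
combined <- c(ctrl_cells, stim_cells)
anyDuplicated(combined) > 0
res <- try(
  data.frame(x = seq_along(combined), row.names = combined),
  silent = TRUE
)
inherits(res, "try-error")
```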

hfberg commented 5 years ago

yea, the "list" seems to be the right thing to use in terms of structure. This keeps all information I need at least.

screenshot from 2019-02-18 15-32-56

Edit: I see now that a lot of the other values are still the same, xy for example. Got a little ahead of myself :(

hfberg commented 5 years ago

I could add a _ctrl or a _stim after each rowname, that would solve it :)
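
A minimal sketch of that idea in base R, using small made-up matrices in place of `ctrl_raw` and `stim_raw`. The suffix would have to be applied before the SingleR objects are created, and the row names written to the annotation files would need the same suffix so they still match. Whether anything else inside the SingleR objects needs adjusting is not confirmed in this thread.

```r
# Made-up count matrices standing in for ctrl_raw and stim_raw (genes x cells).
genes  <- c("g1", "g2")
ctrl_m <- matrix(1:6,  nrow = 2, dimnames = list(genes, c("AAAC", "AAAG", "AAAT")))
stim_m <- matrix(7:12, nrow = 2, dimnames = list(genes, c("AAAG", "AAAT", "CCCA")))

# Append a per-sample suffix so barcodes are unique across samples.
colnames(ctrl_m) <- paste0(colnames(ctrl_m), "_ctrl")
colnames(stim_m) <- paste0(colnames(stim_m), "_stim")

all_cells <- c(colnames(ctrl_m), colnames(stim_m))
anyDuplicated(all_cells)  # 0 means no duplicates remain
```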

hfberg commented 5 years ago

OK, don't spend time on this!! It was a coding issue on my end.