IntegrateData error, subscript out of bound

jhl0214 commented 5 years ago

Hi, I am trying to integrate three datasets of 26x18000, 26x24000, and 26x38000. I created a Seurat object from each dataset and ran "data.anchors <- FindIntegrationAnchors". After that, I tried to run this code: data.combined <- IntegrateData(anchorset = data.anchors). However, it keeps giving me this error:

Merging dataset 3 into 1
Extracting anchors for merged samples
Finding integration vectors
Warning in irlba(A = t(x = object), nv = npcs, ...) :
  You're computing too large a percentage of total singular values, use a standard svd instead.
Warning in irlba(A = t(x = object), nv = npcs, ...) :
  did not converge--results might be invalid!; try increasing work or maxit
Finding integration vector weights
Error in Embeddings(reduction)[nn.cells2, dims] : subscript out of bounds

Can anyone suggest how to fix this error? Thank you.

timoast commented 5 years ago

If you have only 26 features measured in each dataset you will need to adjust the default parameters. You should change the number of dimensions used in each step, the number of features used when filtering anchors (or turn the filtering off entirely by setting k.filter=NA), and change k.weight. All these parameters need to be less than the number of features measured.
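A minimal sketch of that advice, assuming a 26-feature panel like the one in the question. The helper names cap_integration_params and integrate_small_panel are hypothetical and not part of Seurat; the defaults of 30 dimensions and k.weight = 100 are my understanding of the Seurat v3 defaults, so check them against your installed version.

```r
# Hypothetical helper (not part of Seurat): cap integration parameters so that
# each one stays below the number of measured features, as advised above.
cap_integration_params <- function(n_features, max_dims = 30, k.weight = 100) {
  stopifnot(n_features >= 2)
  cap <- n_features - 1                     # strictly less than the feature count
  list(
    dims     = seq_len(min(max_dims, cap)), # 1:25 for a 26-feature panel
    k.filter = NA,                          # NA turns anchor filtering off
    k.weight = min(k.weight, cap)
  )
}

# Sketch of how the capped values might be passed on. It needs the Seurat
# package and a list of Seurat objects, so it is only defined here, not run.
integrate_small_panel <- function(object.list, n_features) {
  p <- cap_integration_params(n_features)
  anchors <- Seurat::FindIntegrationAnchors(
    object.list     = object.list,
    anchor.features = n_features,           # use every measured feature
    dims            = p$dims,
    k.filter        = p$k.filter
  )
  Seurat::IntegrateData(anchorset = anchors, dims = p$dims, k.weight = p$k.weight)
}

# Inspect the capped values for a 26-feature panel.
str(cap_integration_params(26))
```

Capping everything at one less than the feature count is only one reading of "less than the number of features measured"; smaller values may work better for a given dataset.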

JingqunMa commented 5 years ago

I have the same error when following the scATAC-seq integration vignette (https://satijalab.org/signac/articles/integration.html#integration-with-harmony). I've tried different parameters in FindIntegrationAnchors (setting k.filter=NA) and changed the default parameters in IntegrateData (k.weight, dims, weight.reduction). The same error persists:

Merging dataset 2 into 1
Extracting anchors for merged samples
Finding integration vectors
Finding integration vector weights
Error in Embeddings(reduction)[nn.cells2, dims] : subscript out of bounds

Any insights for this error or troubleshooting advice, please?

amjass12 commented 4 years ago

What does k.filter=NA achieve versus leaving it at the default? Although there is no guarantee how the data will perform, is it a viable option to use this method of integration? In other words, where there are datasets with enough cells, should these be integrated as they would be if the other dataset without min=200 cells was present?

deborah-chasman commented 4 years ago

I got the same error message under different circumstances, and I was able to work through it. I'm leaving a comment here in case it's useful for someone in the future.

I was trying to integrate feature barcoding datasets from 10X Genomics. I have 11 features measured in thousands of cells across multiple samples. So in my case, it's not an issue of having too few cells, but having too few features.

I'm using the standard workflow, so these are the steps:

Step 1. FindIntegrationAnchors
Step 2. IntegrateData

Initially I ran FindIntegrationAnchors like so, and it ran successfully, but I got the same error as Jingqun Ma when I ran IntegrateData.

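# First attempt: dims and max.features are set, anchor.features is not. This call succeeds.
anchors <- FindIntegrationAnchors(object.list=split.datas, assay=rep("FB",length(split.datas)), dims=1:11, max.features=11)

# This call then fails with "subscript out of bounds".
anchor.data <- IntegrateData(anchorset = anchors, new.assay.name = "integrated_FB", dims=my.dims)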

It seems that I need to set the anchor.features parameter in FindIntegrationAnchors, rather than dims or max.features, in order to get IntegrateData to work. It works whether I set it to the names of my features (here, fb.names) or to an integer up to the number of features (e.g. 11).

fb.names <- rownames(split.datas[[1]][["FB"]])
anchors <- FindIntegrationAnchors(object.list=split.datas, assay=rep("FB",length(split.datas)), anchor.features=fb.names, dims=1:10)

anchor.data <- IntegrateData(anchorset = anchors, new.assay.name = "integrated_FB", dims=1:10)
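For anyone who wants to catch this before running the integration, here is a rough pre-flight check in base R. It is only a sketch: check_panel_settings is a hypothetical helper, not a Seurat function, the feature names are placeholders standing in for rownames(split.datas[[1]][["FB"]]), and the rule that dims must stay below the feature count comes from the advice earlier in this thread rather than from anything I verified in the Seurat source.

```r
# Hypothetical pre-flight check (not a Seurat function): list every way the
# requested settings exceed what a small feature panel can support.
check_panel_settings <- function(feature_names, anchor.features, dims) {
  n <- length(feature_names)
  problems <- character(0)
  if (is.character(anchor.features)) {
    # Feature names were supplied: each one must exist in the assay.
    unknown <- setdiff(anchor.features, feature_names)
    if (length(unknown) > 0) {
      problems <- c(problems,
                    paste("unknown anchor features:", paste(unknown, collapse = ", ")))
    }
  } else if (anchor.features > n) {
    # A count was supplied: it cannot exceed the number of measured features.
    problems <- c(problems,
                  paste0("anchor.features (", anchor.features,
                         ") exceeds the ", n, " measured features"))
  }
  if (max(dims) >= n) {
    problems <- c(problems,
                  paste0("max(dims) = ", max(dims),
                         " is not below the ", n, " measured features"))
  }
  problems
}

# Placeholder names for an 11-feature panel.
fb.names <- paste0("FB", 1:11)

# Settings like the ones that worked for me: no problems reported.
check_panel_settings(fb.names, anchor.features = fb.names, dims = 1:10)

# A large anchor.features count with dims = 1:11: both checks fire.
check_panel_settings(fb.names, anchor.features = 2000, dims = 1:11)
```

An empty result means the settings pass these two checks, not that the integration is guaranteed to run.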

satijalab / seurat

IntegrateData error, subscript out of bound #1889