Closed by FrickTobias 1 year ago
Now I also checked that it does not have to do with the order of the cells in pbmc3k.
# Modify data to add a subset group
library(randomizr)
set.seed(1)
subset_group <- simple_ra(N = num_cells, prob = 0.05)
pbmc3k <- AddMetaData(pbmc3k, subset_group, "subset_group")
I ran the pbmc3k code above for different numbers of cells n (naively taken as the first n cells to define group 0 or 1).
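One way to read "the first n cells define group 0 or 1" is sketched below in base R. This is an interpretation, not the code actually used in the thread: `num_cells` is assumed to be 2700 (consistent with the 100 / 2600 split mentioned in the issue description), and `n_group_0` is just one value from the sweep.

```r
# Assumed total number of cells (100 + 2600 from the issue description).
num_cells <- 2700

# One value from the sweep; the first n_group_0 cells become group 0.
n_group_0 <- 300

# Label cells by position: first n_group_0 cells are 0, the rest are 1.
subset_group <- ifelse(seq_len(num_cells) <= n_group_0, 0, 1)

# Show the resulting group sizes.
print(table(subset_group))
```

A vector like this could then be attached with AddMetaData() in the same way as the random assignment above.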
for (n_group_0 in seq(290, 310, 1)) {...}
> fails
[1] 290 291 292 293 295 296 297 298 299 303 304
> successes
[1] 294 300 301 302 305 306 307 308 309 310
Switch direction of integration
for (n_group_1 in seq(290, 310, 1)) {...}
> fails
NULL
> successes
[1] 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310
for (n_group_1 in seq(31, 332, 10)) {...}
> fails
[1] 31 41 51 71 81 91 101 111 121 131 141 151 171 181 191 201 211 221
> successes
[1] 61 161 231 241 251 261 271 281 291 301 311 321 331
res <- try(pbmc3k.combined <- IntegrateData(anchorset = pbmc.anchors))
if (inherits(res, "try-error")) {
  fails <- c(fails, n_group_0) # or group 1
} else {
  successes <- c(successes, n_group_0) # or group 1
}
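For anyone wanting to repeat this kind of sweep, the bookkeeping around try() can be sketched as a complete loop. The function `run_integration()` below is a hypothetical stand-in for the FindIntegrationAnchors() and IntegrateData() calls; it fails for odd n only so that the loop has both outcomes to record, and says nothing about which group sizes fail in Seurat.

```r
# Hypothetical stand-in for the real integration calls.
# It raises an error for odd n purely for illustration.
run_integration <- function(n) {
  if (n %% 2 == 1) {
    stop("number of items to replace is not a multiple of replacement length")
  }
  invisible(n)
}

fails <- c()
successes <- c()

for (n in seq(290, 294, 1)) {
  # silent = TRUE keeps the error message out of the console during the sweep.
  res <- try(run_integration(n), silent = TRUE)
  if (inherits(res, "try-error")) {
    fails <- c(fails, n)
  } else {
    successes <- c(successes, n)
  }
}

print(fails)
print(successes)
```

Starting from empty vectors means that a sweep with no failures leaves `fails` as NULL, which matches the NULL printed in the output above.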
Same here. When performing a second round of integrative clustering on targeted cell types, where each sample has ~50 cells, I received the same error when running FindIntegrationAnchors: number of items to replace is not a multiple of replacement length
I am also experiencing the same issue. I am integrating datasets with 50-100 cells, and I get the same error when running IntegrateData().
@KoichiHashikawa @alice-y-wang
Note that this issue is about when both of these are true:
l2.norm = FALSE
If you are not using the l2.norm = FALSE option you are experiencing another issue (you will note that my original post specifies that when that option is set to TRUE (default) it works without any issues).
I have the same issue using the IntegrateData() function and it returned the error below. In my case, I used the default setting of l2.norm = TRUE when running the FindIntegrationAnchors() function.
Error in idx[i, ] <- res[[i]][[1]] : number of items to replace is not a multiple of replacement length
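As a general R note, this message is what base R raises when a matrix row is assigned a vector whose length does not fit the row. The snippet below is only an illustration of that mechanism with made-up sizes; `idx` and `res` are named after the traceback line but are hypothetical here, and this is not Seurat's internal code.

```r
# Illustrative index matrix with 5 slots per row.
idx <- matrix(0L, nrow = 2, ncol = 5)

# Hypothetical per-row result that holds only 3 values.
res <- list(list(c(4L, 7L, 9L)))

# Capture the error message instead of stopping the script.
msg <- tryCatch(
  {
    idx[1, ] <- res[[1]][[1]]  # 3 values into a row of 5 slots
    "no error"
  },
  error = function(e) conditionMessage(e)
)

print(msg)
```

A row that receives fewer values than expected is consistent with the explanation further down the thread that too few anchors are found, although the thread itself does not confirm the exact internal cause.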
@StanleyYang01 If you are running this with l2.norm = TRUE it is not the same issue. Please see the title of the issue.
I have the same issue using the IntegrateData() function and it returned the error below. In my case, I used the default setting of l2.norm = TRUE when running the FindIntegrationAnchors() function.
Error in idx[i, ] <- res[[i]][[1]] : number of items to replace is not a multiple of replacement length
Were you able to resolve this issue?
@sentisci Please open a new issue or contact the person somewhere outside this thread if you wish to discuss something other than the described issue.
The issue here is not that l2.norm is somehow causing an error, the issue is that integration is failing because a small number of anchors are being identified.
Seurat integration (as with all integration techniques) leverages shared information across similar cells (or anchors) to increase robustness. When there are few anchors, the integration process can struggle to proceed. L2-normalization helps correct scale differences across datasets, and setting this to F will decrease the number of anchors returned.
You can attempt to address this by increasing k.anchor (to find more anchors) or decreasing k.weight (which allows integration to proceed with fewer anchors). Both of these parameter settings can increase the risk of over-integration.
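A minimal sketch of that suggestion, assuming Seurat is installed and `object.list` is a list of already preprocessed Seurat objects. The wrapper name `integrate_with_few_anchors` and the values 10 and 50 are illustrative assumptions, not recommended settings; Seurat's documented defaults are k.anchor = 5 and k.weight = 100.

```r
# Hypothetical helper: the name and default values are illustrative only.
# object.list is assumed to be a list of preprocessed Seurat objects.
integrate_with_few_anchors <- function(object.list, k.anchor = 10, k.weight = 50) {
  # A larger k.anchor asks for more anchors than the default of 5.
  anchors <- Seurat::FindIntegrationAnchors(
    object.list = object.list,
    l2.norm = FALSE,
    k.anchor = k.anchor
  )
  # A smaller k.weight lets integration proceed with fewer anchors
  # than the default of 100.
  Seurat::IntegrateData(anchorset = anchors, k.weight = k.weight)
}
```

It would be called as `integrate_with_few_anchors(object.list)`. As noted above, both changes can increase the risk of over-integration, so the result should be checked against known biology.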
@rsatija Thank you for taking the time to answer! I'll try that!
Description
I've successfully integrated my two datasets using L2-normalization but I get an error when it is set to false. FindIntegrationAnchors() finishes without any (critical) error but somehow the output yields an error when input into IntegrateData().
Some things I have noted
With pbmc3k it works for a random split of the data but I can reproduce the error if I make an uneven split (100 / 2600).
I tried changing k.filter to fix my issues.
See below sessionInfo() for details of my testing & output.
My original issue
l2.norm = TRUE (works)
l2.norm = FALSE (doesn't work)
Error reproduced in pbmc3k
Traceback
OS and sessionInfo()
System Software Overview:
System Version: macOS 11.6.2 (20G314)
Kernel Version: Darwin 20.6.0
Tests and outputs
Uneven split works for l2.norm=TRUE
k.filter = 99 doesn't help
Even split is no problem
This is the code I used to randomly split the data.