Error: grouping factors must have > 1 sampled level

yecotoo commented 1 month ago

out <- scDist(normalized_counts = sim$Y,
              meta.data = sim$meta.data,
              d = 13,
              fixed.effects = "group",
              random.effects = c("celltype", "orig.ident"),
              clusters="seurat_clusters",
              min.counts.per.cell = 1,
)

Error: grouping factors must have > 1 sampled level
In addition: Warning message:
1: Model failed to converge with 1 negative eigenvalue: -1.5e+01
2: In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
  unable to evaluate scaled gradient
3: In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
  Model failed to converge: degenerate Hessian with 1 negative eigenvalues
4: Model failed to converge with 1 negative eigenvalue: -1.6e+00
5: In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
  Model failed to converge with max|grad| = 0.00222403 (tol = 0.002, component 1)
6: Model failed to converge with 1 negative eigenvalue: -9.1e-02

I have checked my group data:

table(sim$meta.data$group)

SCI_1dpi SCI_3dpi
    2642     5193

I organized my data following the example data and checked it, but found nothing wrong. What could be the cause? Thanks!

phillipnicol commented 1 month ago

My first guess would be that there is a cluster in seurat_clusters in which all of the cells belong to one group. You can check this with table(sim$meta.data$group, sim$meta.data$seurat_clusters).

In this case, you can just remove that cluster... But I can also edit the code to just return 0 for that cell type if this happens.
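
A minimal sketch of that check and of dropping the affected clusters. It uses a small made-up data frame in place of sim$meta.data; the column names group and seurat_clusters follow the thread, while the toy values and the names tab, bad_clusters, keep, and meta_kept are assumptions for illustration only.

```r
# Toy stand-in for sim$meta.data; the values are invented for illustration.
meta <- data.frame(
  group = c("SCI_1dpi", "SCI_3dpi", "SCI_1dpi", "SCI_3dpi", "SCI_1dpi", "SCI_1dpi"),
  seurat_clusters = factor(c(0, 0, 1, 1, 6, 6))
)

# Cross-tabulate group by cluster; a zero marks a group missing from a cluster.
tab <- table(meta$group, meta$seurat_clusters)

# Clusters in which at least one group has no cells.
bad_clusters <- colnames(tab)[colSums(tab == 0) > 0]
print(bad_clusters)

# Drop those clusters and discard the now-unused factor levels.
keep <- !(meta$seurat_clusters %in% bad_clusters)
meta_kept <- droplevels(meta[keep, , drop = FALSE])
```

On real data, the same keep index would presumably also have to be applied along the cell dimension of sim$Y so that the counts and the metadata stay aligned.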

yecotoo commented 1 month ago

Thanks for the reply! I checked my data and found that in cluster 6 all of the cells belong to one group.

table(sim$meta.data$group, sim$meta.data$seurat_clusters)

              0    1    2    3    4    5    6    7    8    9
  SCI_3dpi 1663 2152  260  593  381   90    0   19   26    9
  SCI_1dpi   15    6 1441  696  100   95  163  106    4   16

I would appreciate it if you could edit the code. Thanks!

yecotoo commented 1 month ago

Hello! I hit this error again after removing cluster 6, and I don't know why. My call and its output are below.

out <- scDist(normalized_counts = sim$Y,
              meta.data = sim$meta.data,
              d = 13,
              fixed.effects = "group",
              random.effects = c("celltype", "orig.ident"),
              clusters="seurat_clusters",
              min.counts.per.cell = 1,
)

Error: grouping factors must have > 1 sampled level
In addition: Warning message:
1: Model failed to converge with 1 negative eigenvalue: -6.7e+01
2: In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
  unable to evaluate scaled gradient
3: In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
  Model failed to converge: degenerate Hessian with 1 negative eigenvalues
4: Model failed to converge with 1 negative eigenvalue: -6.1e+01
5: Model failed to converge with 1 negative eigenvalue: -2.3e+01
6: In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
  Model failed to converge with max|grad| = 0.00213607 (tol = 0.002, component 1)
7: Model failed to converge with 1 negative eigenvalue: -1.1e+00

yecotoo commented 1 month ago

Interesting: when random.effects = c("celltype", "orig.ident") was commented out, it worked and showed me the warning below. I wonder if the code handling random.effects needs to be fixed in the same way as for fixed.effects?

out <- scDist(normalized_counts = sim$Y,
              meta.data = sim$meta.data,
              d = 13,
              fixed.effects = "group",
              #random.effects = c("celltype", "orig.ident"),
              clusters="seurat_clusters",
              min.counts.per.cell = 1,
)

(function (A, nv = 5, nu = nv, maxit = 1000, work = nv + 7, reorth = TRUE,
Warning: You're computing too large a percentage of total singular values, use a standard svd instead.

yecotoo commented 1 month ago

Sorry to bother you. When I tried this on another dataset, something new happened. The normalization method I used is not SCT; could that be the reason for this error?

> out <- scDist(normalized_counts = sim$Y,
+               meta.data = sim$meta.data,
+               d = 15,
+               fixed.effects = "group",
+               #random.effects = c("celltype", "orig.ident"),
+               clusters="seurat_clusters",
+               min.counts.per.cell = 1,
+ )
(function (A, nv = 5, nu = nv, maxit = 1000, work = nv + 7, reorth = TRUE,
Warning Here: You're computing too large a percentage of total singular values, use a standard svd instead.
Error in (function (A, nv = 5, nu = nv, maxit = 1000, work = nv + 7, reorth = TRUE, :
  max(nu, nv) must be strictly less than min(nrow(A), ncol(A)).

phillipnicol commented 1 month ago

First, the errors and warnings in function (A, nv = 5, nu = nv, maxit = 1000, work = nv + 7, reorth = TRUE) are related to the PCA step. Basically, if there is a cluster with fewer than d = 15 cells, then it is not possible to get 15 PCs. I will go into the code and make sure min.counts.per.cell is greater than d. It is fine to use the method without scTransform; we just wanted to make a recommendation.
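
As a rough illustration of that constraint, the sketch below flags clusters that hold too few cells for the requested number of PCs. The cluster labels are invented, and the variable names clusters, sizes, and too_small are hypothetical; on real data the labels would come from sim$meta.data$seurat_clusters. The comparison uses <= because the error message above requires the number of components to be strictly less than the matrix dimensions.

```r
# Toy cluster labels standing in for sim$meta.data$seurat_clusters.
clusters <- factor(c(rep("0", 40), rep("1", 25), rep("9", 9)))
d <- 15  # number of PCs requested, as in the call above

# Number of cells in each cluster.
sizes <- table(clusters)

# Clusters whose cell count does not exceed d.
too_small <- names(sizes)[as.vector(sizes) <= d]
print(too_small)
```

Clusters listed in too_small could be dropped before calling scDist, or d could be lowered below the smallest cluster size.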

Regarding the previous error, cluster 6 would definitely cause the grouping factor to have only one sampled level. But I'm not sure why it would persist after cluster 6 was removed, as you said. One thing to try would be moving orig.ident to a fixed effect by writing fixed.effects = c("group", "orig.ident"). The reason is that the mixed model might not have much power when you only have 2 samples in one group; I also wrote about that in Issue #4. Also, can I ask what the celltype variable is? If it is really the cell type, then that should be what goes in the clusters argument.
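
One way to look for a remaining single-level grouping factor is to count, within each cluster, how many distinct levels each random-effect column has. This is only a sketch under assumptions: the data frame is a made-up stand-in for sim$meta.data, and random_effects, n_levels, and single are hypothetical names. It is a guess at a useful diagnostic, not a confirmed explanation of the error in this thread.

```r
# Toy stand-in for sim$meta.data; the values are invented for illustration.
meta <- data.frame(
  seurat_clusters = factor(c(0, 0, 0, 0, 1, 1, 1, 1)),
  celltype = c("Microglia", "Microglia", "Microglia", "Microglia",
               "Astrocyte", "Neuron", "Astrocyte", "Neuron"),
  orig.ident = c("s1", "s2", "s3", "s4", "s1", "s2", "s3", "s4")
)
random_effects <- c("celltype", "orig.ident")

# For each random effect, count how many of its levels occur in each cluster.
n_levels <- do.call(cbind, lapply(random_effects, function(v) {
  rowSums(table(meta$seurat_clusters, meta[[v]]) > 0)
}))
colnames(n_levels) <- random_effects
print(n_levels)

# Clusters in which some random effect has only one observed level.
single <- rownames(n_levels)[rowSums(n_levels < 2) > 0]
print(single)
```

In this toy example every cell of cluster 0 has the same celltype, so celltype has a single level there. That is the situation to expect whenever celltype and seurat_clusters encode nearly the same partition of the cells.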

phillipnicol commented 1 month ago

I have edited the code so that it should automatically detect if any of the random or fixed effects have only one level for a particular cell type. You can install this development version of the package with devtools::install_github("phillipnicol/scDist", ref="sampling_level"). This also checks that the minimum number of cells is larger than the chosen number of PCs, so it should also address the other errors that you encountered.

Let me know if it works for you and then I can integrate it into the main package.

phillipnicol commented 2 weeks ago

The code fixing this is included in the updated package.

phillipnicol / scDist

Error: grouping factors must have > 1 sampled level #12
