Model is nearly unidentifiable: very large eigenvalue

algebio commented 4 years ago

Hi Lukas

I'm running the CyTOF workflow pipeline and get a very long message (attached below) after running:

ds_res2 <- diffcyt(sce, formula = ds_formula2, contrast = contrast, analysis_type = "DS", method_DS = "diffcyt-DS-LMM", clustering_to_use = "merging1b", verbose = FALSE)

I have read in other forums that I could solve this by changing some parameters. My two questions are:

How can I change any of the recommended parameters in diffcyt(), if needed? And do I need to change them at all, or does this message mean that my data contain no significant results, so I shouldn't change anything?

My design, contrast and formula are:

md
    file_name sample_id condition patient_id batch intervention week_of_life    trial_arm
1 P_31B_1.fcs     P_31B   Neonate       P_31   B_1         post       week_6 Intervention
2 P_32A_1.fcs     P_32A   Neonate       P_32   B_1          pre       week_3      Control
3 P_32B_1.fcs     P_32B   Neonate       P_32   B_1         post       week_7      Control
4 P_34A_1.fcs     P_34A   Neonate       P_34   B_1          pre       week_3      Control
5 P_35A_1.fcs     P_35A   Neonate       P_35   B_1          pre       week_3 Intervention
6 P_36A_2.fcs     P_36A   Neonate       P_36   B_2          pre       week_2      Control

ei <- metadata(sce)$experiment_info
(da_formula1 <- createFormula(ei, cols_fixed = "trial_arm", cols_random = "sample_id"))
(da_formula2 <- createFormula(ei, cols_fixed = "trial_arm", cols_random = c("sample_id", "patient_id")))

contrast <- createContrast(c(0, 1))

da_res1 <- diffcyt(sce, formula = da_formula1, contrast = contrast, analysis_type = "DA", method_DA = "diffcyt-DA-GLMM", clustering_to_use = "merging1b", verbose = FALSE)
da_res2 <- diffcyt(sce, formula = da_formula2, contrast = contrast, analysis_type = "DA", method_DA = "diffcyt-DA-GLMM", clustering_to_use = "merging1b", verbose = FALSE)

table(rowData(da_res1$res)$p_adj < FDR_cutoff)
FALSE  TRUE
   15     2

table(rowData(da_res2$res)$p_adj < FDR_cutoff)
FALSE  TRUE
   16     1

ds_formula1 <- createFormula(ei, cols_fixed = "trial_arm")
ds_formula2 <- createFormula(ei, cols_fixed = "trial_arm", cols_random = "patient_id")

ds_res1 <- diffcyt(sce, formula = ds_formula1, contrast = contrast, analysis_type = "DS", method_DS = "diffcyt-DS-LMM", clustering_to_use = "merging1b", verbose = FALSE)
table(rowData(ds_res1$res)$p_adj < FDR_cutoff)
ds_res2 <- diffcyt(sce, formula = ds_formula2, contrast = contrast, analysis_type = "DS", method_DS = "diffcyt-DS-LMM", clustering_to_use = "merging1b", verbose = FALSE)
table(rowData(ds_res2$res)$p_adj < FDR_cutoff)

Thank you for your help.

Regards Juan

Model failed to converge with max|grad| = 0.015349 (tol = 0.002, component 1)
Model is nearly unidentifiable: very large eigenvalue Rescale variables?
Model is nearly unidentifiable: very large eigenvalue Rescale variables?
Model failed to converge with max|grad| = 0.00233629 (tol = 0.002, component 1)
boundary (singular) fit: see ?isSingular
Model is nearly unidentifiable: very large eigenvalue Rescale variables?
Model failed to converge with max|grad| = 0.00555007 (tol = 0.002, component 1)
Model failed to converge with max|grad| = 0.00742892 (tol = 0.002, component 1)
Model failed to converge with max|grad| = 0.00240135 (tol = 0.002, component 1)
Model is nearly unidentifiable: very large eigenvalue Rescale variables?
Model failed to converge with max|grad| = 0.0223681 (tol = 0.002, component 1)
Model failed to converge with max|grad| = 0.0051913 (tol = 0.002, component 1)
Model failed to converge with max|grad| = 0.00238886 (tol = 0.002, component 1)
boundary (singular) fit: see ?isSingular
Model failed to converge with max|grad| = 0.0199976 (tol = 0.002, component 1)
Model failed to converge with max|grad| = 0.0389122 (tol = 0.002, component 1)
Model failed to converge with max|grad| = 0.0047067 (tol = 0.002, component 1)
Model failed to converge with max|grad| = 0.00297547 (tol = 0.002, component 1)
Model is nearly unidentifiable: very large eigenvalue Rescale variables?
Model is nearly unidentifiable: very large eigenvalue Rescale variables?
Model failed to converge with max|grad| = 0.0152538 (tol = 0.002, component 1)
Model failed to converge with max|grad| = 0.00844551 (tol = 0.002, component 1)
Model failed to converge with max|grad| = 0.0313113 (tol = 0.002, component 1)
Model failed to converge with max|grad| = 0.022935 (tol = 0.002, component 1)
Model is nearly unidentifiable: very large eigenvalue Rescale variables?
Model failed to converge with max|grad| = 0.0240415 (tol = 0.002, component 1)
Model is nearly unidentifiable: very large eigenvalue Rescale variables?
Model failed to converge with max|grad| = 0.00217955 (tol = 0.002, component 1)
Model is nearly unidentifiable: very large eigenvalue Rescale variables?
Model failed to converge with max|grad| = 0.0241478 (tol = 0.002, component 1)
Model failed to converge with max|grad| = 0.00663443 (tol = 0.002, component 1)
Model failed to converge with max|grad| = 0.00649518 (tol = 0.002, component 1)
boundary (singular) fit: see ?isSingular
Model failed to converge with max|grad| = 0.055191 (tol = 0.002, component 1)
Model is nearly unidentifiable: very large eigenvalue Rescale variables?
Model failed to converge with max|grad| = 0.104247 (tol = 0.002, component 1)
Model is nearly unidentifiable: very large eigenvalue Rescale variables?
Model failed to converge with max|grad| = 0.0157847 (tol = 0.002, component 1)
Model is nearly unidentifiable: very large eigenvalue Rescale variables?
Model is nearly unidentifiable: very large eigenvalue Rescale variables?
boundary (singular) fit: see ?isSingular
Model failed to converge with max|grad| = 0.0102596 (tol = 0.002, component 1)
Model is nearly unidentifiable: very large eigenvalue Rescale variables?
boundary (singular) fit: see ?isSingular
[.....I have cut most of it....]

Rescale variables?
Model failed to converge with max|grad| = 0.00365807 (tol = 0.002, component 1)
Model is nearly unidentifiable: very large eigenvalue Rescale variables?
Model failed to converge with max|grad| = 0.00612798 (tol = 0.002, component 1)
Model failed to converge with max|grad| = 0.00824278 (tol = 0.002, component 1)
Model failed to converge with max|grad| = 0.00280907 (tol = 0.002, component 1)
boundary (singular) fit: see ?isSingular
Model failed to converge with max|grad| = 0.00516716 (tol = 0.002, component 1)
Model failed to converge with max|grad| = 0.00521023 (tol = 0.002, component 1)
boundary (singular) fit: see ?isSingular
Model is nearly unidentifiable: very large eigenvalue Rescale variables?
Model failed to converge with max|grad| = 0.00921459 (tol = 0.002, component 1)
boundary (singular) fit: see ?isSingular
boundary (singular) fit: see ?isSingular
boundary (singular) fit: see ?isSingular
boundary (singular) fit: see ?isSingular
Model failed to converge with max|grad| = 0.00678294 (tol = 0.002, component 1)
Model is nearly unidentifiable: very large eigenvalue Rescale variables?
boundary (singular) fit: see ?isSingular
boundary (singular) fit: see ?isSingular
boundary (singular) fit: see ?isSingular
boundary (singular) fit: see ?isSingular
boundary (singular) fit: see ?isSingular
boundary (singular) fit: see ?isSingular

SamGG commented 4 years ago

Hi, I am not an expert in modelling, so my opinion is not authoritative. The ids in sample_id are unique, so there is no reason to put this column in the design. In patient_id, only P_32 is repeated, and only once, so it might be OK, and I think the first formula is OK as well. Alternatively, as only one patient is repeated, you could try to ignore it and leave random effect empty.

I think the real interest of your experiment is to test pre vs post operative while controlling for the trial arm. This is quite difficult currently, as there are too few samples in the post operative group. What I would test currently is whether the control and the intervention groups are similar, but again there is not enough samples now. As getting samples might be difficult and variability between patients is usually large, it would be very interesting to get pre and post samples from the same patients. In the end, looking at the samples you have in hand, I think the design is maybe too rich. Hope this helps.

[Attached screenshot: 2020-11-10 15_59_43-Window]
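The replication described in the comment above can be checked by tabulating the grouping columns. The sketch below uses base R only and rebuilds just the six metadata rows printed in the question, keeping the columns needed here. That is an assumption: the real experiment_info may contain more rows, in which case the counts would differ.

```r
# Six metadata rows as printed in the question (the full table may be longer).
md <- data.frame(
  sample_id    = c("P_31B", "P_32A", "P_32B", "P_34A", "P_35A", "P_36A"),
  patient_id   = c("P_31", "P_32", "P_32", "P_34", "P_35", "P_36"),
  intervention = c("post", "pre", "post", "pre", "pre", "pre"),
  trial_arm    = c("Intervention", "Control", "Control",
                   "Control", "Intervention", "Control"),
  stringsAsFactors = FALSE
)

# Number of samples per level of each candidate random effect.
n_per_sample  <- table(md$sample_id)
n_per_patient <- table(md$patient_id)

# Is every sample_id unique, and which patients contribute more than one sample?
all(n_per_sample == 1)
names(n_per_patient)[n_per_patient > 1]

# Group sizes for the fixed effect, overall and split by pre/post.
arm_counts <- table(md$trial_arm)
arm_counts
table(md$trial_arm, md$intervention)
```

If most patients contribute a single sample, as in these six rows, there is very little information to separate a patient-level variance from the residual variance, which is one plausible reason for the singular-fit and convergence messages.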

algebio commented 4 years ago

Hi Sam

Thanks again for your quick answer.

I don't understand this sentence: "The ids in sample_id are unique, so there is no reason to put this column in the design". I am confused because the sample data in Nowicka's paper also have unique sample_id names, and the design there is:

ei <- metadata(sce)$experiment_info
(da_formula1 <- createFormula(ei, cols_fixed = "condition", cols_random = "sample_id"))

Also, what do you mean by "leave random effect empty"? I need something to compare to cols_fixed.

"What I would test currently is whether the control and the intervention groups are similar, but again there is not enough samples now." This is actually what I'm trying to compare by using cols_fixed = "trial_arm".

Could the cause of this problem be the low number of samples? In that case, why do I get the "very large eigenvalue" message?

We don't have more samples. If this is the problem, we are in trouble.

Thanks for your feedback.

Regards Juan

SamGG commented 4 years ago

As I warned you, I am not an expert. I answered in case it could help.

You are right, sample_id is used in Nowicka's article although all its values are unique. I will have to check and correct my knowledge of mixed models.

It makes sense to include patient_id as a random effect, as we clearly want to remove any effect due to patient. Did you try patient_id without sample_id, as those two variables are mostly redundant?

Concerning "leave random effect empty", I mean not specifying any variable in the random effect if that makes sense.
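On the question of what is being compared: in a model of this kind the contrast selects a fixed-effect coefficient, so the Control vs Intervention comparison is defined by cols_fixed alone, and a random effect only changes how the variability is modelled. A minimal base R sketch, assuming trial_arm is a factor with Control as the reference level and using the six trial_arm values printed in the question:

```r
# trial_arm values for the six samples printed in the question.
trial_arm <- factor(
  c("Intervention", "Control", "Control", "Control", "Intervention", "Control"),
  levels = c("Control", "Intervention")  # assumption: Control is the reference level
)

# Fixed-effect design matrix for ~ trial_arm: an intercept plus one indicator column.
design <- model.matrix(~ trial_arm)
colnames(design)

# The contrast c(0, 1) has one entry per fixed-effect coefficient:
# 0 ignores the intercept, 1 tests the trial_arm coefficient.
contrast <- c(0, 1)
tested <- colnames(design)[contrast != 0]
tested
```

With no random effect specified, the same contrast still tests the trial_arm coefficient; only the variance structure of the model changes.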

Concerning the low number of samples, I think Lukas or Mark would answer better than me. IMHO, two samples for estimating the percentage of a population or the expression of a functional/signaling marker in the intervention group sounds optimistic.

Sorry if my answers get you into trouble. Best regards, Samuel

algebio commented 4 years ago

Hi Samuel

Your answers are always very welcome. You have helped me several times and I really appreciate your effort to help me today as well. I will do the same if I have the chance.

Regards Juan

SamGG commented 4 years ago

Thanks for your kind message. For sure, it is a pleasure to receive help from you or others. Best regards, Samuel

lmweber / diffcyt

Model is nearly unidentifiable: very large eigenvalue #25