Closed aken2 closed 5 years ago
Hi - I am running into similar issues (using mclust). I can get solutions for models 1 & 3. Models 4 & 5 say it can't be done with mclust (so fair enough). But for models 2 & 6 I am getting the same error message as above. Thoughts?? Thanks! Cheers, Emily
If you can email a fully reproducible syntax and your data (or simulated mock data giving the same error) to c.j.vanlissa@uu.nl I can try to debug this for you!
Wondering if this is a more general issue that others are experiencing, too. @cjvanlissa, any idea if the bootLRTS-related code might have broken in one of our updates?
It's not triggering any unit test failures, so probably not. A reproducible example would be helpful to identify any problems!
Thank you so much! Sending you some mock data generating the same error now. Any tips are much appreciated!
Debugged this, and it turns out the error originates in Mclust, which can be verified by running:
mclustBootstrapLRT(your_data, modelName = "VVV", nboot = 100, maxG = 4)
Only the 1-class model converges (my guess is that the rest are too complex). When mclustBootstrapLRT then tries to compare the 1-class model against the next one, there is nothing to compare it to, and you get this error. I'll see if I can wrap the error message, but that's about all I can do.
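For readers wondering what "wrapping" an error like this can look like, here is a minimal sketch. It is not tidyLPA's actual code: `safe_fit` and its `label` argument are hypothetical names made up for illustration, and the only assumption about mclust is that `mclustBootstrapLRT()` signals an ordinary R error when it fails.

```r
# Hypothetical helper (not part of tidyLPA or mclust): run a fitting
# function and turn any error into a warning plus an NA result.
safe_fit <- function(fit_fun, ..., label = "model") {
  tryCatch(
    fit_fun(...),
    error = function(e) {
      warning(label, " could not be estimated: ", conditionMessage(e),
              call. = FALSE)
      NA
    }
  )
}

# Usage with mclust, assuming `your_data` is a numeric data frame:
# blrt <- safe_fit(mclust::mclustBootstrapLRT, your_data,
#                  modelName = "VVV", nboot = 100, maxG = 4,
#                  label = "BLRT for model 6")
```

With a wrapper of this kind, the remaining models in a batch can still be attempted and the failure is reported per model instead of stopping the whole run.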
FYI: I wrapped the error, and now your data returns the following output with informative error messages:
Data_imputed %>%
+ estimate_profiles(n_profiles = 1:5, models = 6)
The 'variances'/'covariances' arguments were ignored in favor of the 'models' argument.
Warning messages:
1: Mclust could not estimate model 6 with 2 classes.
2: Mclust could not estimate model 6 with 3 classes.
3: Mclust could not estimate model 6 with 4 classes.
4: Mclust could not estimate model 6 with 5 classes.
5: One or more analyses resulted in warnings! Examine these analyses carefully: model_6_class_2, model_6_class_3, model_6_class_4, model_6_class_5
> tmp
tidyLPA analysis using mclust:
 Model Classes AIC     BIC     Entropy prob_min prob_max n_min n_max BLRT_p
 6     1       2266.47 2313.85 1.00    1.00     1.00     1.00  1.00
 6     2
 6     3
 6     4
 6     5
Thanks. Not that anything is actually fixed or explained, but if that's all you can do then I suppose tidyLPA is just a more limited tool than I initially thought. Ah well.... Cheers, Emily
Something seems confusing to me: mclust can't fit model types 4 and 5, and yet can fit the other four model types (1, 2, 3, and 6), but sometimes doesn't because of an error in the estimation. Should these two sources of a model not being able to be estimated be distinguished in the output?
And perhaps there is something that could be done, such as increasing the number of iterations? Cheers, Emily
> Something seems confusing to me: mclust can't fit model types 4 and 5, and yet can fit the other four model types (1, 2, 3, and 6), but sometimes doesn't because of an error in the estimation. Should these two sources of a model not being able to be estimated be distinguished in the output?
Josh, this IS referenced in the output. If you request model type 4/5 with Mclust, estimate_profiles gives an error (as it should) ;)
> Thanks. Not that anything is actually fixed or explained, but if that's all you can do then I suppose tidyLPA is just a more limited tool than I initially thought. Ah well.... Cheers, Emily
The fact that the model is too complex to estimate is a research finding that can be reported, not a bug to be ironed out. In your case, model 6 estimates 19 parameters PER CLASS, with 218 participants. So the two-class solution has fewer than 6 participants per parameter.
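The arithmetic behind that ratio can be checked quickly. This sketch uses only the two numbers quoted above (218 participants, 19 parameters per class). As a simplifying assumption it ignores any parameters that are not class-specific, such as mixing proportions, so the real ratios would be slightly lower.

```r
n_participants   <- 218  # sample size quoted above
params_per_class <- 19   # per-class parameter count quoted above
n_classes        <- 2:5  # the class counts that failed to estimate

# Participants available per class-specific parameter
ratio <- n_participants / (params_per_class * n_classes)
round(ratio, 2)
#> [1] 5.74 3.82 2.87 2.29
```

The ratio is already below 6 for two classes and keeps shrinking as classes are added, which fits the pattern of only the 1-class model converging.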
Thanks - I wasn't trying to estimate model 6 and that isn't my data you are referencing. That was the other person who submitted an issue - I was interested in model 2. Once I saw there was an issue I just tried each model to see what behavior ensued with my data to see if I had the same issue.
And BTW, I take it you changed the way get_data behaves again? My code using it is broken again and it appears to be due to different behavior for that function. So, I think you've convinced me to learn to use mclust myself. Thanks for the push to quit being so lazy :)
Cheers, Emily
All good @ebmtnprof. We did change that - we are about to push that release to CRAN. A number of folks asked for the data in wide format, hence the change. It is still possible to obtain the data in long form; I'll post how later today.
@ebmtnprof I think the easiest way (to me) would be to use the gather function from the tidyr package, e.g.:
library(tidyLPA)
#> tidyLPA is intended for academic use. We do not make any money on this and only ask that you please cite this in publications when you use the results. You can use the function citation('tidyLPA') to create a citation.
#> Mplus is not installed. Use only package = 'mclust' when calling estimate_profiles().
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
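# Fit a 3-profile solution on the first 100 rows of the example data
m <- pisaUSA15[1:100, ] %>%
  select(broad_interest, enjoyment, self_efficacy) %>%
  single_imputation() %>%
  estimate_profiles(3)
# Reshape the wide class-probability columns (CPROB*) into long form
get_data(m) %>%
  tidyr::gather(Class_prob, Probability, contains("CPROB"))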
#> # A tibble: 300 x 8
#>    model_number classes_number broad_interest enjoyment self_efficacy Class
#>           <dbl>          <dbl>          <dbl>     <dbl>         <dbl> <dbl>
#>  1            1              3            3.8         4             1     1
#>  2            1              3              3         3          2.75     3
#>  3            1              3            1.8       2.8          3.38     2
#>  4            1              3            1.4         1          2.75     2
#>  5            1              3            1.8       2.2             2     3
#>  6            1              3            1.6       1.6          1.88     3
#>  7            1              3              3       3.8          2.25     1
#>  8            1              3            2.6       2.2             2     3
#>  9            1              3              1       2.8          2.62     3
#> 10            1              3            2.2         2          1.75     3
#> # … with 290 more rows, and 2 more variables: Class_prob <chr>,
#> #   Probability <dbl>
Created on 2019-07-25 by the reprex package (v0.3.0)
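As a side note, the same reshaping can also be written with `tidyr::pivot_longer()`, which the tidyr documentation recommends over `gather()` for new code. The sketch below uses a small made-up data frame in place of the real `get_data(m)` output. The `id` column and the probability values are hypothetical, and the only thing assumed about the real output is that the class-probability columns contain "CPROB" in their names, as in the example above.

```r
library(tidyr)

# Hypothetical wide-format data standing in for the output of get_data()
wide <- data.frame(
  id     = 1:2,
  Class  = c(1, 3),
  CPROB1 = c(0.90, 0.10),
  CPROB2 = c(0.05, 0.20),
  CPROB3 = c(0.05, 0.70)
)

# One row per participant and class-probability column
long <- pivot_longer(
  wide,
  cols      = contains("CPROB"),
  names_to  = "Class_prob",
  values_to = "Probability"
)
```

Here `long` has one row for each combination of participant and CPROB column, with the column name in `Class_prob` and the value in `Probability`.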
Thanks for the response. You'll be glad to hear I won't be bugging you anymore :) Yesterday I took the plunge and got mclust working for what I need, so my package is no longer reliant on a stable version of tidyLPA. Cheers, Emily
I am looking to compare solutions for an LPA model with varying covariance (model 6 arg) and I am encountering an error. I am able to create models 1 and 3 but I am unsure of how to resolve this issue. Thanks for your help!