data-edu / tidyLPA

Easily carry out Latent Profile Analysis (LPA) using open-source or commercial software
https://data-edu.github.io/tidyLPA/

get_data should return data in wide format #117

Closed cjvanlissa closed 5 years ago

cjvanlissa commented 5 years ago

Or create a function that merges the LPA results with the original data file

jrosen48 commented 5 years ago

I agree this is a good idea

jrosen48 commented 5 years ago

related to #118

Is there a straightforward way to do something like this using base R (especially for the spread() step)?

library(tidyLPA)
#> tidyLPA has received a major update, with a much easier workflow and improved functionality. However, you might have to update old syntax to account for the new workflow. See vignette('introduction-to-major-changes') for details!
#> 
#> Mplus is not installed. Use only package = 'mclust' when calling estimate_profiles().
library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
library(tidyr)

m <- pisaUSA15[1:100, ] %>%
    select(broad_interest, enjoyment, self_efficacy) %>%
    single_imputation() %>%
    estimate_profiles(3)

get_data(m) %>% 
    mutate(Class_prob = paste0("Class_", Class_prob)) %>% 
    spread(Class_prob, Probability)
#> # A tibble: 100 x 10
#>    model_number classes_number broad_interest enjoyment self_efficacy Class
#>           <dbl>          <dbl>          <dbl>     <dbl>         <dbl> <dbl>
#>  1            1              3          0.853       2.4          1.57     3
#>  2            1              3          1           1            4        2
#>  3            1              3          1           1.2          3.5      2
#>  4            1              3          1           2            2        3
#>  5            1              3          1           2            3        3
#>  6            1              3          1           2.2          2.29     3
#>  7            1              3          1           2.6          1.86     3
#>  8            1              3          1           2.6          2.38     3
#>  9            1              3          1           2.8          2.62     3
#> 10            1              3          1           4            1.5      1
#> # … with 90 more rows, and 4 more variables: id <int>, Class_1 <dbl>,
#> #   Class_2 <dbl>, Class_3 <dbl>

Created on 2019-07-17 by the reprex package (v0.2.1)
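For the base R question above, one possible approach is stats::reshape(). The sketch below does not call tidyLPA. It uses a small made-up data frame (`long_probs`, with hypothetical values) that only mimics the id, Class, Class_prob and Probability columns shown in the reprex output, so treat it as an illustration rather than as what get_data() does internally.

```r
# Toy long-format data: one row per person per class probability.
# Column names mirror the reprex above; the values are invented.
long_probs <- data.frame(
  id          = rep(1:3, each = 3),
  Class       = rep(c(3, 2, 1), each = 3),
  Class_prob  = rep(1:3, times = 3),
  Probability = c(0.1, 0.2, 0.7,
                  0.2, 0.7, 0.1,
                  0.8, 0.1, 0.1)
)

# Base R equivalent of spread(Class_prob, Probability):
# one row per id, one column per class probability.
wide_probs <- reshape(
  long_probs,
  idvar     = c("id", "Class"),
  timevar   = "Class_prob",
  v.names   = "Probability",
  direction = "wide",
  sep       = "_"
)

# reshape() names the new columns Probability_1, Probability_2, ...;
# rename them to match the Class_1, Class_2, ... columns above.
names(wide_probs) <- sub("^Probability_", "Class_", names(wide_probs))
rownames(wide_probs) <- NULL

print(wide_probs)
```

Any other columns that are constant within a person (for example the indicator variables) would also need to be listed in idvar, otherwise reshape() will warn or keep only the first value.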

cjvanlissa commented 5 years ago

See my pull request: get_data returns data in wide format when applied to an object of class tidyProfile (one element of a tidyLPA object), or when applied to a tidyLPA object of length one. It returns long format when applied to a tidyLPA object containing multiple tidyProfile analyses (because then the wide format does not make sense).

jrosen48 commented 5 years ago

Thanks sir!



jrosen48 commented 5 years ago

This was addressed in #122

benjaminwnelson commented 3 years ago

Can tidyLPA be used with repeated measures and if so, then should data be in long or wide format? Thanks!

cjvanlissa commented 3 years ago

@benjaminwnelson yes, it can, but you can't fit a longitudinal model to the data. The data should be in wide format; multilevel mixture models are not available.

benjaminwnelson commented 3 years ago

Thank you! Would you advise that after pulling out LPA groups, you could then join that factor variable with a long dataset in order to do MLM modeling with the LPA group membership?

cjvanlissa commented 3 years ago

These are substantive questions; I would recommend finding a statistical collaborator who can help out with this. But briefly: The function ?get_data does what you're asking, but this approach ignores classification uncertainty. If entropy is > .90, the resulting bias may be negligible. In other cases, you will need to conduct a 3-step analysis, which uses the classification probability matrix (see the literature).
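A minimal base R sketch of the workflow discussed here, using made-up data rather than tidyLPA output. The posterior probability matrix `post`, the long data frame `long_data`, and the helper `relative_entropy()` are all hypothetical names introduced for illustration. The entropy formula is the commonly used relative entropy, 1 - sum(-p * log(p)) / (N * log(K)), which is assumed here and not taken from the tidyLPA source.

```r
# Hypothetical posterior class probabilities: 3 people (rows), 3 classes (columns).
post <- matrix(
  c(0.990, 0.005, 0.005,
    0.005, 0.990, 0.005,
    0.005, 0.005, 0.990),
  ncol = 3, byrow = TRUE
)

# Relative entropy: values near 1 indicate clear class separation.
relative_entropy <- function(p) {
  n <- nrow(p)
  k <- ncol(p)
  terms <- ifelse(p > 0, -p * log(p), 0)  # treat 0 * log(0) as 0
  1 - sum(terms) / (n * log(k))
}
entropy <- relative_entropy(post)

# Hard class assignment: the most likely class per person.
# This step discards the classification uncertainty mentioned above.
classes <- data.frame(
  id    = 1:3,
  Class = factor(apply(post, 1, which.max))
)

# Hypothetical long-format repeated-measures data (two waves per person).
long_data <- data.frame(
  id    = rep(1:3, each = 2),
  wave  = rep(1:2, times = 3),
  score = c(2.1, 2.4, 3.0, 3.3, 1.2, 1.0)
)

# Attach class membership to every row of the long data.
merged <- merge(long_data, classes, by = "id")

print(entropy)
print(merged)
```

With real results, the class column would come from get_data() rather than from a hand-built matrix, and the caveat about entropy applies: when classes are not well separated, a 3-step approach is the safer route.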

benjaminwnelson commented 3 years ago

Yes, will definitely consult on this. Thank you!