AnieBee / LCVAR

Other
6 stars 2 forks source link

LCVARclust Example

Anja Franziska Ernst
a.f.ernst[at]rug.nl
December 12, 2019



This document illustrates how to fit a latent class vector-autoregressive model to a time series using the LCVARclust R function.


For technical details see:

Ernst, A. F., Albers, C. J., Jeronimus, B. F. & Timmerman, M. E. (2020). Inter-individual differences in multivariate time series: Latent class vector-autoregressive modelling. European Journal of Psychological Assessment.


Please report bugs to: a.f.ernst[at]rug.nl.




Load the required packages

require(fastDummies) # (v.1.1.0) for dummy_cols()
require(MASS) # (v.7.3-49) for ginv()
require(mvtnorm) # (v.1.0-6) for dmvnorm()

Source all functions that are required to run LCVARclust

source("Functions/calculateA.R")
source("Functions/calculateB.R")
source("Functions/calculateBandWZero.R")
source("Functions/calculateCoefficientsForRandoAndRational.R")
source("Functions/calculateFYZ.R")
source("Functions/calculatePosterior.R")
source("Functions/calculateTau.R")
source("Functions/calculateRatio.R")
source("Functions/calculateNPara.R")
source("Functions/calculateIC.R")
source("Functions/calculateW.R")
source("Functions/calculateU.R")
source("Functions/calculateSigma.R")
source("Functions/calculateLagList.R")
source("Functions/callCalculateCoefficientsForRandoAndRational.R")
source("Functions/callEMFuncs.R")
source("Functions/checkComponentsCollapsed.R")
source("Functions/checkConvergence.R")
source("Functions/checkOutliers.R")
source("Functions/checkSingularitySigma.R")
source("Functions/constraintsOnB.R")
source("Functions/checkLikelihoodsNA.R")
source("Functions/checkPosteriorsNA.R")
source("Functions/createOutputList.R")
source("Functions/createX.R")
source("Functions/determineLagOrder.R")
source("Functions/reorderLags.R")
source("Functions/InitFuncs.R")
source("Functions/EMInit.R")
source("Functions/EMFunc.R")
source("Functions/LCVARclust.R")

Load the data

As an example we will analyze a generated data set "Dataset". Make sure no missing values are included in your dataframe, the LCVARclust function does not allow any missing values.

load("ExampleData/ExampleData.RData")
head(Dataset)
##         Item1    Item2     Item3     Item4 Person Timepoint
## [1,] 2.869908 3.755102  6.026770  6.868950      1         1
## [2,] 3.143062 5.199664  6.533799  9.052157      1         2
## [3,] 1.289307 3.893883  1.542538  4.086377      1         3
## [4,] 2.782861 6.461169  9.160752 13.078920      1         4
## [5,] 4.091992 5.336253  7.747059 10.067667      1         5
## [6,] 5.913368 5.124228 10.431553  9.968930      1         6
##      ContiniousVariable CategoricalVariableTimeOfDay
## [1,]          16.049110                            1
## [2,]           8.086699                            2
## [3,]          -1.524683                            3
## [4,]          40.518504                            1
## [5,]          24.089006                            2
## [6,]          22.051702                            3

Run the LCVARclust function

The following arguments can be specified:

Optional arguments:

For every fixed number of clusters as specified in Clusters, each combination of lag orders between LowestLag and HighestLag corresponds to a different statistical model. The models associated with all possible combinations of lag orders are estimated. For each model, the algorithm uses several EM-starts based on: pseudo-random initializations (Rand), a k-means based rational initialization (Rational), a guess at cluster membership (Initialization), and the use of a previous solution (always used). Every EM-start leads to one solution after several EM-iterations. Solutions are either reached because the likelihood converged (Conv) or because the maximum number of EM-iterations has been reached (it). Thus several solutions are reached for every possible statistical model. The ideal statistical model for a given number of clusters across all EM-starts and all lag combinations is determined with the information criterion specified in ICType. As a result, for every number of clusters specified in Clusters there will be one solution displayed in Result$BestSolutionsPerCluster.

Result = LCVARclust(Data = Dataset, yVars = 1:4, Time = 6, ID = 5, 
                    Covariates = "equal-within-clusters",
                    Clusters = 2, LowestLag = 1, HighestLag = 2, smallestClN = 3,
                    ICType = "HQ", seme = 3, Rand = 2, Rational = TRUE, 
                    SigmaIncrease = 10, it = 25, Conv = 1e-06, xContinuous = 7, xFactor = 8)
## 
##  2 Clusters: 
##   Lags: 2 2 
## * 1   * 2   * Rational   * Previous   
##   Lags: 1 2 
## * 1   
##  Warning: A single/empty cluster occured in EM-iteration 1 , memberships and Sigma reset 
## 
##  Warning: A single/empty cluster occured in EM-iteration 2 , memberships and Sigma reset 
## 
##  Warning: A single/empty cluster occured in EM-iteration 3 , memberships and Sigma reset 
## 
##  Warning: A single/empty cluster occured in EM-iteration 4 , memberships and Sigma reset 
## 
##  Warning: A single/empty cluster occured in EM-iteration 5 , memberships and Sigma reset 
## 
##  EM did not converge: Iteration terminated after reset stagnation 
## * 2   * Rational   * Previous   
##   Lags: 1 1 
## * 1   * 2   * Rational   * Previous

Interpret the output

Result contains two lists: BestSolutionsPerCluster which contains the ideal solution for each number of clusters, and AllSolutions which contains the solutions for all starts of all estimated models. The two lists are structured as follows:

The output for every solution contains:

Result$BestSolutionsPerCluster
## [[1]]
## [[1]]$Converged
## [1] TRUE
## 
## [[1]]$A
## , , 1
## 
##              Item1      Item2       Item3       Item4
## Item1  0.009113468  0.2712338 0.004804005 0.002756114
## Item2 -0.413364606  0.3943745 0.262143989 0.298928692
## Item3  0.600145815 -0.3102121 0.004383167 0.279625558
## Item4  0.311616188 -0.4046344 0.394373760 0.551653570
## 
## , , 2
## 
##             Item1      Item2     Item3     Item4
## Item1  0.08479586 -0.1033597 0.1068198 0.2466591
## Item2 -0.21764302  0.1776732 0.1040631 0.1632604
## Item3  0.29650512 -0.3046279 0.1888623 0.1450859
## Item4  0.40965841 -0.4093675 0.1878278 0.4930011
## 
## 
## [[1]]$B
## , , 1
## 
##          Intercept CategoricalVariableTimeOfDay_2
## Item1  0.017391963                       1.981168
## Item2  0.019302515                       2.456360
## Item3 -0.037663140                       3.014690
## Item4 -0.003983069                       3.497786
##       CategoricalVariableTimeOfDay_3 ContiniousVariable
## Item1                       2.967675         0.09994573
## Item2                       3.459679         0.20031637
## Item3                       4.032440         0.30097336
## Item4                       4.509843         0.39992009
## 
## , , 2
## 
##           Intercept CategoricalVariableTimeOfDay_2
## Item1 -0.0007224385                       2.001126
## Item2  0.0114308999                       2.481942
## Item3  0.0306677262                       2.991547
## Item4  0.0320145563                       3.482157
##       CategoricalVariableTimeOfDay_3 ContiniousVariable
## Item1                       3.030288          0.1997114
## Item2                       3.487059          0.3999603
## Item3                       3.963378          0.5994318
## Item4                       4.489226          0.7992137
## 
## 
## [[1]]$EMRepetitions
## [1] 6
## 
## [[1]]$last.loglik
## [1] -111428.9
## 
## [[1]]$nPara
## [1] 85
## 
## [[1]]$Sigma
## , , 1
## 
##           [,1]      [,2]      [,3]      [,4]
## [1,] 1.5043270 0.5095816 0.5275085 0.4806189
## [2,] 0.5095816 1.5078906 0.5317408 0.4927726
## [3,] 0.5275085 0.5317408 1.5341107 0.5158119
## [4,] 0.4806189 0.4927726 0.5158119 1.5036595
## 
## , , 2
## 
##           [,1]      [,2]      [,3]      [,4]
## [1,] 1.4764274 0.4891055 0.4956130 0.4745496
## [2,] 0.4891055 1.4769201 0.4882918 0.4775952
## [3,] 0.4956130 0.4882918 1.4914210 0.4712006
## [4,] 0.4745496 0.4775952 0.4712006 1.5006374
## 
## 
## [[1]]$LogLikelihood
##  [1] -141655.5 -133377.7 -118022.7 -111464.3 -111428.9 -111428.9        NA
##  [8]        NA        NA        NA        NA        NA        NA        NA
## [15]        NA        NA        NA        NA        NA        NA        NA
## [22]        NA        NA        NA        NA
## 
## [[1]]$EMiterationReset
##  [1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [12] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [23] FALSE FALSE FALSE
## 
## [[1]]$PosteriorProbs
##        [,1] [,2]
##   [1,]    1    0
##   [2,]    1    0
##   [3,]    1    0
##   [4,]    0    1
##   [5,]    0    1
##   [6,]    0    1
##   [7,]    1    0
##   [8,]    1    0
##   [9,]    1    0
##  [10,]    1    0
##  [11,]    0    1
##  [12,]    1    0
##  [13,]    0    1
##  [14,]    1    0
##  [15,]    1    0
##  [16,]    0    1
##  [17,]    1    0
##  [18,]    1    0
##  [19,]    0    1
##  [20,]    0    1
##  [21,]    0    1
##  [22,]    1    0
##  [23,]    0    1
##  [24,]    1    0
##  [25,]    1    0
##  [26,]    1    0
##  [27,]    0    1
##  [28,]    1    0
##  [29,]    0    1
##  [30,]    0    1
##  [31,]    1    0
##  [32,]    0    1
##  [33,]    1    0
##  [34,]    0    1
##  [35,]    0    1
##  [36,]    1    0
##  [37,]    1    0
##  [38,]    0    1
##  [39,]    0    1
##  [40,]    0    1
##  [41,]    1    0
##  [42,]    0    1
##  [43,]    1    0
##  [44,]    0    1
##  [45,]    1    0
##  [46,]    1    0
##  [47,]    0    1
##  [48,]    0    1
##  [49,]    1    0
##  [50,]    1    0
##  [51,]    1    0
##  [52,]    1    0
##  [53,]    0    1
##  [54,]    1    0
##  [55,]    1    0
##  [56,]    0    1
##  [57,]    0    1
##  [58,]    1    0
##  [59,]    0    1
##  [60,]    0    1
##  [61,]    0    1
##  [62,]    1    0
##  [63,]    0    1
##  [64,]    1    0
##  [65,]    1    0
##  [66,]    0    1
##  [67,]    0    1
##  [68,]    1    0
##  [69,]    1    0
##  [70,]    1    0
##  [71,]    1    0
##  [72,]    0    1
##  [73,]    1    0
##  [74,]    1    0
##  [75,]    1    0
##  [76,]    1    0
##  [77,]    1    0
##  [78,]    1    0
##  [79,]    0    1
##  [80,]    1    0
##  [81,]    0    1
##  [82,]    1    0
##  [83,]    0    1
##  [84,]    0    1
##  [85,]    1    0
##  [86,]    0    1
##  [87,]    0    1
##  [88,]    0    1
##  [89,]    0    1
##  [90,]    1    0
##  [91,]    0    1
##  [92,]    1    0
##  [93,]    1    0
##  [94,]    0    1
##  [95,]    0    1
##  [96,]    0    1
##  [97,]    1    0
##  [98,]    0    1
##  [99,]    0    1
## [100,]    1    0
## [101,]    1    0
## [102,]    0    1
## [103,]    0    1
## [104,]    0    1
## [105,]    1    0
## [106,]    0    1
## [107,]    1    0
## [108,]    0    1
## [109,]    0    1
## [110,]    0    1
## [111,]    0    1
## [112,]    0    1
## [113,]    1    0
## [114,]    0    1
## [115,]    1    0
## [116,]    0    1
## [117,]    1    0
## [118,]    0    1
## [119,]    1    0
## [120,]    0    1
## 
## [[1]]$Lags
## [1] 1 1
## 
## [[1]]$Classification
##   [1] 1 1 1 2 2 2 1 1 1 1 2 1 2 1 1 2 1 1 2 2 2 1 2 1 1 1 2 1 2 2 1 2 1 2 2
##  [36] 1 1 2 2 2 1 2 1 2 1 1 2 2 1 1 1 1 2 1 1 2 2 1 2 2 2 1 2 1 1 2 2 1 1 1
##  [71] 1 2 1 1 1 1 1 1 2 1 2 1 2 2 1 2 2 2 2 1 2 1 1 2 2 2 1 2 2 1 1 2 2 2 1
## [106] 2 1 2 2 2 2 2 1 2 1 2 1 2 1 2
## 
## [[1]]$IC
## [1] 1.111175
## 
## [[1]]$SC
## [1] 1.135838
## 
## [[1]]$Proportions
##      .data_1 .data_2
## [1,]     0.5     0.5
Result$AllSolutions[[1]][[2]][[4]]
## $Converged
## [1] TRUE
## 
## $A
## , , 1
## 
##            Item1      Item2        Item3       Item4        Item1
## Item1  0.0106515  0.2671153  0.012149280 0.004481859 -0.018124224
## Item2 -0.4146793  0.3869261  0.264064519 0.300979921 -0.005133318
## Item3  0.6034681 -0.2958824 -0.006513921 0.280847617  0.024730087
## Item4  0.3108251 -0.3988043  0.392106037 0.554954640  0.007354099
##              Item2        Item3        Item4
## Item1  0.005448890  0.008323587 -0.008064031
## Item2  0.011581148 -0.004005090  0.004490731
## Item3 -0.018431289 -0.021319615  0.004455396
## Item4 -0.001376256  0.008704764 -0.015556552
## 
## , , 2
## 
##             Item1      Item2     Item3     Item4 Item1 Item2 Item3 Item4
## Item1  0.08479586 -0.1033597 0.1068198 0.2466591    NA    NA    NA    NA
## Item2 -0.21764302  0.1776732 0.1040631 0.1632604    NA    NA    NA    NA
## Item3  0.29650512 -0.3046279 0.1888623 0.1450859    NA    NA    NA    NA
## Item4  0.40965841 -0.4093675 0.1878278 0.4930011    NA    NA    NA    NA
## 
## 
## $B
## , , 1
## 
##          Intercept CategoricalVariableTimeOfDay_2
## Item1  0.015228873                       1.982025
## Item2  0.018958818                       2.452789
## Item3 -0.036192189                       3.013027
## Item4 -0.002321288                       3.494736
##       CategoricalVariableTimeOfDay_3 ContiniousVariable
## Item1                       2.967232          0.1000219
## Item2                       3.457505          0.2003108
## Item3                       4.033086          0.3009098
## Item4                       4.508692          0.3998918
## 
## , , 2
## 
##           Intercept CategoricalVariableTimeOfDay_2
## Item1 -0.0007224385                       2.001126
## Item2  0.0114308999                       2.481942
## Item3  0.0306677262                       2.991547
## Item4  0.0320145563                       3.482157
##       CategoricalVariableTimeOfDay_3 ContiniousVariable
## Item1                       3.030288          0.1997114
## Item2                       3.487059          0.3999603
## Item3                       3.963378          0.5994318
## Item4                       4.489226          0.7992137
## 
## 
## $EMRepetitions
## [1] 2
## 
## $last.loglik
## [1] -111067.5
## 
## $nPara
## [1] 101
## 
## $Sigma
## , , 1
## 
##           [,1]      [,2]      [,3]      [,4]
## [1,] 1.5055963 0.5099283 0.5263043 0.4817252
## [2,] 0.5099283 1.5101683 0.5321159 0.4944527
## [3,] 0.5263043 0.5321159 1.5338779 0.5166816
## [4,] 0.4817252 0.4944527 0.5166816 1.5049912
## 
## , , 2
## 
##           [,1]      [,2]      [,3]      [,4]
## [1,] 1.4764274 0.4891055 0.4956130 0.4745496
## [2,] 0.4891055 1.4769201 0.4882918 0.4775952
## [3,] 0.4956130 0.4882918 1.4914210 0.4712006
## [4,] 0.4745496 0.4775952 0.4712006 1.5006374
## 
## 
## $LogLikelihood
##  [1] -111067.5 -111067.5        NA        NA        NA        NA        NA
##  [8]        NA        NA        NA        NA        NA        NA        NA
## [15]        NA        NA        NA        NA        NA        NA        NA
## [22]        NA        NA        NA        NA
## 
## $EMiterationReset
##  [1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [12] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [23] FALSE FALSE FALSE
## 
## $PosteriorProbs
##        [,1] [,2]
##   [1,]    1    0
##   [2,]    1    0
##   [3,]    1    0
##   [4,]    0    1
##   [5,]    0    1
##   [6,]    0    1
##   [7,]    1    0
##   [8,]    1    0
##   [9,]    1    0
##  [10,]    1    0
##  [11,]    0    1
##  [12,]    1    0
##  [13,]    0    1
##  [14,]    1    0
##  [15,]    1    0
##  [16,]    0    1
##  [17,]    1    0
##  [18,]    1    0
##  [19,]    0    1
##  [20,]    0    1
##  [21,]    0    1
##  [22,]    1    0
##  [23,]    0    1
##  [24,]    1    0
##  [25,]    1    0
##  [26,]    1    0
##  [27,]    0    1
##  [28,]    1    0
##  [29,]    0    1
##  [30,]    0    1
##  [31,]    1    0
##  [32,]    0    1
##  [33,]    1    0
##  [34,]    0    1
##  [35,]    0    1
##  [36,]    1    0
##  [37,]    1    0
##  [38,]    0    1
##  [39,]    0    1
##  [40,]    0    1
##  [41,]    1    0
##  [42,]    0    1
##  [43,]    1    0
##  [44,]    0    1
##  [45,]    1    0
##  [46,]    1    0
##  [47,]    0    1
##  [48,]    0    1
##  [49,]    1    0
##  [50,]    1    0
##  [51,]    1    0
##  [52,]    1    0
##  [53,]    0    1
##  [54,]    1    0
##  [55,]    1    0
##  [56,]    0    1
##  [57,]    0    1
##  [58,]    1    0
##  [59,]    0    1
##  [60,]    0    1
##  [61,]    0    1
##  [62,]    1    0
##  [63,]    0    1
##  [64,]    1    0
##  [65,]    1    0
##  [66,]    0    1
##  [67,]    0    1
##  [68,]    1    0
##  [69,]    1    0
##  [70,]    1    0
##  [71,]    1    0
##  [72,]    0    1
##  [73,]    1    0
##  [74,]    1    0
##  [75,]    1    0
##  [76,]    1    0
##  [77,]    1    0
##  [78,]    1    0
##  [79,]    0    1
##  [80,]    1    0
##  [81,]    0    1
##  [82,]    1    0
##  [83,]    0    1
##  [84,]    0    1
##  [85,]    1    0
##  [86,]    0    1
##  [87,]    0    1
##  [88,]    0    1
##  [89,]    0    1
##  [90,]    1    0
##  [91,]    0    1
##  [92,]    1    0
##  [93,]    1    0
##  [94,]    0    1
##  [95,]    0    1
##  [96,]    0    1
##  [97,]    1    0
##  [98,]    0    1
##  [99,]    0    1
## [100,]    1    0
## [101,]    1    0
## [102,]    0    1
## [103,]    0    1
## [104,]    0    1
## [105,]    1    0
## [106,]    0    1
## [107,]    1    0
## [108,]    0    1
## [109,]    0    1
## [110,]    0    1
## [111,]    0    1
## [112,]    0    1
## [113,]    1    0
## [114,]    0    1
## [115,]    1    0
## [116,]    0    1
## [117,]    1    0
## [118,]    0    1
## [119,]    1    0
## [120,]    0    1
## 
## $Lags
## [1] 2 1
## 
## $Classification
##   [1] 1 1 1 2 2 2 1 1 1 1 2 1 2 1 1 2 1 1 2 2 2 1 2 1 1 1 2 1 2 2 1 2 1 2 2
##  [36] 1 1 2 2 2 1 2 1 2 1 1 2 2 1 1 1 1 2 1 1 2 2 1 2 2 2 1 2 1 1 2 2 1 1 1
##  [71] 1 2 1 1 1 1 1 1 2 1 2 1 2 2 1 2 2 2 2 1 2 1 1 2 2 2 1 2 2 1 1 2 2 2 1
## [106] 2 1 2 2 2 2 2 1 2 1 2 1 2 1 2
## 
## $IC
## [1] 1.116614
## 
## $SC
## [1] 1.153753
## 
## $Proportions
##      .data_1 .data_2
## [1,]     0.5     0.5