ucl-pond / pySuStaIn

Subtype and Stage Inference (SuStaIn) algorithm with an example using simulated data.
MIT License
130 stars 63 forks

selecting the best number of subtypes #58

Closed xullllllll closed 1 month ago

xullllllll commented 3 months ago

Hi pySuStaIn team, I have a question about the two metrics used to select the best number of subtypes: the average test set log-likelihood for each subtype model, and the CVIC. They are calculated differently, and they sometimes select different numbers of subtypes. For example, my cross-validation results are:

Average test set log-likelihood for each subtype model: [-4803.6551647, -4799.39361827, -4802.32005004, -4802.00764201, -4803.9085804]

CVIC for each subtype model: [48239.54786736, 48206.06011098, 48231.46430435, 47821.04024647, 47840.63188379]

In a case like this, which metric is more reasonable to use?

Thank you for your reply!
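
To make the disagreement concrete, here is a minimal sketch that picks the best model under each metric from the numbers above. It assumes the arrays are ordered from 1 to 5 subtypes, and that a higher test set log-likelihood and a lower CVIC are preferred; the variable names are illustrative and not part of pySuStaIn.

```python
import numpy as np

# Values copied from the cross-validation output quoted above.
# Assumption: index 0 corresponds to the 1-subtype model, index 4 to 5 subtypes.
avg_test_loglik = np.array([
    -4803.6551647, -4799.39361827, -4802.32005004, -4802.00764201, -4803.9085804,
])
cvic = np.array([
    48239.54786736, 48206.06011098, 48231.46430435, 47821.04024647, 47840.63188379,
])

# Assumption: higher average test set log-likelihood is better.
best_by_loglik = int(np.argmax(avg_test_loglik)) + 1

# Assumption: lower CVIC is better.
best_by_cvic = int(np.argmin(cvic)) + 1

print("Best number of subtypes by average test set log-likelihood:", best_by_loglik)
print("Best number of subtypes by CVIC:", best_by_cvic)
```

Under those assumptions the log-likelihood favours 2 subtypes while the CVIC favours 4, which is the discrepancy I am asking about.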