Different cur.type for metric and ordinal variables

brandmaier / semtree

Recursive Partitioning for Structural Equation Models

https://brandmaier.github.io/semtree/

GNU General Public License v3.0


Different cur.type for metric and ordinal variables #20

Open manuelarnold opened 3 years ago

manuelarnold commented 3 years ago

Currently, cur.type is 1 for categorical variables and 2 for both metric and ordinal variables. Since the distinction between ordinal and metric variables matters for both the maxLR test statistics and the score-based tests, it would make sense to give the two types different cur.type values:

1: categorical
2: ordinal
3: metric

brandmaier commented 2 years ago

This is work-in-progress now.

brandmaier commented 2 years ago

Please see https://github.com/brandmaier/semtree/commit/6e01466884c6dc5b0a5be16caaa6d645e3d09a02 and later versions. We now have pseudo-constants that can be returned to define the scale of measurement. Please return the respective types from the score tests back to growTree(). The constants are defined in semtree-package.R as:

.SCALE_METRIC = 2
.SCALE_ORDINAL = 3
.SCALE_CATEGORICAL = 1
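
For illustration, here is a minimal sketch of how a covariate could be mapped to one of these constants in base R. The helper scale_of() and the example data frame are hypothetical and not part of semtree; only the three constant values are taken from the comment above.

```r
# Constants as quoted above (defined in semtree-package.R).
.SCALE_CATEGORICAL <- 1
.SCALE_METRIC <- 2
.SCALE_ORDINAL <- 3

# Hypothetical helper (not a semtree function): classify one covariate.
# is.ordered() must be checked before is.factor(), because an ordered
# factor is also a factor.
scale_of <- function(x) {
  if (is.ordered(x)) {
    .SCALE_ORDINAL
  } else if (is.factor(x)) {
    .SCALE_CATEGORICAL
  } else if (is.numeric(x)) {
    .SCALE_METRIC
  } else {
    stop("unsupported covariate type: ", class(x)[1])
  }
}

# Made-up example covariates, one per scale of measurement.
covariates <- data.frame(
  group = factor(c("a", "b", "c", "a")),
  grade = factor(c("low", "mid", "high", "mid"),
                 levels = c("low", "mid", "high"), ordered = TRUE),
  age   = c(23, 35, 41, 29)
)

# Named numeric vector with one scale constant per column.
scales <- vapply(covariates, scale_of, numeric(1))
print(scales)
```

Note that with these constants the ordinal scale is 3 and the metric scale is 2, so any code that compares against literal numbers rather than the constants would need to be checked.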

brandmaier commented 2 years ago

semtree now properly handles unordered and ordered factors, but these changes broke the score tests for ordinal variables. I identified one possible problem in your code (https://github.com/brandmaier/semtree/commit/2d813e86085560be3ab4427c6f3c711d552e9274), but the score test still fails. Let me know what you need to know to fix this, @manuelarnold.

brandmaier commented 2 years ago

I tried to fix the issue in https://github.com/brandmaier/semtree/commit/d7b1247f02ba63f79bbfddca9f05bc8abcfda62b. I hope this is all that is needed. Please confirm.

brandmaier commented 1 year ago

@manuelarnold, could you please confirm that this is OK and then close the issue?

manuelarnold commented 1 year ago

There are some new changes related to this topic that we could discuss here: in my fork, I also distinguish between dummy variables (categorical variables with two levels) and multinomial variables (categorical variables with more than two levels). So I would be in favor of splitting .SCALE_CATEGORICAL into .SCALE_MULTINOMIAL and .SCALE_DUMMY. By the way, testing of multinomial variables is now fully score-based and should be faster than the testing in the main branch.
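
A rough sketch of what that split could look like follows. The numeric values of .SCALE_DUMMY and .SCALE_MULTINOMIAL are assumptions (the thread does not say which values the fork uses), and categorical_scale() is a hypothetical helper, not a function from semtree or the fork. Counting only observed levels via droplevels() is also an assumption.

```r
# Assumed values for illustration only; the fork may use different numbers.
.SCALE_DUMMY <- 1
.SCALE_MULTINOMIAL <- 4

# Hypothetical helper: decide between dummy and multinomial for an
# unordered factor, based on the number of levels actually observed.
categorical_scale <- function(x) {
  stopifnot(is.factor(x), !is.ordered(x))
  n_levels <- nlevels(droplevels(x))
  if (n_levels == 2) {
    .SCALE_DUMMY
  } else if (n_levels > 2) {
    .SCALE_MULTINOMIAL
  } else {
    stop("a categorical covariate needs at least two observed levels")
  }
}

two_groups   <- factor(c("a", "b", "a", "b"))
three_groups <- factor(c("a", "b", "c", "a"))

print(categorical_scale(two_groups))    # dummy
print(categorical_scale(three_groups))  # multinomial
```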

brandmaier commented 1 year ago

@manuelarnold, how should we proceed with these changes? Would you like to prepare a pull request so that I can review your proposed changes?

manuelarnold commented 1 year ago

I think these changes are already in the main branch. I will try to resolve some conflicts over the next few weeks, and then we can start syncing the branches.