bips-hb / micd

Multiple Imputation in Causal Graph Discovery
GNU General Public License v3.0

MICD and Ordinal Data #6

Closed by MichielBosma 2 years ago

MichielBosma commented 2 years ago

Hello,

I am trying to perform causal discovery using your package, but I am struggling with the following question. My data come from a large survey and are therefore ordinal rather than continuous. The order matters a lot, since subjects answer on a ranked scale. My plan was to perform multiple imputation first and then causal discovery.

When I first tried this with mixMItest after MI, with the data coded as numeric, the algorithm was able to retrieve a detailed graph from the data. However, when I make the variables ordered factors, the algorithm struggles to find anything. I see in the details of the function that it relies on mixCItest, whose documentation states that it can handle mixed (continuous and unordered categorical) variables. It therefore seems that the algorithm does not see the order in the factor data, and that this is why the conditional independence tests fail to detect the dependencies.

So I am wondering whether there is a way to run the algorithm specifically on ordered discrete data. Or is the best option to treat the variables as numeric, since the order is then preserved?

Thank you!

Michiel

jawitte commented 2 years ago

Hi Michiel,

You are right, the order of ordered factors is not taken into account by the algorithm. It is assumed that the variables coded as factors are unordered categorical, and the variables coded as numeric follow a joint normal distribution within each combination of levels of the factor variables. I agree that coding the variables as numeric and using either mixMItest or gaussMItest (they should yield very similar results in this case) is probably preferable.
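
To make the numeric-coding suggestion concrete, here is a minimal sketch of mapping ordered labels to integer scores so that the rank order survives. It is written in Python purely as an illustration; the level names and the `to_scores` helper are hypothetical and not part of micd. In R, calling `as.integer()` on an ordered factor gives the analogous level positions.

```python
# Hypothetical 5-point survey scale, listed from lowest to highest rank.
LEVELS = ["strongly disagree", "disagree", "neutral", "agree", "strongly agree"]

# Position in LEVELS (starting at 1) becomes the numeric score.
SCORE = {level: rank for rank, level in enumerate(LEVELS, start=1)}


def to_scores(responses):
    """Map ordered labels to integer ranks; missing answers (None) stay None."""
    return [None if r is None else SCORE[r] for r in responses]


answers = ["agree", None, "neutral", "strongly agree", "disagree"]
scores = to_scores(answers)
print(scores)  # [4, None, 3, 5, 2]
```

Note that integer scores treat adjacent levels as equally spaced, which is an assumption about the scale rather than something the data guarantee.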

I am not aware of any alternative package for handling multiply imputed data in causal discovery. However, if you think that test-wise deletion of incomplete rows might be a suitable approach for your data, you might want to give the bnlearn package a try, which has a conditional independence test for ordered categorical variables, the Jonckheere-Terpstra test.
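
For intuition about how a test like Jonckheere-Terpstra uses the ordering, here is a rough sketch of the marginal (unconditional) trend statistic with a Monte Carlo permutation p-value. It is an illustration in Python under simplifying assumptions, not bnlearn's implementation, which tests conditional independence; the function names and the toy data are hypothetical.

```python
import random


def jt_statistic(groups):
    """Jonckheere-Terpstra statistic for groups listed from lowest to highest level.

    For every pair of groups (lower, higher), each pair of observations
    counts 1 if the higher-group value is larger and 0.5 if the two tie.
    """
    total = 0.0
    for i in range(len(groups)):
        for j in range(i + 1, len(groups)):
            for x in groups[i]:
                for y in groups[j]:
                    if y > x:
                        total += 1.0
                    elif y == x:
                        total += 0.5
    return total


def jt_permutation_pvalue(groups, n_perm=2000, seed=0):
    """Two-sided permutation p-value: shuffle values across groups of fixed size."""
    rng = random.Random(seed)
    sizes = [len(g) for g in groups]
    pooled = [v for g in groups for v in g]
    n = len(pooled)
    # Under the null, each cross-group pair contributes 1/2 on average.
    expected = (n * n - sum(s * s for s in sizes)) / 4.0
    observed = abs(jt_statistic(groups) - expected)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        shuffled, start = [], 0
        for s in sizes:
            shuffled.append(pooled[start:start + s])
            start += s
        if abs(jt_statistic(shuffled) - expected) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Toy data: scores of an ordinal item Y, grouped by the level (1, 2, 3) of item X.
groups = [[1, 1, 2, 2, 3], [2, 2, 3, 3, 4], [3, 4, 4, 5, 5]]
statistic = jt_statistic(groups)
p_value = jt_permutation_pvalue(groups)
print(statistic, p_value)
```

A permutation p-value is used here because ordinal survey data contain many ties, which the textbook no-ties variance formula does not account for.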

Best wishes, Janine

MichielBosma commented 2 years ago

Hi Janine,

Thank you for your response! I think I will then use the data as numeric, since that comes closer to the original meaning of the data. I am not aware of any other package for this either. I don't think test-wise deletion is a suitable approach for my data, since there is a clear missingness pattern, which I believe is best handled through multiple imputation.

Thank you again, Michiel