Restrict possible causes

Some network inference algorithms and packages let you restrict the set of possible causes. This lowers the chance of spurious causes and reduces both time and space complexity by shrinking the search space. One example is the RTN package, based on the ARACNe algorithm. The restriction is common for gene regulatory networks, where one assumes, for example, that only genes coding for transcription factors can cause an increase or decrease in the expression of other genes.
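To give a rough sense of the reduction, here is a back-of-the-envelope count. It assumes the restriction means that a candidate edge needs at least one endpoint that can be a cause; the values of $n$ and $k$ are hypothetical and chosen only for illustration. With $n$ variables, of which $k$ can be causes, the number of candidate pairs drops from $\binom{n}{2}$ to

$$
\binom{n}{2} - \binom{n-k}{2} = k(n-k) + \binom{k}{2}.
$$

For $n = 1000$ and $k = 50$, that is $50 \cdot 950 + 1225 = 48725$ pairs instead of $499500$, roughly a tenth of the unrestricted pairwise search space.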

A possible implementation would be an extra column in the category order file stating, for each variable, whether it can be a cause of others. Both the skeleton and the orientation learning steps should then take this into account. For example, there shouldn't be an edge between two non-transcription factors; if such an edge existed, it could be due to a latent confounder or a spurious separation set. Likewise, a v-structure $A \rightarrow B \leftarrow C$ shouldn't be possible if either A or C is not a transcription factor. Restricting possible causes thus leads to fewer spurious edges, fewer spurious orientations, and fewer edges in general.
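The rules above can be sketched in a few lines. This is only an illustration in Python, not miic code: the `can_be_cause` mapping stands in for the proposed extra column, and all function and variable names are hypothetical. The orientation helper also assumes the retained edge is a direct causal link, which ignores the latent confounder case.

```python
from itertools import combinations

def candidate_edges(variables, can_be_cause):
    """Skeleton step: keep only pairs where at least one endpoint may be a cause."""
    return [
        (x, y)
        for x, y in combinations(variables, 2)
        if can_be_cause[x] or can_be_cause[y]
    ]

def forced_orientation(x, y, can_be_cause):
    """Orientation step: return (cause, effect) if exactly one endpoint
    may be a cause, otherwise None (the constraint does not decide)."""
    if can_be_cause[x] and not can_be_cause[y]:
        return (x, y)
    if can_be_cause[y] and not can_be_cause[x]:
        return (y, x)
    return None

def v_structure_allowed(a, b, c, can_be_cause):
    """A -> B <- C requires both parents A and C to be possible causes.
    The collider b itself is unconstrained."""
    return can_be_cause[a] and can_be_cause[c]

# Hypothetical toy input: two transcription factors and two target genes.
can_be_cause = {"TF1": True, "TF2": True, "G1": False, "G2": False}
variables = list(can_be_cause)

edges = candidate_edges(variables, can_be_cause)
```

In this toy example the pair `("G1", "G2")` is never considered, an edge between `TF1` and `G1` can only point from `TF1` to `G1`, and a v-structure with `G1` or `G2` as a parent is rejected.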

miicTeam / miic_R_package