Find the complete documentation at https://etia.readthedocs.io/en/latest/index.html
ETIA (Αιτία (pronounced etía): "cause" in Greek) is a cutting-edge automated causal discovery library that takes causal analysis beyond traditional methods. It is designed to tackle complex, real-world problems by automating the entire causal discovery process, offering a combination of feature selection, causal structure learning, and causal reasoning validation that is unmatched in other libraries.
Unlike existing libraries, ETIA does not simply offer isolated algorithms: it provides a fully automated pipeline that optimizes and customizes each step of the causal discovery process, ensuring the results are robust, interpretable, and reliable for both researchers and industrial practitioners. The pipeline consists of three modules: Automated Feature Selection (AFS), Causal Learning (CL), and Causal Reasoning Validation (CRV).
AFS goes beyond standard feature selection by targeting the Markov Boundary of the outcome of interest. This approach ensures that you work only with the most causally relevant variables, preventing the noise and redundancy that plague other methods.
| Algorithm | Description | Data Type |
|---|---|---|
| FBED | Forward-Backward selection with Early Dropping | Mixed |
| SES | Statistically Equivalent Signatures selection | Mixed |
Why It Matters: In high-dimensional datasets, choosing the right features is critical. AFS uses state-of-the-art techniques to find causally relevant features, ensuring that the subsequent causal analysis is accurate and manageable, even in datasets with hundreds of variables.
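For intuition, the Markov Boundary of a target in a causal graph consists of its parents, its children, and the other parents of its children (its "spouses"). The following minimal sketch computes it on a hypothetical toy DAG; this is a standalone illustration of the concept, not ETIA's API:

```python
# Toy DAG encoded as parent lists: "C": ["T", "B"] means T -> C and B -> C.
# The graph and variable names are hypothetical.
parents = {
    "A": [], "B": [], "T": ["A"], "C": ["T", "B"], "D": [],
}

def markov_boundary(graph, target):
    """Return parents, children, and spouses (co-parents of children) of target."""
    pa = set(graph[target])
    ch = {v for ps_of_v in (graph,) for v, ps in graph.items() if target in ps}
    spouses = {p for c in ch for p in graph[c]} - {target}
    return pa | ch | spouses

print(sorted(markov_boundary(parents, "T")))  # ['A', 'B', 'C']
```

Here `A` enters as a parent of `T`, `C` as a child, and `B` as a co-parent of `C`; `D` is correctly excluded as causally irrelevant to `T`.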
The CL module identifies causal relationships using a variety of algorithms that are automatically optimized to your data. ETIA's causal tuning mechanism selects the best-fitting causal structure without the need for user intervention.

| Algorithm | Latent Variables Supported | Tests/Scores Used | Data Type |
|---|---|---|---|
| PC Algorithm | ✕ | FisherZ, CG LRT, DG LRT, Chi-square, G-square | Continuous, Mixed, Categorical |
| CPC | ✕ | FisherZ, CG LRT, DG LRT, Chi-square, G-square | Continuous, Mixed, Categorical |
| FGES | ✓ | SEM BIC Score, BDeu, Discrete BIC, CG BIC, DG BIC | Continuous, Mixed, Categorical |
| FCI | ✓ | FisherZ, CG LRT, DG LRT, Chi-square, G-square | Continuous, Mixed, Categorical |
| FCI-Max | ✓ | FisherZ, CG LRT, DG LRT, Chi-square, G-square | Continuous, Mixed, Categorical |
| RFCI | ✓ | FisherZ, CG LRT, DG LRT, Chi-square, G-square | Continuous, Mixed, Categorical |
| GFCI | ✓ | FisherZ, CG LRT, DG LRT, Chi-square, G-square | Continuous, Mixed, Categorical |
| CFCI | ✓ | FisherZ, CG LRT, DG LRT, Chi-square, G-square | Continuous, Mixed, Categorical |
| sVAR-FCI | ✓ | FisherZ, CG LRT, DG LRT, Chi-square, G-square | Continuous, Mixed, Categorical (Time Series) |
| svargFCI | ✓ | FisherZ, CG LRT, DG LRT, Chi-square, G-square, SEM BIC Score, BDeu, Discrete BIC, CG BIC, DG BIC | Continuous, Mixed, Categorical (Time Series) |
| PCMCI | ✕ | ParCor, RobustParCor, GPDC, CMIknn, ParCorrWLS, Gsquared, CMIsymb, RegressionCI | Continuous, Mixed, Categorical (Time Series) |
| PCMCI+ | ✕ | ParCor, RobustParCor, GPDC, CMIknn, ParCorrWLS, Gsquared, CMIsymb, RegressionCI | Continuous, Mixed, Categorical (Time Series) |
| LPCMCI | ✓ | ParCor, RobustParCor, GPDC, CMIknn, ParCorrWLS, Gsquared, CMIsymb, RegressionCI | Continuous, Mixed, Categorical (Time Series) |
| SAM | ✕ | Learning Rate (`lr`), Decay Learning Rate (`dlr`), Regularization (`lambda1`, `lambda2`), Hidden Neurons (`nh`, `dnh`), Training Epochs (`train_epochs`), Testing Epochs (`test_epochs`), Batch Size (`batch_size`), Loss Type (`losstype`) | Continuous, Mixed |
| NOTEARS | ✕ | Max Iterations (`max_iter`), Tolerance (`h_tol`), Threshold (`threshold`) | Continuous, Mixed, Categorical |
- **Latent Variables Supported**: ✓ means the algorithm accounts for potential latent (unmeasured) variables; ✕ means it assumes none are present.
- **Tests/Scores Used**: conditional-independence tests (`ci_test`) such as FisherZ, CG LRT, DG LRT, Chi-square, and G-square; scores (`score`) such as SEM BIC Score, BDeu, Discrete BIC, CG BIC, and DG BIC.
- **Data Type**: the kinds of data (continuous, mixed, categorical, time series) the algorithm supports.

Why It Matters: Traditional causal discovery libraries expect users to manually choose an algorithm. ETIA's automated pipeline selects the best algorithm for your dataset, saving time and reducing the risk of suboptimal results. Additionally, it can handle datasets with latent variables, which most other systems cannot.
CRV provides advanced tools to evaluate the discovered causal graph, offering confidence estimates and comprehensive visualizations. It can answer specific causal queries, making it an invaluable tool for decision-makers and researchers alike.
| Functionality | Description |
|---|---|
| Visualization | Visualize graphs and causal relations using Cytoscape. |
| Adjustment Sets | Identify adjustment sets needed for estimating causal effects. |
| Confidence Calculations | Assess confidence in discovered causal relationships through bootstrapping methods. |
| Causal Queries | Answer user-defined causal queries, including directed, bidirected, and potentially directed paths between variables. |
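A directed-path query of the kind CRV answers can be sketched with a simple graph search. The graph and variable names below are hypothetical, and this is a standalone illustration rather than ETIA's API:

```python
def has_directed_path(edges, src, dst):
    """Depth-first search for a directed path src -> ... -> dst."""
    stack, seen = [src], set()
    while stack:
        node = stack.pop()
        if node == dst:
            return True
        if node in seen:
            continue
        seen.add(node)
        stack.extend(v for u, v in edges if u == node)  # follow outgoing edges
    return False

# Hypothetical discovered graph: X -> Z -> Y, plus W -> Y
edges = [("X", "Z"), ("Z", "Y"), ("W", "Y")]
print(has_directed_path(edges, "X", "Y"))  # True
print(has_directed_path(edges, "Y", "X"))  # False
```

Bidirected and potentially directed paths follow the same pattern with different edge-type filters.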
Why It Matters: The ability to compute and visualize confidence in causal relationships sets ETIA apart from other libraries. Users can trust that the discovered causal graph is not just a hypothesis but a statistically backed structure with clearly defined confidence levels.
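The bootstrapped-confidence idea can be sketched as: resample the rows of the dataset, rerun discovery on each resample, and report how often each edge is recovered. Here `toy_discover` is a hypothetical stand-in for a real discovery call, not ETIA's API:

```python
import random
from collections import Counter

def toy_discover(rows):
    """Hypothetical stand-in: report edge A -> B when A and B mostly share sign."""
    agree = sum(1 for a, b in rows if (a > 0) == (b > 0))
    return [("A", "B")] if agree >= 0.7 * len(rows) else []

def edge_confidence(rows, n_boot=200, seed=0):
    """Fraction of bootstrap resamples in which each edge is discovered."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(n_boot):
        resample = [rng.choice(rows) for _ in rows]  # sample rows with replacement
        counts.update(toy_discover(resample))
    return {edge: c / n_boot for edge, c in counts.items()}

# Toy data where A and B share sign in 9 of 10 rows
rows = [(1, 1)] * 5 + [(-1, -1)] * 4 + [(1, -1)]
print(edge_confidence(rows))  # high confidence for edge ('A', 'B')
```

An edge that appears in nearly every resample is well supported by the data; an edge that appears in only a fraction of them should be treated as tentative.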
You can install ETIA directly from PyPI using pip:

```shell
pip install etia
```

Alternatively, clone the repository and install the dependencies:

```shell
git clone <repository-url>
cd etia
pip install -r requirements.txt
make all
```
Before installing ETIA, ensure that the required dependencies are available, in particular Java (needed for the Tetrad algorithms) and R (needed for certain feature selection algorithms). You can download and install these dependencies from their official websites.
For using Tetrad algorithms and certain feature selection algorithms, ensure that Java and R are correctly installed:

- Make sure the `JAVA_HOME` environment variable is set:

```shell
export JAVA_HOME=/path/to/java
```

- Verify your R installation:

```shell
R --version
```

- Install the required R packages (`MXM` and `dagitty`):

```shell
Rscript --vanilla -e 'install.packages("MXM", repos = "http://cran.us.r-project.org")'
Rscript --vanilla -e 'install.packages("dagitty", repos = "http://cran.us.r-project.org")'
```
Once installed, ETIA can be used by importing its modules. Here is a simple example of feature selection followed by causal discovery:

```python
from ETIA.AFS import AFS
from ETIA.CausalLearning import CausalLearner

# Feature selection: reduce the dataset to the causally relevant variables.
# dataset_file_path and targets are user-supplied: the path to your data
# and the outcome variable(s) of interest.
afs = AFS()
results = afs.select_features(dataset_file_path, targets)
reduced_dataset = results['reduced_data']

# Causal discovery on the reduced dataset
cl = CausalLearner(reduced_dataset)
results = cl.learn_model()
```
You can run the test suite using:

```shell
pytest tests/
```
Make sure that all dependencies, including Java and R, are correctly installed before running tests.
Contributions are welcome! To contribute:

1. Fork the repository.
2. Create a feature branch (`git checkout -b feature-branch`).
3. Commit your changes (`git commit -am 'Add new feature'`).
4. Push to the branch (`git push origin feature-branch`).
5. Open a pull request.

This project is licensed under the MIT License. See the LICENSE file for details.