PoliUniLu / cora

This repository contains all files for the CORA software package.
GNU General Public License v3.0

CORA


CORA is a Python library for Combinational Regularity Analysis (CORA).

Description

Combinational Regularity Analysis (CORA) is a member of the family of Configurational Comparative Methods (CCMs; Thiem et al. 2022). It is thus closely related to Qualitative Comparative Analysis (QCA; Ragin 1987) and Coincidence Analysis (CNA; Baumgartner 2009; Baumgartner and Ambühl 2020). Modern CCMs seek to detect INUS structures in data (Thiem 2017, 2022). Such structures are elaborate cause-effect relations that can be represented in the Boolean language of propositional logic (Baumgartner 2008; Mackie 1965; Psillos 2009). Most importantly, these relations are marked by causal conjunctivity (e.g., $a$ and not $b$ and $c$ and $\cdots$) and causal disjunctivity (e.g., $d$ or $e$ or $f$ or $\cdots$). For this reason, CCMs differ fundamentally from most other empirical research methods (Thiem et al. 2016).

In contrast to QCA and CNA, however, CORA has been inspired by switching circuit analysis, a subfield of electrical engineering. INUS structures and switching circuits have much in common because propositional logic - the language of INUS causation - and switching algebra - the language of switching circuit analysis - are operationally equivalent branches of the same underlying Boolean algebra (Lewin and Protheroe 1992). It is therefore no coincidence that one of the first systematic algorithms for Boolean optimization - the Quine-McCluskey algorithm (McCluskey 1956; Quine 1955) - was co-developed by an analytical philosopher (Willard Van Orman Quine) and an electrical engineer (Edward J. McCluskey).

Most importantly, CORA is currently the only CCM able to analyze INUS structures that simultaneously feature simple as well as complex effects (e.g., $y$ and not $z$, not $y$ and $z$, $y$ and $z$). CORA can process such structures even in multi-value form (Mkrtchyan et al. 2023). In addition, CORA offers a configurational version of Occam's Razor: a data-mining approach to solution building that reduces model ambiguities by keeping the number of required variables for finding a solution at a minimum. Lastly, CORA includes a lean yet powerful visualization module called LOGIGRAM, with which two-level logic diagrams can be produced from any (system of) Boolean or multi-value function(s) in disjunctive normal form. Logic diagrams considerably outperform Venn diagrams, which are often used in QCA, when it comes to the representation and interpretability of INUS structures (Thiem et al. 2023). As a method, CORA is implemented by the duo of Python packages CORA and LOGIGRAM (Sebechlebská et al. 2023).
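
To make the notion of an INUS structure more concrete, the following minimal sketch checks Mackie's definition (an insufficient but necessary part of an unnecessary but sufficient condition) by brute force over a truth table. The structure `(a and not b) or d` and all names in the code are hypothetical and chosen for illustration only; the sketch uses plain Python and not the CORA package.

```python
from itertools import product


def effect(a, b, d):
    # Hypothetical causal structure in disjunctive normal form:
    # (a AND NOT b) OR d
    return (a and not b) or d


# All eight combinations of truth values for a, b and d
rows = list(product([False, True], repeat=3))

# a alone is insufficient: there is a row where a holds but the effect does not
a_insufficient = any(a and not effect(a, b, d) for a, b, d in rows)

# The conjunction (a AND NOT b) is sufficient: the effect holds whenever it does
conj_sufficient = all(effect(a, b, d) for a, b, d in rows if a and not b)

# The conjunction is unnecessary: the effect can also occur without it (via d)
conj_unnecessary = any(
    effect(a, b, d) and not (a and not b) for a, b, d in rows
)

# a is a necessary part of the conjunction: NOT b on its own is not sufficient
a_nonredundant = any((not b) and not effect(a, b, d) for a, b, d in rows)

print(a_insufficient, conj_sufficient, conj_unnecessary, a_nonredundant)
```

All four checks hold for this structure, so `a` qualifies as an INUS condition of the effect under the stated assumptions.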

Installation

CORA requires Python > 3.7 and uses Poetry for dependency management. Use the package manager pip to install CORA together with all its dependencies.

pip install git+https://github.com/PoliUniLu/cora.git

It is recommended to install the package into a dedicated virtual environment.

Google Colab

To open CORA with a graphical interface in Google Colab, click the button below:

Open in Colab

Usage

The main features of the package are provided by the OptimizationContext class, including the functions get_prime_implicants and get_irredundant_sums.

Note: Use the help function to access the documentation of CORA.

Example:

import pandas as pd
from cora import OptimizationContext

df = pd.DataFrame([[1,1,0,1],
                   [0,0,1,1],
                   [1,0,1,0],
                   [0,1,0,1]], columns=["A","B","C","OUT"])

context = OptimizationContext(data=df, output_labels=["OUT"])
# Prime implicants; result: {B, c, #a} (essential prime implicants are marked by hashtags)
PIs = context.get_prime_implicants()
# Irredundant sums; result: [M1: #a + B, M2: #a + c]
irredundant_solutions = context.get_irredundant_sums()
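
To illustrate what the results in the comments above mean, here is a minimal, self-contained sketch of a Quine-McCluskey-style procedure applied to the same four rows. It is not CORA's implementation and does not call the CORA API; all function names are hypothetical. The sketch also assumes that input combinations not observed in the data are treated as don't-cares, an assumption that happens to reproduce the results shown above.

```python
from itertools import combinations, product

# Truth table from the example above: inputs (A, B, C) -> OUT
observed = {(1, 1, 0): 1, (0, 0, 1): 1, (1, 0, 1): 0, (0, 1, 0): 1}
names = ["A", "B", "C"]

all_rows = set(product([0, 1], repeat=len(names)))
positives = {row for row, out in observed.items() if out == 1}
negatives = {row for row, out in observed.items() if out == 0}
# Assumption of this sketch: unobserved rows are don't-cares
allowed = all_rows - negatives


def merge(t1, t2):
    """Merge two terms that differ in exactly one fixed position, else None."""
    diff = [i for i, (x, y) in enumerate(zip(t1, t2)) if x != y]
    if len(diff) == 1 and None not in (t1[diff[0]], t2[diff[0]]):
        return t1[:diff[0]] + (None,) + t1[diff[0] + 1:]
    return None


def prime_implicants(minterms):
    """Repeatedly merge terms; terms that can no longer be merged are prime."""
    terms, primes = set(minterms), set()
    while terms:
        merged, used = set(), set()
        for t1, t2 in combinations(terms, 2):
            m = merge(t1, t2)
            if m is not None:
                merged.add(m)
                used.update((t1, t2))
        primes |= terms - used
        terms = merged
    return primes


def covers(term, row):
    return all(t is None or t == r for t, r in zip(term, row))


def label(term):
    # Upper case = value 1, lower case = value 0, as in the comments above
    return "*".join(
        n if v == 1 else n.lower() for n, v in zip(names, term) if v is not None
    )


def is_cover(subset):
    return all(any(covers(p, row) for p in subset) for row in positives)


primes = prime_implicants(allowed)

# A prime implicant is essential if it is the only one covering some positive row
chart = {row: {p for p in primes if covers(p, row)} for row in positives}
essential = {next(iter(ps)) for ps in chart.values() if len(ps) == 1}

# Irredundant sums: covers of all positive rows from which no term can be dropped
irredundant = [
    set(s)
    for k in range(1, len(primes) + 1)
    for s in combinations(sorted(primes, key=label), k)
    if is_cover(s) and not any(is_cover(set(s) - {p}) for p in s)
]

print(sorted(label(p) for p in primes))
print(sorted(label(p) for p in essential))
print([sorted(label(p) for p in s) for s in irredundant])
```

The brute-force search over subsets is only practical for a handful of prime implicants; it is used here because it keeps the definition of an irredundant sum visible in the code.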

Configurational data-mining is another feature. It analyzes all n-tuples of input combinations to search for feasible tuples of solution-generating inputs. In essence, this feature thus provides a configurational version of Occam's Razor (Feldman 2016).

Example:

import pandas as pd
import cora

df = pd.DataFrame([[1,2,0,1,1],
                   [1,1,1,0,1],
                   [0,2,1,0,0],
                   [0,2,2,0,1]], columns=["A","B","C","D","OUT"])
result = cora.data_mining(df, ["OUT"], len_of_tuple=2, inc_score1=0.5, n_cut=1)
result  # or: print(result.to_markdown())

|    | Combination   |   Nr_of_systems |   Inc_score |   Cov_score |   Score |
|---:|:--------------|----------------:|------------:|------------:|--------:|
|  0 | ['A', 'B']    |               1 |        0.75 |           1 |    0.75 |
|  1 | ['A', 'C']    |               1 |        1    |           1 |    1    |
|  2 | ['A', 'D']    |               1 |        0.75 |           1 |    0.75 |
|  3 | ['B', 'C']    |               1 |        1    |           1 |    1    |
|  4 | ['B', 'D']    |               1 |        0.75 |           1 |    0.75 |
|  5 | ['C', 'D']    |               1 |        0.75 |           1 |    0.75 |
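
The idea of scanning all n-tuples of inputs can be sketched in plain Python as follows. The sketch enumerates every pair of input columns with `itertools.combinations` and computes a simple consistency measure: the share of rows that remain when each input configuration is assigned its most frequent output. This measure is an assumption made for illustration and is not taken from CORA's source code; on this toy data it coincides with the Inc_score column above, but CORA's actual scoring may be defined differently. The function name `consistency` is hypothetical.

```python
from collections import Counter, defaultdict
from itertools import combinations

columns = ["A", "B", "C", "D"]
# Same data as in the example above: (input values, OUT)
rows = [
    ((1, 2, 0, 1), 1),
    ((1, 1, 1, 0), 1),
    ((0, 2, 1, 0), 0),
    ((0, 2, 2, 0), 1),
]


def consistency(indices):
    """Share of rows kept when each configuration gets its most frequent output.

    Hypothetical measure for illustration; not CORA's definition.
    """
    outputs = defaultdict(Counter)
    for inputs, out in rows:
        outputs[tuple(inputs[i] for i in indices)][out] += 1
    kept = sum(max(counts.values()) for counts in outputs.values())
    return kept / len(rows)


# Score every 2-tuple of input columns
scores = {
    tuple(columns[i] for i in pair): consistency(pair)
    for pair in combinations(range(len(columns)), 2)
}

for combo, score in scores.items():
    print(combo, score)
```

Pairs such as (A, C) and (B, C) separate all four rows without contradiction, which is why they reach a score of 1 in this sketch.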

For more examples, see the /examples folder or follow the Open in Colab link.

Citation Info

When using CORA (method and software), please cite it as follows:

Method:

Thiem, Alrik, Lusine Mkrtchyan, and Zuzana Sebechlebská. 2022. "Combinational Regularity Analysis (CORA) - A New Method for Uncovering Complex Causation in Medical and Health Research." BMC Medical Research Methodology 22 (1):333.

Software:

Sebechlebská, Zuzana, Lusine Mkrtchyan, and Alrik Thiem. 2023. "CORA and LOGIGRAM: A Duo of Python Packages for Combinational Regularity Analysis (CORA)." JOSS: The Journal of Open Source Software 8 (85):5019.

Copyright

CORA is licensed under GNU GPLv3.

Contributions

We warmly welcome contributions from the community. Feedback, bug reports, and feature requests should be filed as an issue on GitHub.

Pull requests

To set up a development environment, use Poetry.

pip install poetry
poetry install

Test the code by running

poetry run pytest

Pull requests are welcome. Note that, although the current codebase does not follow an entirely consistent code style, new code should be PEP 8 compliant.

References