CORA is a Python library for Combinational Regularity Analysis (CORA).
Combinational Regularity Analysis (CORA) is a member of the family of Configurational Comparative Methods (CCMs; Thiem et al. 2022). It is thus closely related to Qualitative Comparative Analysis (QCA; Ragin 1987) and Coincidence Analysis (CNA; Baumgartner 2009; Baumgartner and Ambühl 2020). Modern CCMs seek to detect INUS structures in data (Thiem 2017, 2022). Such structures are elaborate cause-effect relations that can be represented in the Boolean language of propositional logic (Baumgartner 2008; Mackie 1965; Psillos 2009). Most importantly, these relations are marked by causal conjunctivity (e.g., $a$ and not $b$ and $c$ and $\cdots$) and causal disjunctivity (e.g., $d$ or $e$ or $f$ or $\cdots$). For this reason, CCMs differ fundamentally from most other empirical research methods (Thiem et al. 2016).
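To make causal conjunctivity and disjunctivity concrete, the following minimal sketch checks by brute force whether a condition is an INUS condition in Mackie's sense (an insufficient but necessary part of an unnecessary but sufficient condition). The structure `(a and not b) or c` and all variable names are hypothetical illustrations; nothing here uses or mirrors the CORA API.

```python
from itertools import product

def y(a, b, c):
    # Hypothetical INUS structure: (a AND NOT b) OR c
    return (a and not b) or c

# All 2^3 configurations of the three Boolean inputs
configs = list(product([False, True], repeat=3))

# Insufficient: a alone does not guarantee y
a_insufficient = any(a and not y(a, b, c) for a, b, c in configs)

# Sufficient: the conjunction (a AND NOT b) always yields y
conj_sufficient = all(y(a, b, c) for a, b, c in configs if a and not b)

# Necessary part: without a, the remainder (NOT b) is not sufficient for y
a_non_redundant = any(not y(a, b, c) for a, b, c in configs if not b)

# Unnecessary: y can occur without the conjunction (here via the disjunct c)
conj_unnecessary = any(
    y(a, b, c) and not (a and not b) for a, b, c in configs
)

is_inus = all([a_insufficient, conj_sufficient, a_non_redundant, conj_unnecessary])
print(is_inus)
```

In this hypothetical structure, `a` passes all four checks, so it counts as an INUS condition of `y`.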
In contrast to QCA and CNA, however, CORA has been inspired by switching circuit analysis, a subfield of electrical engineering. INUS structures and switching circuits have much in common because propositional logic - the language of INUS causation - and switching algebra - the language of switching circuit analysis - are operationally equivalent branches of the same underlying Boolean algebra (Lewin and Protheroe 1992). It is therefore no coincidence that one of the first systematic algorithms for Boolean optimization - the Quine-McCluskey algorithm (McCluskey 1956; Quine 1955) - was co-developed by an analytical philosopher (Willard Van Orman Quine) and an electrical engineer (Edward J. McCluskey).
Most importantly, CORA is currently the only CCM able to analyze INUS structures that simultaneously feature simple as well as complex effects (e.g., $y$ and not $z$, not $y$ and $z$, $y$ and $z$). CORA can process such structures even in multi-value form (Mkrtchyan et al. 2023). In addition, CORA offers a configurational version of Occam's Razor: a data-mining approach to solution building that reduces model ambiguities by keeping the number of required variables for finding a solution at a minimum. Lastly, CORA includes a lean yet powerful visualization module called LOGIGRAM, with which two-level logic diagrams can be produced from any (system of) Boolean or multi-value function(s) in disjunctive normal form. Logic diagrams considerably outperform Venn diagrams, which are often used in QCA, when it comes to the representation and interpretability of INUS structures (Thiem et al. 2023). As a method, CORA is implemented by the duo of Python packages CORA and LOGIGRAM (Sebechlebská et al. 2023).
CORA requires Python>3.7 and uses Poetry for dependency management. Use the package manager pip to install CORA, including all dependencies.
pip install git+https://github.com/PoliUniLu/cora.git
It is recommended to install the package into a dedicated virtual environment.
To open CORA with a graphical interface in Google Colab, click the button below:
The main features of the package are part of the OptimizationContext class, including the functions get_prime_implicants, prime_implicant_chart, get_irredundant_systems, and get_irredundant_solutions.
Note: Use the help function to access the documentation of CORA.
Example:
import pandas as pd
from cora import OptimizationContext
# Each row is one observed configuration of the inputs A, B, C and the outcome OUT
df = pd.DataFrame([[1, 1, 0, 1],
                   [0, 0, 1, 1],
                   [1, 0, 1, 0],
                   [0, 1, 0, 1]], columns=["A", "B", "C", "OUT"])
context = OptimizationContext(data=df, output_labels=["OUT"])
# result: {B, c, #a}; essential prime implicants are marked by hashtags
PIs = context.get_prime_implicants()
# result: [M1: #a + B, M2: #a + c]
irredundant_solutions = context.get_irredundant_sums()
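The results noted in the comments above can be cross-checked without the library. The following is a minimal brute-force sketch in plain Python, assuming that unobserved configurations are treated as don't-cares; it is not the algorithm CORA uses internally, and the helper names (`covers`, `label`, `subsumes`, `covers_all`) are made up for this illustration. Uppercase letters denote presence, lowercase letters denote negation.

```python
from itertools import combinations, product

names = ["A", "B", "C"]
# Observed configurations from the example: (A, B, C) -> OUT
rows = {(1, 1, 0): 1, (0, 0, 1): 1, (1, 0, 1): 0, (0, 1, 0): 1}
on_set = [r for r, out in rows.items() if out == 1]
off_set = [r for r, out in rows.items() if out == 0]

def covers(term, row):
    # A term fixes each variable to 0 or 1, or leaves it out (None)
    return all(t is None or t == v for t, v in zip(term, row))

def label(term):
    return "*".join(
        n if t == 1 else n.lower() for n, t in zip(names, term) if t is not None
    )

# Implicants cover at least one positive row and no negative row
implicants = [
    t for t in product((0, 1, None), repeat=len(names))
    if any(covers(t, r) for r in on_set)
    and not any(covers(t, r) for r in off_set)
]

def subsumes(general, specific):
    # True if `general` uses a strict subset of the literals of `specific`
    return general != specific and all(
        g is None or g == s for g, s in zip(general, specific)
    )

# Prime implicants are implicants that no more general implicant subsumes
prime_implicants = [
    t for t in implicants if not any(subsumes(g, t) for g in implicants)
]

def covers_all(terms):
    return all(any(covers(t, r) for t in terms) for r in on_set)

# Irredundant sums: covers of the positive rows from which no term can be dropped
irredundant_sums = [
    " + ".join(label(t) for t in combo)
    for k in range(1, len(prime_implicants) + 1)
    for combo in combinations(prime_implicants, k)
    if covers_all(combo)
    and not any(covers_all([t for t in combo if t != d]) for d in combo)
]

print([label(t) for t in prime_implicants])  # ['a', 'B', 'c']
print(irredundant_sums)                      # ['a + B', 'a + c']
```

Exhaustive enumeration like this is only feasible for a handful of inputs, but it makes the definitions of prime implicants and irredundant sums explicit. The term `a` appears in both sums because it is the only prime implicant covering the row (0, 0, 1), which is what makes it essential.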
Configurational data-mining is another feature. It analyzes all n-tuples of input combinations to search for feasible tuples of solution-generating inputs. In essence, this feature thus provides a configurational version of Occam's Razor (Feldman 2016).
Example:
import pandas as pd
import cora
df = pd.DataFrame([[1, 2, 0, 1, 1],
                   [1, 1, 1, 0, 1],
                   [0, 2, 1, 0, 0],
                   [0, 2, 2, 0, 1]], columns=["A", "B", "C", "D", "OUT"])
result = cora.data_mining(df, ["OUT"], len_of_tuple=2, inc_score1=0.5, n_cut=1)
result  # print(result.to_markdown())
| | Combination | Nr_of_systems | Inc_score | Cov_score | Score |
|---:|:--------------|----------------:|------------:|------------:|--------:|
| 0 | ['A', 'B'] | 1 | 0.75 | 1 | 0.75 |
| 1 | ['A', 'C'] | 1 | 1 | 1 | 1 |
| 2 | ['A', 'D'] | 1 | 0.75 | 1 | 0.75 |
| 3 | ['B', 'C'] | 1 | 1 | 1 | 1 |
| 4 | ['B', 'D'] | 1 | 0.75 | 1 | 0.75 |
| 5 | ['C', 'D'] | 1 | 0.75 | 1 | 0.75 |
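The idea of scanning all n-tuples of inputs can be illustrated with a much simpler check than the one the library performs. The sketch below, in plain Python, enumerates every pair of inputs from the data above and flags the pairs under which no two cases share the same input values but differ in OUT. It is a simplified illustration only: it does not compute CORA's inclusion or coverage scores, and the function name `is_contradiction_free` is hypothetical.

```python
from itertools import combinations

columns = ["A", "B", "C", "D"]
# Same cases as in the example above; the last entry of each row is OUT
data = [
    (1, 2, 0, 1, 1),
    (1, 1, 1, 0, 1),
    (0, 2, 1, 0, 0),
    (0, 2, 2, 0, 1),
]

def is_contradiction_free(idx):
    # True if identical input values on columns `idx` never map to different outputs
    seen = {}
    for row in data:
        key = tuple(row[i] for i in idx)
        if seen.setdefault(key, row[-1]) != row[-1]:
            return False
    return True

consistent_pairs = [
    [columns[i] for i in idx]
    for idx in combinations(range(len(columns)), 2)
    if is_contradiction_free(idx)
]
print(consistent_pairs)  # [['A', 'C'], ['B', 'C']]
```

The two contradiction-free pairs are the same combinations that have an Inc_score of 1 in the table above. In the remaining pairs, two cases share identical input values but differ in OUT.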
To access more examples, see the /examples folder.
When using CORA (method and software), please cite it as follows:
Method:
Thiem, Alrik, Lusine Mkrtchyan, and Zuzana Sebechlebská. 2022. "Combinational Regularity Analysis (CORA) - A New Method for Uncovering Complex Causation in Medical and Health Research." BMC Medical Research Methodology 22 (1):333.
Software:
Sebechlebská, Zuzana, Lusine Mkrtchyan, and Alrik Thiem. 2023. "CORA and LOGIGRAM: A Duo of Python Packages for Combinational Regularity Analysis (CORA)." JOSS: The Journal of Open Source Software 8 (85):5019.
CORA is licensed under GNU GPLv3.
Contributions from the community are very welcome. Feedback, bug reports, and feature requests should be filed as an issue on GitHub.
To set up a development environment, use Poetry.
pip install poetry
poetry install
Test the code by running
poetry run pytest
Pull requests are welcome. Note that, although the current codebase does not follow an entirely consistent code style, new code should be PEP 8 compliant.