AI-SDC / ACRO

Tools for the Automatic Checking of Research Outputs. These tools are drop-in replacements that researchers can use for commands that produce outputs in Stata, Python, and R.
MIT License
data-privacy data-protection privacy privacy-tools statistical-disclosure-control

ACRO: Tools for the Automatic Checking of Research Outputs

This repository holds the Python ACRO package. An R wrapper package is available: ACRO-R.

A GUI for viewing and approving outputs is also available: SACRO-Viewer.

ACRO (Automatic Checking of Research Outputs) is an open-source tool for automating the statistical disclosure control (SDC) of research outputs. ACRO assists researchers and output checkers by distinguishing between research output that is safe to publish, output that requires further analysis, and output that cannot be published because of a substantial risk of disclosing private data.

It does this by providing a lightweight 'skin' that sits over well-known analysis tools, in a variety of languages researchers might use. This adds functionality to:

Figure: ACRO workflow and architecture schematic
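The idea of a 'skin' over an analysis command can be illustrated with a minimal sketch. This is not ACRO's API: the names `crosstab`, `checked_crosstab`, and `THRESHOLD`, and the threshold value of 10, are all hypothetical and chosen only for illustration. The wrapper returns the same table as the underlying command, plus a verdict based on a simple minimum-cell-count rule.

```python
from collections import Counter

THRESHOLD = 10  # assumed minimum cell count; not taken from ACRO's configuration


def crosstab(rows, cols):
    """Plain frequency table: maps (row, col) pairs to counts."""
    return Counter(zip(rows, cols))


def checked_crosstab(rows, cols, threshold=THRESHOLD):
    """Hypothetical 'skin' over crosstab: same table, plus a disclosure verdict."""
    table = crosstab(rows, cols)
    # Cells with fewer observations than the threshold are treated as risky.
    risky = sorted(cell for cell, n in table.items() if n < threshold)
    status = "fail" if risky else "pass"
    return table, status, risky


if __name__ == "__main__":
    rows = ["A"] * 12 + ["B"] * 3
    cols = ["x"] * 15
    table, status, risky = checked_crosstab(rows, cols)
    print(status, risky)  # prints: fail [('B', 'x')]
```

The researcher still calls one function and gets the usual table back; the extra return values are what an output checker would review. A real tool would apply a wider range of disclosure tests than this single rule.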

Installation

ACRO can be installed from PyPI:

$ pip install acro

If installed in this way, the example notebooks and the data files they use will need to be copied from the repository.

Notes for Python 3.13

ACRO currently depends on numpy version 1.x.x, for which pip provides no pre-compiled wheels on Python 3.13. On Python 3.13, numpy must therefore be built from source, which requires a C++ compiler to be installed before running pip install acro.

For Windows, the Microsoft Visual Studio C++ build tools will likely need to be installed first.
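As a rough pre-install check, the sketch below tests whether the running interpreter falls into the case described above and whether a C++ compiler is visible on PATH. The helper names and the list of compiler executables are assumptions made for illustration, as is treating every version from 3.13 upwards as affected. On Windows, the Visual Studio compiler (`cl`) is often only on PATH inside a developer command prompt, so a negative result there is not conclusive.

```python
import shutil
import sys


def needs_numpy_source_build(version_info=sys.version_info):
    """True when the interpreter is Python 3.13 or newer (assumed cut-off)."""
    return tuple(version_info[:2]) >= (3, 13)


def find_cxx_compiler(candidates=("c++", "g++", "clang++", "cl")):
    """Return the path of the first candidate compiler found on PATH, or None."""
    for name in candidates:
        path = shutil.which(name)
        if path:
            return path
    return None


if __name__ == "__main__":
    if needs_numpy_source_build() and find_cxx_compiler() is None:
        print("No C++ compiler found on PATH; install one before running pip install acro.")
    else:
        print("Environment looks ready for pip install acro.")
```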

Examples

See the example notebooks for:

Documentation

Pre-built documentation is available on the project's GitHub Pages site.

Training Materials

Training videos about ACRO are available.

Contributing

See CONTRIBUTING.md

Acknowledgement

This work was funded by UK Research and Innovation under Grant Number MC_PC_23006 as part of Phase 1 of the DARE UK (Data and Analytics Research Environments UK) programme, delivered in partnership with Health Data Research UK (HDR UK) and Administrative Data Research UK (ADR UK). The specific project was Semi-Automatic Checking of Research Outputs (SACRO).