Open, Reproducible, and Decentralized Workflows with COINSTAC
By Eric Verner, Center for Translational Research in Neuroimaging and Data Science
Theme: Open Workflows
Format: Software/process demo
Abstract
While the world is moving toward open software, open science, and open data, many datasets still cannot be shared because of privacy concerns, and differences in data-sharing policy around the world compound the problem. Yet the benefits of adding data to a statistical analysis or augmenting the training data for a machine learning model are clear. A meta-analysis is one alternative, but procedures for collecting, labeling, processing, and analyzing data are often heterogeneous, especially among collaborators who did not plan to combine data ahead of time. To solve these problems and enable collaboration among researchers in different groups and even different countries, we introduce the Collaborative Informatics and Neuroimaging Suite Toolkit for Anonymous Computation (COINSTAC), a free and open-source platform. COINSTAC allows researchers to run identical preprocessing, statistical analysis, and machine learning across multiple sites in real time. All operations are encapsulated in Docker containers, which are open and available for inspection on Docker Hub, enabling reproducibility. In addition, all code inside the Docker containers is open source and available on GitHub. COINSTAC offers reproducible, open-source computations for voxel-based morphometry, fMRI preprocessing, regression, classification, Group ICA, functional network connectivity, t-SNE, and more. In the demo, we will show how to use COINSTAC to run reproducible, open, decentralized workflows.
Useful Links
https://github.com/trendscenter/coinstac
https://www.youtube.com/watch?v=QL95M74usAA
Tagging @everner