
SPARC Metadata Editor (sparc-me)

A python tool to explore, enhance, and expand SPARC datasets and their descriptions in accordance with FAIR principles.


About

This is the repository of Team sparc-me (Team #7) of the 2022 SPARC Codeathon. Click here to find out more about the SPARC Codeathon 2022. Check out the Team Section of this page to find out more about our team members.

With the exception of high-level planning by the team lead as advised by the Codeathon organisers, no work was done on this project prior to the Codeathon. Contributions from existing projects are described in the Acknowledgements Section.

Introduction

The NIH Common Fund program on Stimulating Peripheral Activity to Relieve Conditions (SPARC) focuses on understanding peripheral nerves (nerves that connect the brain and spinal cord to the rest of the body), how their electrical signals control internal organ function, and how therapeutic devices could be developed to modulate electrical activity in nerves to improve organ function. This may provide a potentially powerful way to treat a diverse set of common conditions and diseases such as hypertension, heart failure, gastrointestinal disorders, and more. Sixty research groups spanning 90 institutions and companies contribute to SPARC, working across more than 15 organs and systems in 8 species.

The SPARC Portal provides a single user-facing online interface to all resources generated by the SPARC community that can be shared, cited, visualized, computed, and used for virtual experimentation. A key offering of the portal is the collection of well-curated, high-impact data that is being generated by SPARC-funded researchers. These datasets, along with other SPARC projects and computational simulations, can be found under the "Find Data" section of the SPARC Portal.

A SPARC dataset comprises the following data and structure:

Information regarding how to navigate a SPARC dataset and how a dataset is formatted can be found on the SPARC Portal.
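By way of illustration, the top-level layout of a dataset following the SPARC Dataset Structure (SDS) can be scaffolded with the Python standard library. The folder and file names below are a sketch based on commonly documented SDS conventions, not the authoritative specification; consult the SPARC Portal documentation for the definitive structure.

```python
from pathlib import Path
import tempfile

def scaffold_sds(root: Path) -> Path:
    """Create an empty SDS-style dataset skeleton under `root` (illustrative only)."""
    # Top-level data folders commonly found in an SDS dataset
    for folder in ["primary", "source", "derivative", "docs", "code", "protocol"]:
        (root / folder).mkdir(parents=True, exist_ok=True)
    # Top-level metadata files (normally spreadsheets plus text files)
    for filename in [
        "dataset_description.xlsx",
        "subjects.xlsx",
        "samples.xlsx",
        "submission.xlsx",
        "README.md",
        "CHANGES",
    ]:
        (root / filename).touch()
    return root

root = scaffold_sds(Path(tempfile.mkdtemp()) / "my_sds_dataset")
print(sorted(p.name for p in root.iterdir()))
```

Running this prints the names of the scaffolded folders and metadata files; a real dataset would then be populated and curated against the official structure.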

The problem

There is currently no publicly available programmatic approach for:

This limits the ability of members of the SPARC community and the wider scientific community to apply FAIR principles for:

Our solution - sparc-me

To address this problem, we have developed a python module called the SPARC Metadata Editor (sparc-me) that can be used to enhance the FAIRness of SPARC data by enabling:

Examples and guided tutorials have been created to demonstrate each of the features above.

[^1]: Please note that the schemas derived in the current version of sparc-me have been generated based on basic rules (e.g. required fields, data type etc). These will be replaced when an official schema is released by the SPARC curation team (elements of the internal schema used by the SPARC curators for curating SPARC datasets can be found here).
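To make those basic rules concrete, the toy validator below checks metadata against a schema expressed as required flags and expected data types. The field names, flags, and types here are illustrative assumptions, not the actual sparc-me schema.

```python
# Illustrative schema: each entry records whether a field is required
# and which Python type its value should have. Field names are hypothetical.
SCHEMA = {
    "Title": {"required": True, "type": str},
    "Keywords": {"required": False, "type": list},
    "Number of subjects": {"required": True, "type": int},
}

def validate(metadata: dict, schema: dict = SCHEMA) -> list:
    """Return a list of human-readable validation errors (empty if valid)."""
    errors = []
    for field, rules in schema.items():
        if field not in metadata:
            if rules["required"]:
                errors.append(f"missing required field: {field}")
            continue
        if not isinstance(metadata[field], rules["type"]):
            errors.append(
                f"{field}: expected {rules['type'].__name__}, "
                f"got {type(metadata[field]).__name__}"
            )
    return errors

print(validate({"Title": "Vagus nerve mapping", "Number of subjects": "ten"}))
# → ["Number of subjects: expected int, got str"]
```

An official schema from the SPARC curation team would replace both the field list and the rule set in a sketch like this.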

Impact

sparc-me will elevate the impact of the SPARC program by providing the fundamental tools needed by users to programmatically interact with datasets that follow the SPARC Dataset Structure (SDS) and to efficiently build novel resources and tools from SPARC data. This includes:

Setting up sparc-me

Pre-requisites

PyPI

Here is the link to our project on PyPI

pip install sparc-me

From source code

Downloading source code

Clone the sparc-me repository from GitHub, e.g.:

git clone git@github.com:SPARC-FAIR-Codeathon/sparc-me.git

Installing dependencies

  1. Setting up a virtual environment (optional but recommended). In this step, we will create a virtual environment in a new folder named venv and activate it.

    • Linux

      python3 -m venv venv
      source venv/bin/activate
    • Windows

      python3 -m venv venv
      venv\Scripts\activate
  2. Installing dependencies via pip

    pip install -r requirements.txt

Using sparc-me

Running tutorials

Guided tutorials have been developed describing how to use sparc-me in different scenarios:

Tutorial Description
1 Downloading an existing curated SDS dataset (human whole-body computational scaffold with embedded organs), and using existing tools to query ontology terms (via the SciCrunch knowledgebase) that have been used to annotate SDS datasets.
2 Creating an SDS dataset programmatically from input data, editing metadata values and filtering metadata.
3 Interacting with SDS datasets on O2SPARC with sparc-me.
4 Creating an extension of the SDS to include an additional metadata field that defines data use descriptions from the GA4GH-approved Data Use Ontology (DUO). This tutorial is a first step toward demonstrating how the SDS could be extended to describe clinical data.
5 Converting a BIDS dataset to an SDS dataset.
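As a flavour of what the BIDS-to-SDS conversion in Tutorial 5 involves, the sketch below maps BIDS-style subject labels (e.g. sub-01) from filenames onto minimal rows for an SDS subjects metadata sheet, using only the standard library. The function name and the single "subject id" column are illustrative assumptions; the tutorial itself uses sparc-me's converter.

```python
import re

def bids_subjects_to_sds_rows(filenames):
    """Map BIDS-style subject labels (e.g. 'sub-01') to minimal rows
    for an SDS subjects metadata sheet. Illustrative only."""
    labels = sorted({m.group(1) for name in filenames
                     if (m := re.match(r"(sub-[A-Za-z0-9]+)", name))})
    # Each SDS subject row minimally needs a unique subject id; other
    # columns (species, age, sex, ...) would be filled in during curation.
    return [{"subject id": label} for label in labels]

files = ["sub-01_T1w.nii.gz", "sub-01_task-rest_bold.nii.gz", "sub-02_T1w.nii.gz"]
print(bids_subjects_to_sds_rows(files))
# → [{'subject id': 'sub-01'}, {'subject id': 'sub-02'}]
```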


Running examples

In addition to the tutorials, the following examples are also provided in the example folder to help highlight the functionality of sparc-me:

Reporting issues

To report an issue or suggest a new feature, please use the issues page. Please check existing issues before submitting a new one.

Contributing

Fork this repository and submit a pull request to contribute. Before doing so, please read our Code of Conduct and Contributing Guidelines. Please add a GitHub Star to support developments!

Project structure

Cite us

If you use sparc-me to make new discoveries or use the source code, please cite us as follows:

Savindi Wijenayaka, Linkun Gao, Michael Hoffman, David Nickerson, Haribalan Kumar, Chinchien Lin, Thiranja Prasad Babarenda Gamage (2022). sparc-me: v1.0.0 - A python tool to explore, enhance, and expand SPARC datasets and their descriptions in accordance with FAIR principles. 
Zenodo. https://doi.org/10.5281/zenodo.6975692.

FAIR practices

We have assessed the FAIRness of our sparc-me tool against the FAIR Principles established for research software. The details are available in the following document.

License

sparc-me is fully open source and distributed under the very permissive Apache License 2.0. See LICENSE for more information.

Team

Acknowledgements