A Python tool to explore, enhance, and expand SPARC datasets and their descriptions in accordance with FAIR principles.
This is the repository of Team sparc-me (Team #7) of the 2022 SPARC Codeathon. Click here to find out more about the SPARC Codeathon 2022. Check out the Team Section of this page to find out more about our team members.
With the exception of high-level planning by the team lead as advised by the Codeathon organisers, no work was done on this project prior to the Codeathon. Contributions from existing projects are described in the Acknowledgements Section.
The NIH Common Fund program on Stimulating Peripheral Activity to Relieve Conditions (SPARC) focuses on understanding peripheral nerves (nerves that connect the brain and spinal cord to the rest of the body), how their electrical signals control internal organ function, and how therapeutic devices could be developed to modulate electrical activity in nerves to improve organ function. This may provide a powerful way to treat a diverse set of common conditions and diseases such as hypertension, heart failure, gastrointestinal disorders, and more. 60 research groups spanning 90 institutions and companies contribute to SPARC and work across over 15 organs and systems in 8 species.
The SPARC Portal provides a single user-facing online interface to all resources generated by the SPARC community that can be shared, cited, visualized, computed, and used for virtual experimentation. A key offering of the portal is the collection of well-curated, high-impact data that is being generated by SPARC-funded researchers. These datasets, along with other SPARC projects and computational simulations, can be found under the "Find Data" section of the SPARC Portal.
A SPARC dataset comprises the following data and structure:
Information regarding how to navigate a SPARC dataset and how a dataset is formatted can be found on the SPARC Portal.
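As a concrete illustration of that structure, the sketch below checks a dataset folder against a minimal checklist of top-level SDS items. The specific file and folder names used here are assumptions based on common SDS layouts; the SDS documentation on the SPARC Portal is authoritative.

```python
from pathlib import Path

# Assumed minimal checklist: SDS datasets typically include these
# top-level metadata files and data folders (names illustrative only).
EXPECTED_FILES = ["dataset_description.xlsx", "subjects.xlsx", "samples.xlsx"]
EXPECTED_DIRS = ["primary"]

def missing_sds_items(dataset_root):
    """Return the expected SDS files/folders absent from dataset_root."""
    root = Path(dataset_root)
    missing = [f for f in EXPECTED_FILES if not (root / f).is_file()]
    missing += [d for d in EXPECTED_DIRS if not (root / d).is_dir()]
    return missing
```

A dataset passing this check is not necessarily fully compliant; it simply has the expected skeleton in place.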
There is currently no publicly available programmatic approach for:
This limits the ability of members of the SPARC community and the wider scientific community to apply FAIR principles for:
To address this problem, we have developed a Python module called the SPARC Metadata Editor (sparc-me) that can be used to enhance the FAIRness of SPARC data by enabling:
Examples and guided tutorials have been created to demonstrate each of the features above.
[^1]: Please note that the schemas derived in the current version of sparc-me have been generated based on basic rules (e.g. required fields, data types, etc.). These will be replaced when an official schema is released by the SPARC curation team (elements of the internal schema used by the SPARC curators for curating SPARC datasets can be found here).
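The "basic rules" mentioned above can be illustrated with a small sketch: a validator that checks required fields and expected data types for a single metadata entry. The schema content below is illustrative only, not the actual schema shipped with sparc-me.

```python
# Illustrative schema: each field maps to the basic rules described above
# (whether it is required, and the expected Python type).
SCHEMA = {
    "Metadata element": {"required": True, "type": str},
    "Value": {"required": False, "type": str},
}

def validate(entry, schema=SCHEMA):
    """Collect rule violations for a single metadata entry (a dict)."""
    errors = []
    for field, rules in schema.items():
        if field not in entry:
            if rules["required"]:
                errors.append(f"missing required field: {field}")
            continue
        if not isinstance(entry[field], rules["type"]):
            errors.append(
                f"wrong type for {field}: expected {rules['type'].__name__}"
            )
    return errors
```

An official schema would add richer constraints (controlled vocabularies, cardinality, cross-field rules) on top of checks like these.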
sparc-me will elevate the impact of the SPARC program by providing the fundamental tools needed by users to programmatically interact with SDS datasets and efficiently build novel resources and tools from SPARC data. This includes:
Here is the link to our project on PyPI.
pip install sparc-me
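To confirm the installation succeeded, a quick standard-library check can be run (the import name `sparc_me` is taken from the repository layout):

```python
import importlib.util

def is_installed(package="sparc_me"):
    """True if the package can be found on the current Python path."""
    return importlib.util.find_spec(package) is not None

print("sparc-me installed:", is_installed())
```

If this prints `False`, verify that pip installed into the Python environment you are currently running.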
Clone the sparc-me repository from GitHub, e.g.:
git clone git@github.com:SPARC-FAIR-Codeathon/sparc-me.git
Setting up a virtual environment (optional but recommended). In this step, we create a virtual environment in a new folder named venv and activate it.
Linux
python3 -m venv venv
source venv/bin/activate
Windows
python3 -m venv venv
venv\Scripts\activate
Installing dependencies via pip
pip install -r requirements.txt
Guided tutorials have been developed describing how to use sparc-me in different scenarios:
Tutorial | Description |
---|---|
1 | Downloading an existing curated SDS dataset (human whole-body computational scaffold with embedded organs), and using existing tools to query ontology terms that have been used to annotate SDS datasets via the SciCrunch knowledgebase. |
2 | Creating an SDS dataset programmatically from input data, editing metadata values and filtering metadata. |
3 | Interacting with SDS datasets on O2SPARC with sparc-me. |
4 | Creating an extension of the SDS to include an additional metadata field that defines data use descriptions from the GA4GH-approved Data Use Ontology (DUO). This tutorial is a first step toward demonstrating how the SDS could be extended to describe clinical data. |
5 | Converting a BIDS dataset to an SDS dataset. |
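Tutorial 5 covers the full BIDS-to-SDS conversion; the sketch below only illustrates the core idea, mapping rows of a BIDS `participants.tsv` file to SDS-style subject records. The SDS column names used here (`subject id`, `age`, `sex`) are assumptions for illustration; the real SDS subjects template defines the authoritative headings.

```python
import csv
import io

def bids_participants_to_sds_subjects(participants_tsv_text):
    """Map rows of a BIDS participants.tsv to SDS-style subject records."""
    reader = csv.DictReader(io.StringIO(participants_tsv_text), delimiter="\t")
    subjects = []
    for row in reader:
        subjects.append({
            # BIDS subject labels are prefixed with "sub-"; strip it for SDS.
            "subject id": row["participant_id"].replace("sub-", ""),
            "age": row.get("age", ""),
            "sex": row.get("sex", ""),
        })
    return subjects
```

A real conversion would also relocate the imaging files into the SDS `primary` folder and populate the remaining metadata templates.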
In addition to the tutorials, the following examples are also provided in the examples folder to help highlight the functionality of sparc-me:
example_for_base_functionality.py - Example outlining basic functionality for loading, saving, and editing datasets and metadata.
example_for_validating_schema.py - Example showing how to validate SDS entries against the SDS schema stored in the /sparc_me/resources/templates/ folder for a given SDS version.
example_for_listing_all_curated_datasets.py - Example for listing all curated SPARC datasets from Pennsieve.
example_for_accessing_dataset_protocol.py - Example for retrieving the protocol for a curated SPARC dataset from protocols.io.
example_for_downloading_dataset_files.py - Example for downloading files in curated SPARC datasets through the sparc-me API.

To report an issue or suggest a new feature, please use the issues page. Please check existing issues before submitting a new one.
Fork this repository and submit a pull request to contribute. Before doing so, please read our Code of Conduct and Contributing Guidelines. Please add a GitHub Star to support developments!
/sparc_me/ - Parent directory of the sparc-me Python module.
/sparc_me/core/ - Core classes of sparc-me.
/sparc_me/resources/templates/ - Location of SPARC Dataset Structure templates.
/examples/ - Parent directory of sparc-me examples and tutorials.
/examples/test_data/ - Test data used for sparc-me examples and tutorials.
/docs/images/ - Images used in sparc-me tutorials.

If you use sparc-me to make new discoveries or use the source code, please cite us as follows:
Savindi Wijenayaka, Linkun Gao, Michael Hoffman, David Nickerson, Haribalan Kumar, Chinchien Lin, Thiranja Prasad Babarenda Gamage (2022). sparc-me: v1.0.0 - A python tool to explore, enhance, and expand SPARC datasets and their descriptions in accordance with FAIR principles.
Zenodo. https://doi.org/10.5281/zenodo.6975692.
We have assessed the FAIRness of our sparc-me tool against the FAIR Principles established for research software. The details are available in the following document.
sparc-me is fully open source and distributed under the permissive Apache License 2.0. See LICENSE for more information.