This grace repository contains a Python library for the identification of patterns in imaging data. The package provides a method to find connected objects and regions of interest in images by constructing graph-like representations.
Read more about the project below.

The acronym grace stands for Graph Representation Analysis for Connected Embeddings. This tool was developed by researchers as a scientific project at The Alan Turing Institute in the Data Science for Science programme.
As the initial use case, we (see the list of contributors below) developed grace for localising filaments in cryo-electron microscopy (cryoEM) imaging datasets: an image processing tool that automatically identifies filamentous proteins and locates regions of interest, such as an accessory or binding protein.

Find out more details about the project aims and objectives here and here, or visit the citation panel below to check out the overarching research projects.
The grace workflow consists of the following steps (see the citation abstract below for more detail):

1. Detect candidate objects in large images to obtain low-fidelity detections.
2. Build a global graph representation of the entire image from these detections.
3. Extract latent node embeddings from the local image patches around each detection.
4. Annotate the desired motifs of interest via the human-in-the-loop napari GUI.
5. Connect the candidate objects in a supervised manner with a graph neural network.
6. Run a combinatorial optimisation step that uses the latent embeddings to decide which nodes belong to an object instance.
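To make the graph-building idea concrete, here is a minimal, illustrative sketch (not grace's own API; the detection coordinates are placeholders) that turns a handful of 2D object detections into a graph via Delaunay triangulation, using networkx and scipy:

```python
# Illustrative only: a minimal stand-in for the "build a graph from detections"
# idea. The coordinates below are made-up placeholders, not real cryo-EM data.
import networkx as nx
import numpy as np
from scipy.spatial import Delaunay

# (x, y) centres of low-fidelity object detections
detections = np.array([[10, 12], [14, 40], [35, 22], [60, 18], [62, 45]])

# triangulate the detections and add one graph edge per triangle side
tri = Delaunay(detections)
graph = nx.Graph()
graph.add_nodes_from(range(len(detections)))
for simplex in tri.simplices:
    for i in range(3):
        graph.add_edge(int(simplex[i]), int(simplex[(i + 1) % 3]))

print(graph)  # e.g. "Graph with 5 nodes and 7 edges"
```

In grace, a global graph of this kind is what the annotation GUI and the downstream graph neural network operate on.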
grace has been tested with Python 3.8+ on OS X.
For local development, clone the repo and install in editable mode following these guidelines:
Note: Choose which conda environment you'd like to use:

- grace-env-with-napari -> environment-with-napari.yaml
- grace-env-napari-free -> environment-napari-free.yaml

Specify your preference and follow the steps below:
# clone the grace GitHub repository
git clone https://github.com/alan-turing-institute/grace.git
cd ./grace
# create a conda playground from the respective environment.yaml
conda env create -f YOUR-CHOSEN-ENVIRONMENT.yaml
# To activate this environment, use
#
# $ conda activate grace-env-with-napari
# OR
# $ conda activate grace-env-napari-free
#
# To deactivate an active environment, use
#
# $ conda deactivate
conda activate grace-env-OF-YOUR-CHOICE
# install grace from local folder (not on pypi yet)
pip install -e ".[dev]"
# install pre-commit separately
conda install -c conda-forge pre_commit
# follow the hooks from .pre-commit-config.yaml
pre-commit install
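As a quick sanity check, assuming the package is importable under the name grace after the editable install, running `python -c "import grace"` from the activated environment should exit without errors.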
Note: when exporting your own grace conda environment, use the following:
conda env export --no-builds > new_environment.yaml
This will allow environments to be shared between different platforms and operating systems. For a new install with a grace version not yet on PyPI, please remove grace from the requirements under pip within the newly created yaml file.
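For illustration, the edit amounts to deleting the grace entry from the pip section of the exported file (the package names and versions below are placeholders; your exported file will differ):

```yaml
# excerpt of new_environment.yaml (illustrative only)
dependencies:
  - python=3.9
  - pip
  - pip:
      - napari        # keep the other pip requirements
      # - grace       # <- remove this line before sharing the file
```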
If you currently do not have any data to test / implement GRACE on, have a look at the option of simulating a synthetic dataset as described in this README. An accessible link to some pre-annotated simulated images is coming soon!
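As a rough illustration of what such a synthetic dataset can look like (this is a hedged sketch, not the simulation code from the linked README), you can draw a few random line segments into a blank image with numpy:

```python
# Hedged sketch: generate a toy image containing random line segments, loosely
# mimicking the 'line-seeding' idea; not the repository's actual simulator.
import numpy as np

rng = np.random.default_rng(seed=0)
image = np.zeros((256, 256), dtype=np.float32)

n_lines = 5  # try e.g. 2 (sparse), 5 (medium) or 10 (dense)
for _ in range(n_lines):
    (r0, c0), (r1, c1) = rng.integers(0, 256, size=(2, 2))
    # sample points along the segment and switch those pixels on
    rows = np.linspace(r0, r1, num=512).round().astype(int)
    cols = np.linspace(c0, c1, num=512).round().astype(int)
    image[rows, cols] = 1.0

print(f"foreground pixels: {int(image.sum())}")
```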
Our repository contains a graphical user interface (GUI) which allows the user to manually annotate the regions of interest (motifs) in their cryo-EM data.
To test the annotator, make sure you've installed the repository using the napari-enabled annotation environment and run:

python examples/show_data.py
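If you just want a feel for the kind of viewer the example script opens, here is a minimal, generic napari snippet with placeholder data (standard napari calls only; this is not the grace widget itself):

```python
# Generic napari example with placeholder data; the grace annotator adds its
# own graph-building and annotation controls on top of a viewer like this.
import napari
import numpy as np

image = np.random.random((512, 512))          # placeholder for a cryo-EM image
points = np.array([[100, 120], [260, 300]])   # placeholder object detections

viewer = napari.Viewer()
viewer.add_image(image, name="image")
viewer.add_points(points, name="nodes", size=10)
napari.run()  # start the Qt event loop
```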
Demonstration of the napari widget to annotate cryo-EM images.
The recording above shows a napari-based GUI widget for annotation of the desired motifs, in our case, filamentous proteins. Follow these steps to test the plugin out:

1. Run the 'build graph' function in the right-hand panel.
2. Inspect the newly created 'nodes_...' or 'edges_...' layers in the layer list.
3. Select the 'annotation_...' layer in the left-hand layer list and click on the 'brush' icon at the top of the layer controls.
4. Run the 'cut graph' function in the right-hand panel.
5. Save your annotations with the 'export...' button on the right-hand side. Conversely, you can load previously saved annotations using the 'import...' button.

Work in progress
The expected outcome of the grace workflow is to identify all connected objects as individual filament instances. We tested the combinatorial optimisation step on simulated data with three levels of 'line-seeding' density: dense, medium and sparse.
As you can see, the optimiser works well to identify filamentous object instances simulated at various densities, and appears to work across object cross-overs (middle image, pink objects).
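As a simplified stand-in for that instance-assignment step (not grace's actual optimiser, and with made-up edge scores), one can threshold predicted edge probabilities and read object instances off as connected components of the remaining graph:

```python
# Simplified stand-in for instance assignment: keep confident edges and treat
# each connected component as one object instance. The scores are made up.
import networkx as nx

# (node_u, node_v, predicted probability that the edge belongs to an object)
scored_edges = [
    (0, 1, 0.95), (1, 2, 0.90),   # one filament: nodes 0-1-2
    (3, 4, 0.85),                 # another filament: nodes 3-4
    (2, 3, 0.10),                 # a spurious crossing, rejected below
]

graph = nx.Graph()
graph.add_weighted_edges_from(scored_edges, weight="score")

# keep only the edges whose score clears a confidence threshold
confident = nx.Graph(
    (u, v) for u, v, s in graph.edges(data="score") if s > 0.5
)

instances = list(nx.connected_components(confident))
print(instances)  # -> [{0, 1, 2}, {3, 4}]
```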
More details about how this type of graph representation analysis could be applied to other image data processing will become available soon - stay tuned!
Methodology / software development [The Alan Turing Institute]:
Dataset generation / processing [The University of Bristol]:
...and many others...
If you'd like to contribute to our ongoing work, please do not hesitate to let us know your suggestions for potential improvements by raising an issue on GitHub.
Work in progress
We are currently writing up our methodology and key results, so please stay tuned for future updates!
In the meantime, please use the template below to cite our work:
@unpublished{grace_repository,
year = {2023},
month = {April},
publisher = {{CCP-EM} Collaborative Computational Project for Electron cryo-Microscopy},
howpublished = {Paper presented at the 2023 {CCP-EM} Spring Symposium},
url = {https://www.ccpem.ac.uk/downloads/symposium/ccp-em_symp_schedule_2023.pdf},
author = {Beatriz Costa-Gomes and Kristina Ulicna and Christopher Soelistyo and Marjan Famili and Alan Lowe},
title = {Deconstructing cryoEM micrographs with a graph-based analysis for effective structure detection},
abstract = {Reliable detection of structures is a fundamental step in analysis of cryoEM micrographs.
Despite intense developments of computational approaches in recent years, time-consuming hand annotating
remains inevitable and represents a rate-limiting step in the analysis of cryoEM data samples with
heterogeneous objects. Furthermore, many of the current solutions are constrained by image characteristics:
the large sizes of individual micrographs, the need to perform extensive re-training of the detection models
to find objects of various categories in the same image dataset, and the presence of artefacts that might
have similar shapes to the intended targets.
To address these challenges, we developed GRACE (Graph Representation Analysis for Connected Embeddings),
a computer vision-based Python package for identification of structural motifs in complex imaging data.
GRACE sources from large images populated with low-fidelity object detections to build a graph representation
of the entire image. This global graph is then traversed to find structured regions of interest via extracting
latent node representations from the local image patches and connecting candidate objects in a supervised manner
with a graph neural network.
Using a human-in-the-loop approach, the user is encouraged to annotate the desired motifs of interest, making
our tool agnostic to the type of object detections. The user-nominated structures are then localised and
connected using a combinatorial optimisation step, which uses the latent embeddings to decide whether the
graph nodes belong to an object instance.
Importantly, GRACE reduces the search space from millions of pixels to hundreds of nodes, which allows for
fast and efficient implementation and potential tool customisation. In addition, our method can be repurposed
to search for different motifs of interest within the same dataset in a significantly smaller time scale to
the currently available open-source methods. We envisage that our end-to-end approach could be extended to
other types of imaging data where object segmentation and detection remains challenging.}
}