q2-makarsa is a plugin to incorporate some functionality from the SpiecEasi and FlashWeave packages into the QIIME 2 environment together with additional network visualisation.
QIIME 2 is a powerful, extensible, and decentralized microbiome analysis package with a focus on data and analysis transparency. QIIME 2 enables researchers to start an analysis with raw DNA sequence data and finish with publication-quality figures and statistical results.
SpiecEasi (Sparse InversE Covariance estimation for Ecological Association and Statistical Inference) is an R based package which allows the user to infer microbial ecological networks from compositional datasets typically generated from 16S amplicon sequencing.
FlashWeave is a Julia based package which predicts ecological interactions between microbes from large-scale compositional abundance data (e.g., ASV or OTU tables constructed from sequencing data) through statistical co-occurrence or co-abundance. It reports direct associations, with adjustment for bystander effects and other confounders, and can furthermore integrate environmental or technical factors into the analysis of microbial systems.
q2-makarsa is at the $\alpha$ stage. In addition to wrapping the SpiecEasi and FlashWeave packages, it provides a visualisation for generated networks. As development continues, additional features will be listed here.
learn_network features are exposed
q2-makarsa requires a working QIIME 2 environment, installed using conda. Please follow the "Natively installing QIIME 2" instructions. (If that link is outdated, please navigate there in the latest QIIME 2 docs.)
Make sure your conda environment is activated (as described in the QIIME 2 installation instructions), then install the dependencies:
conda install -c bioconda -c conda-forge r-spieceasi julia
julia -e 'using Pkg; Pkg.add(["FlashWeave", "ArgParse", "GraphIO"])'
In the same conda environment, pip install q2-makarsa from its GitHub repo:
pip install git+https://github.com/BenKaehler/q2-makarsa.git
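Before moving on, it can help to confirm that each dependency is visible from the active environment. The following is a minimal sketch rather than part of q2-makarsa: the `check` helper is a hypothetical name introduced here, and it assumes `Rscript`, `julia`, and `qiime` are on the `PATH` of the activated conda environment.

```shell
# Hypothetical helper: run a command quietly and report whether it succeeded.
check() {
    # $1 is a label; the remaining arguments are the command to run.
    label="$1"; shift
    if "$@" >/dev/null 2>&1; then
        echo "ok: $label"
    else
        echo "MISSING: $label"
    fi
}

# Each line tries to load one dependency installed in the steps above.
check "SpiecEasi (R)"      Rscript -e 'library(SpiecEasi)'
check "FlashWeave (Julia)" julia -e 'using FlashWeave'
check "q2-makarsa plugin"  qiime makarsa --help
```

Any line reported as MISSING points to the installation step that should be repeated.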
From within the conda environment, create a working folder and move into it:
mkdir plugin-example
cd plugin-example/
This folder will contain the QIIME 2 artefacts produced by q2-makarsa at the completion of each example.
The sequencing data for this example is derived from the Sponge Microbiome Project. In particular, we will use data for the Suberitida order of sponges.
Download the data from:
https://github.com/ramellose/networktutorials/raw/master/Workshop%202021/sponges/Suberitida.biom
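One way to fetch the file from the command line is sketched below. It assumes `curl` is installed; the variable names are illustrative, and the output file name `Suberitida.biom` is chosen to match the import command used in the next step.

```shell
# URL taken from the tutorial text above.
DATA_URL="https://github.com/ramellose/networktutorials/raw/master/Workshop%202021/sponges/Suberitida.biom"
# Local file name expected by the import step.
DATA_FILE="Suberitida.biom"

# -L follows redirects to the raw file; -f turns HTTP errors into a failure.
if curl -fL --connect-timeout 20 --max-time 120 -o "$DATA_FILE" "$DATA_URL"; then
    echo "Saved $DATA_FILE"
else
    echo "Download failed; fetch the URL in a browser instead." >&2
fi
```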
The next step is to import the BIOM file as a frequency FeatureTable within QIIME 2.
qiime tools import \
--input-path Suberitida.biom \
--type 'FeatureTable[Frequency]' \
--input-format BIOMV210Format \
--output-path sponge-feature-table.qza
The QIIME 2 artefact sponge-feature-table.qza should exist in the working folder if this command was successful.
Now we are ready to use q2-makarsa to access the SpiecEasi algorithms and infer the microbial network. The minimal command to generate the network requires only the name of the artefact containing the FeatureTable and the name of the intended output artefact that will contain the inferred network.
qiime makarsa spiec-easi \
--i-table sponge-feature-table.qza \
--o-network sponge-net.qza
From the sponge-net.qza network artefact, a visualisation can be created and then viewed:
qiime makarsa visualise-network \
--i-network sponge-net.qza \
--o-visualization sponge-net.qzv
qiime tools view sponge-net.qzv
The network images should open in your default browser. Alternatively, you can upload sponge-net.qzv to qiime2view. The network containing the largest number of members is in the tab labelled Group 1, the next largest network is in the tab Group 2, and so on. Trivial networks of two members and singletons are listed by feature in the Pairs and Singles tabs respectively.
Several parameter options exist for qiime makarsa spiec-easi. For a full list of parameters and their defaults, execute qiime makarsa spiec-easi --help. Some examples are below.
The algorithm used to infer the network can be set with the --p-method parameter and one of three keywords:
glasso: Graphical LASSO (default)
mb: Neighbourhood selection, or Meinshausen and Bühlmann method
slr: Sparse and Low-Rank method
For example, to infer the network from the example data using the MB method, execute the command
qiime makarsa spiec-easi \
--i-table sponge-feature-table.qza \
--o-network sponge-net.qza \
--p-method mb
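To compare the three methods side by side, the same command can be run in a loop. This is a sketch under stated assumptions: the output names of the form sponge-net-METHOD.qza are hypothetical and chosen here only so the results do not overwrite one another, and the `command -v` guard simply lets the snippet run harmlessly outside a QIIME 2 environment.

```shell
# Method keywords documented above.
METHODS="glasso mb slr"

for method in $METHODS; do
    # Hypothetical output name, one artefact per method.
    out="sponge-net-${method}.qza"
    if command -v qiime >/dev/null 2>&1; then
        qiime makarsa spiec-easi \
            --i-table sponge-feature-table.qza \
            --o-network "$out" \
            --p-method "$method"
    else
        echo "qiime not found; would have written $out"
    fi
done
```

Each resulting artefact can then be passed to qiime makarsa visualise-network in the same way as sponge-net.qza.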
The remaining parameters relate to the selection of the optimal penalty $\lambda$ in each method's lasso-like optimization problem. The network inference algorithms search for the optimal $\lambda$ over a range whose extremes yield the complete graph and the empty graph. Essentially, the process finds a balance between network sparsity and least-squares fit.
The range of $\lambda$ values tested is between --p-lambda-min-ratio $\times \lambda_{max}$ and $\lambda_{max}$, where $\lambda_{max}$ is the theoretical upper bound on $\lambda$. This upper bound is $\max|S|$, the maximum absolute value in the data correlation matrix. The lambda range is sampled logarithmically --p-nlambda times.
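To make the logarithmic sampling concrete, the sketch below prints such a grid with awk and then passes the same two settings to the plugin. The numbers are illustrative only and are not the plugin defaults: `LAMBDA_MAX` stands in for $\max|S|$, which the software computes from the data, and both the `lambda_grid` helper and the output name sponge-net-custom-lambda.qza are hypothetical.

```shell
# Illustrative values only; these are not the plugin defaults.
LAMBDA_MAX=1.0          # stands in for max|S|, normally derived from the data
LAMBDA_MIN_RATIO=0.01   # value given to --p-lambda-min-ratio
NLAMBDA=5               # value given to --p-nlambda

# Print n values evenly spaced on a log scale,
# from max down to ratio * max.
lambda_grid() {
    awk -v max="$1" -v ratio="$2" -v n="$3" 'BEGIN {
        for (k = 0; k < n; k++) {
            frac = (n > 1) ? k / (n - 1) : 0
            printf "%.6f\n", max * exp(frac * log(ratio))
        }
    }'
}

lambda_grid "$LAMBDA_MAX" "$LAMBDA_MIN_RATIO" "$NLAMBDA"

# Pass the same settings to the plugin when QIIME 2 is available.
if command -v qiime >/dev/null 2>&1; then
    qiime makarsa spiec-easi \
        --i-table sponge-feature-table.qza \
        --o-network sponge-net-custom-lambda.qza \
        --p-lambda-min-ratio "$LAMBDA_MIN_RATIO" \
        --p-nlambda "$NLAMBDA"
fi
```

With these values the grid runs from 1.0 down to 0.01, with each step shrinking $\lambda$ by the same factor. A smaller ratio extends the search towards denser networks, and a larger --p-nlambda samples the range more finely at the cost of run time.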
Alternatively, we can use FlashWeave to infer the network. The commands are similar. Create the network.
qiime makarsa flashweave \
--i-table sponge-feature-table.qza \
--o-network sponge-fw-net.qza
Then generate the visualisation.
qiime makarsa visualise-network \
--i-network sponge-fw-net.qza \
--o-visualization sponge-fw-net.qzv
View the visualisation as usual
qiime tools view sponge-fw-net.qzv
Once a network graph is generated, it can be used to identify modules of co-occurring features. This is useful, for example, for grouping these features in downstream analyses. For module detection, q2-makarsa employs the Louvain method.
qiime makarsa louvain-communities \
--i-network sponge-net.qza \
--o-community node-map.qza
Now you can colour your nodes by community.
qiime makarsa visualise-network \
--i-network sponge-net.qza \
--m-metadata-file node-map.qza \
--o-visualization sponge-louvain-net.qzv
Alternatively you can view the resulting node map (showing which features belong to each module).
qiime metadata tabulate \
--m-input-file node-map.qza \
--o-visualization node-map.qzv
The node map can be input as feature metadata to other QIIME 2 actions. For example, the following action can be used to group the features in a feature table based on their community affiliation.
qiime feature-table group \
--i-table sponge-feature-table.qza \
--p-axis feature \
--m-metadata-file node-map.qza \
--m-metadata-column Community \
--p-mode sum \
--o-grouped-table grouped-table.qza