SemGen is an experimental software tool for automating the modular composition and decomposition of biosimulation models.
SemGen facilitates the construction of complex, integrated models, and the swift extraction of reusable submodels from larger ones. SemGen relies on the semantically-rich SemSim model description format to help automate these modeling tasks.
With SemGen, users can visualize, annotate, extract, and merge models. These instructions will help you use SemGen for each of these tasks.
SemGen is a Java-based program and requires Java Runtime Environment version 1.7 (64-bit) or higher to execute.
To check your Java version, go to a command prompt and enter:
java -version
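If you would rather script this check, the following is a minimal sketch in Python. It assumes the usual layout of the `java -version` report (a quoted version string on the first line, and the text "64-Bit" in the VM line for 64-bit runtimes); the function names are illustrative and are not part of SemGen.

```python
import re
import subprocess


def parse_java_major(version_output):
    """Return the major Java version found in `java -version` output, or None."""
    match = re.search(r'version "([^"]+)"', version_output)
    if match is None:
        return None
    parts = re.split(r"[._+-]", match.group(1))
    try:
        # Legacy scheme: "1.7.0_80" means Java 7. Modern scheme: "11.0.2" means Java 11.
        if parts[0] == "1" and len(parts) > 1:
            return int(parts[1])
        return int(parts[0])
    except ValueError:
        return None


def meets_semgen_requirement(version_output, minimum=7):
    """True if the report shows a 64-bit runtime at version 1.7 (Java 7) or higher."""
    major = parse_java_major(version_output)
    # Assumption: 64-bit VMs identify themselves with "64-Bit" in the report.
    is_64_bit = "64-Bit" in version_output
    return major is not None and major >= minimum and is_64_bit


def installed_java_version_output():
    """Run `java -version`; the report is written to stderr, not stdout."""
    result = subprocess.run(["java", "-version"], capture_output=True, text=True)
    return result.stderr


if __name__ == "__main__":
    try:
        report = installed_java_version_output()
    except FileNotFoundError:
        print("No java executable was found on the PATH.")
    else:
        print("Java requirement met:", meets_semgen_requirement(report))
```

The "64-Bit" test is a heuristic, so treat a negative result as a prompt to read the `java -version` output yourself.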
Simply download the appropriate build for your operating system from the releases page.
Windows: Download and run the Windows installer. You will then be able to run SemGen from the location where you installed it by double-clicking the SemGen.exe file, or if using installation defaults, from the Windows Start menu.
Mac: Open the SemGen .dmg file, and drag SemGen.app to the Applications folder. Double-click SemGen.app to start the program.
Linux: Unarchive the SemGen .tar.gz file. Double-click the SemGen.jar file in the main SemGen directory to start the program.
SemGen can be built from source using Apache Ant. From the root of the source directory, run the following two commands:
ant -buildfile build.xml build # compile the Java sources to .class files
ant -buildfile build.xml create_jar # bundle the .class files and third-party dependencies into a .jar
This will create the file SemSimAPI.jar in the root directory. You can run this file as follows to start the Py4J server:
java -classpath ./SemSimAPI.jar semsim.Py4J
Here is a primer on how to use SemGen to load, visualize, annotate, extract, and merge models.
In SemGen, the Project tab will be your main workspace:
To load a model, click the Open model button under Project Actions on the left-hand side. This will prompt you to select a model file to load (SemGen currently supports the SemSim, CellML, SBML, and JSim file formats):
Once you select a model, it will be loaded in SemGen and visualized as a model node:
Alternatively, SemGen comes with a library of example models. These can be accessed by using the search bar. Hover over the magnifying glass on the top left and type in terms to search for. Click the model name in the results to load the model:
Once a model is loaded in SemGen, there are several ways to visualize and explore the model.
Select the model you want to visualize by clicking its model node (the selected node will have a yellow ring around it). Then click one of the visualizations in the Project Actions menu on the left-hand side.
An entire model or submodel can be moved by clicking and dragging the hull surrounding the group of nodes. You can also adjust the view by clicking and dragging the whitespace around the model or zooming in and out using the mouse wheel.
NOTE: Occasionally, the layout algorithm may push a model's nodes far outside the viewing range. Re-clicking one of the visualization buttons in the Project Actions menu usually repositions the nodes inside the viewing range. See issue #214.
The submodel visualization shows the hierarchical and/or compartmental organization of the model:
Each submodel node can be further expanded by double-clicking it:
The dependency visualization shows the mathematical dependency network in the model:
Different node types can be hidden or shown in the Stage Option menu, which can be useful for visualizing large models:
PhysioMap displays the physiological processes and their participants (sources, sinks, and mediators) based on the semantics of the biological processes and entities:
Click here for a comprehensive Annotator tutorial.
With the Annotator tool, you can convert mathematical models into the SemSim format and annotate the model's codewords using concepts from online reference ontologies. Currently the Annotator can convert MML, SBML, and CellML models into the SemSim format. The Semantics of Biological Processes group maintains a protocol for annotating a model which can help guide the annotation process.
To annotate a model, click the Annotate button under Project Actions. This will create a new Annotation tab:
Each composite annotation consists of a physical property term connected to a physical entity or physical process term. The physical entity term can itself be a composite of ontology terms. We recommend using only terms from the Ontology of Physics for Biology (OPB) for the physical property annotation components. For the physical entity annotations we recommend using robust, thorough, and widely accepted online reference ontologies like the Foundational Model of Anatomy (FMA), Chemical Entities of Biological Interest (ChEBI), and Gene Ontology cellular components (GO-cc). For physical process annotations, we recommend creating custom terms and defining them by identifying their thermodynamic sources, sinks, and mediators from the physical entities in the model.
When you edit a composite annotation for a model codeword, the Annotator provides an interface for rapid searching and retrieval of reference ontology concepts via the BioPortal web service.
Example: Suppose you are annotating a beta cell glycolysis model that includes a codeword representing glucose concentration in the cytosol of the cell.
A detailed composite annotation would be:
OPB:Chemical concentration <propertyOf> CHEBI:glucose <part_of> FMA:Portion of cytosol <part_of> FMA:Beta cell
In this case we use the term Chemical concentration from the OPB for the physical property part of the annotation, and we compose the physical entity part by linking three concepts: one from ChEBI and two from the FMA. This example illustrates the post-coordinated nature of the SemSim approach to annotation and how it provides high expressivity for annotating model terms.
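To make the structure of a composite annotation concrete, here is a minimal sketch in Python. It is not the SemSim API: the `Term` type and `render` function are hypothetical names used only to show how a physical property term is linked to a chain of physical entity terms.

```python
from collections import namedtuple

# Hypothetical container for one ontology concept, e.g. CHEBI:glucose.
Term = namedtuple("Term", ["ontology", "label"])


def render(physical_property, entity_chain):
    """Write a composite annotation in the notation used in this README.

    The property is linked to the first entity with <propertyOf>;
    each further entity is linked to the previous one with <part_of>.
    """
    parts = [f"{physical_property.ontology}:{physical_property.label}"]
    for position, term in enumerate(entity_chain):
        parts.append("<propertyOf>" if position == 0 else "<part_of>")
        parts.append(f"{term.ontology}:{term.label}")
    return " ".join(parts)


concentration = Term("OPB", "Chemical concentration")
entity = [
    Term("CHEBI", "glucose"),
    Term("FMA", "Portion of cytosol"),
    Term("FMA", "Beta cell"),
]

detailed = render(concentration, entity)
print(detailed)
```

Because the physical entity is just an ordered chain of terms, adding or dropping anatomical context changes the length of the chain and nothing else.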
The above example is a very detailed composite annotation; such detail may not be necessary to disambiguate concepts in a given model. For example, there may not be any other portions of glucose within the model apart from that in the cytosol. In this case, one could use only the first two terms in the composite annotation and still disambiguate the model codeword from the rest of the model's contents:
OPB:Chemical concentration <propertyOf> CHEBI:glucose
Although this annotation approach does not fully capture the biophysical meaning of the model codeword, SemGen is more likely to find semantic overlap between models if they use this shallower annotation style. This is mainly because the SemGen Merger tool currently only recognizes semantic equivalencies; it does not identify semantically similar terms in models that a user wants to integrate. Therefore, if a user wants to integrate our example glycolysis model with a TCA cycle model based on cardiac myocyte metabolism, the shallower approach would likely identify more semantic equivalencies than the more detailed approach.
Nonetheless, we recommend using the more detailed approach, given that future versions of SemGen will include a "Merging Wizard" that will identify and rank codewords that are semantically similar, not just semantically identical.
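The trade-off between detailed and shallow annotations can be illustrated with a small Python sketch. It assumes that "semantic equivalence" means two composite annotations are identical term for term; the codeword names and the FMA:Cardiac myocyte label are made up for illustration, and this is not the Merger's actual code.

```python
def equivalent_codewords(model_a, model_b):
    """Return (codeword_a, codeword_b) pairs whose annotations are identical.

    Each model is a dict mapping a codeword to its composite annotation,
    written here as a tuple of ontology terms.
    """
    index = {}
    for codeword, annotation in model_b.items():
        index.setdefault(annotation, []).append(codeword)
    pairs = []
    for codeword, annotation in model_a.items():
        for match in index.get(annotation, []):
            pairs.append((codeword, match))
    return pairs


shallow = ("OPB:Chemical concentration", "CHEBI:glucose")
beta_cell = shallow + ("FMA:Portion of cytosol", "FMA:Beta cell")
myocyte = shallow + ("FMA:Portion of cytosol", "FMA:Cardiac myocyte")

# Detailed annotations differ in their last term, so no equivalence is found.
detailed_pairs = equivalent_codewords({"Glc": beta_cell}, {"glucose_c": myocyte})

# Shallow annotations are identical, so the two codewords are paired.
shallow_pairs = equivalent_codewords({"Glc": shallow}, {"glucose_c": shallow})

print(detailed_pairs)
print(shallow_pairs)
```

A matcher that ranked similar annotations, rather than requiring identical ones, could pair the two detailed annotations as well, since they share three of their four terms.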
The Extractor tool provides ways to decompose SemSim models into submodels. This decomposition process is useful if you want to "carve out" a smaller portion of a given model in order to remove extraneous model features.
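One way to think about "carving out" a submodel is as a walk over the model's mathematical dependency network. The Python sketch below assumes that an extraction keeps the codewords you select plus everything they depend on, directly or indirectly. This README does not specify the Extractor's actual rules, so treat this as an illustration only; the codeword names are hypothetical.

```python
def extract(dependencies, selected):
    """Return the sub-network needed to compute the selected codewords.

    `dependencies` maps each codeword to the codewords its equation uses.
    """
    keep = set()
    stack = list(selected)
    while stack:
        codeword = stack.pop()
        if codeword in keep:
            continue  # already visited; this also guards against cycles
        keep.add(codeword)
        stack.extend(dependencies.get(codeword, ()))
    return {codeword: list(dependencies.get(codeword, ())) for codeword in keep}


model = {
    "ATP_rate": ["glucose", "Vmax"],
    "glucose": [],
    "Vmax": [],
    "insulin_release": ["ATP_rate", "Ca"],
    "Ca": [],
}

submodel = extract(model, ["ATP_rate"])
print(sorted(submodel))
```

Selecting ATP_rate keeps glucose and Vmax but leaves out insulin_release and Ca, which are not needed to compute it.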
Below is a step-by-step example of an extraction:
The Merger tool helps automate the integration of two SemSim models. The Merger identifies the interface between two models by comparing the biological meaning of the models' codewords as expressed by their composite and singular annotations. If the two models share the same biological concept, the codewords representing this concept are mapped to each other and the user must decide which computational representation of the concept they want to preserve in the integrated model.
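The mapping and resolution steps described above can be sketched in Python as follows. This is an illustration under stated assumptions, not SemGen's implementation: models are plain dicts, annotations are strings compared for exact equality, and the codeword names, annotation labels, equations, and the `A.`/`B.` prefixes are all hypothetical.

```python
def find_mappings(model_a, model_b):
    """Pair codewords from two models that carry the same annotation.

    Each model maps a codeword to an (annotation, definition) tuple.
    Unannotated codewords (annotation None) are never paired.
    """
    by_annotation = {}
    for codeword, (annotation, _) in model_b.items():
        if annotation is not None:
            by_annotation[annotation] = codeword
    return [
        (codeword, by_annotation[annotation])
        for codeword, (annotation, _) in model_a.items()
        if annotation in by_annotation
    ]


def merge(model_a, model_b, keep):
    """Combine two models; `keep` maps each mapped pair to 'a' or 'b'."""
    mappings = find_mappings(model_a, model_b)
    dropped_a = {a for a, b in mappings if keep[(a, b)] == "b"}
    dropped_b = {b for a, b in mappings if keep[(a, b)] == "a"}
    merged = {}
    for codeword, entry in model_a.items():
        if codeword not in dropped_a:
            merged["A." + codeword] = entry  # prefix avoids name collisions
    for codeword, entry in model_b.items():
        if codeword not in dropped_b:
            merged["B." + codeword] = entry
    return merged


pressure = "OPB:Fluid pressure <propertyOf> FMA:Portion of blood in aorta"
cardiovascular = {
    "Pao": (pressure, "Pao = Vao / Cao"),
    "Vao": (None, "d(Vao)/dt = Fin - Fout"),
}
baroreceptor = {
    "P_aorta": (pressure, "P_aorta = 100"),
    "firing": (None, "firing = k * P_aorta"),
}

mappings = find_mappings(cardiovascular, baroreceptor)
# The user's decision: keep the cardiovascular model's version of the pressure.
merged = merge(cardiovascular, baroreceptor, {("Pao", "P_aorta"): "a"})
print(sorted(merged))
```

In this example the baroreceptor model's fixed pressure is replaced by the computed pressure from the cardiovascular model. A full merge would also need to rewrite references to the discarded codeword (here, `P_aorta` in the firing equation), which this sketch leaves out.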
Below is a step-by-step example of a merge between a cardiovascular dynamics model and a baroreceptor model:
Dr. Maxwell Neal originally developed the SemGen software as part of his dissertation research. Subsequently, he led a team of developers to further augment, test and evaluate SemGen under an R01 grant from the National Library of Medicine (PIs: John Gennari and Brian Carlson) that aimed to accelerate model-driven research. SemGen development has also been funded through the Virtual Physiological Rat project.
Prior contributors to SemGen development include Christopher Thompson, Graham Kim and Ryan James.
SemGen development is currently supported through the Center for Reproducible Biomedical Modeling.
SEMGEN
Copyright (c) 2010-2023 Maxwell Neal, University of Washington.
All rights reserved.
Redistribution and use in source and binary forms, with or without modification, for COMMERCIAL PURPOSES IS PROHIBITED without prior written permission from the author.
Redistribution and use in source and binary forms, with or without modification, are permitted for non-commercial purposes (such as for research, personal use, or educational use), provided that redistribution in any form includes this entire notice in all copies of the software, derivative works, and supporting documentation.
The name of the author or the University of Washington may not be used to endorse or promote software or services derived from this software without prior written permission from the author and/or the University.
THIS SOFTWARE IS PROVIDED "AS IS" AND ANY EXPRESSED OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL MAXWELL NEAL, UNIVERSITY OF WASHINGTON, OR ANY CONTRIBUTORS TO THIS SOFTWARE BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
SemGen uses JxBrowser, which is a proprietary software, owned by TeamDev Ltd. The use of JxBrowser is governed by JxBrowser Product License Agreement http://www.teamdev.com/jxbrowser-licence-agreement. You may not use JxBrowser separately from SemGen without a separate license from TeamDev Ltd. Use of JxBrowser as part of SemGen in any commercial software requires a commercial license from TeamDev Ltd.