VarioML is a flexible framework that serves as a template for developing serialization formats for variation data, focusing on LSDB (Locus Specific Mutation Database) use cases. Several data formats come pre-defined, but VarioML elements also function as building blocks that can be restructured into new data exchange formats for different implementations, from patient-centered formats to aggregated variant data formats. Originally an XML/RelaxNG framework, VarioML has been rebuilt using JSON Schema.
VarioML was first developed in the GEN2PHEN program from 2011 to 2013, and is based on an observation model developed there. A simplified UML model can be found here. VarioML has since been reimplemented in JSON, and we're currently working on improving the documentation with more examples.
VarioML is a collaborative, community-driven specification in active development. If you'd like to collaborate with us, please let us know through admin at varioml.org, or feel free to create an issue.
When using or discussing VarioML, please refer to:
Byrne et al. (2012). VarioML framework for comprehensive variation data representation and exchange. BMC Bioinformatics. 2012 Oct 3;13:254. doi: 10.1186/1471-2105-13-254.
Database references (such as gene and ref_seq) as well as ontology terms should use database abbreviations defined in the MIRIAM registry. For example, use hgnc.symbol for gene names, refseq for NCBI reference sequences, and obo.so for Sequence Ontology references.
The JSON implementation is a clean redesign of the VarioML XML format, and our current focus. A JSON schema is provided to validate your data files, to provide descriptions and examples, and to serve as the template for the documentation.
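As a rough illustration of the MIRIAM prefixes mentioned above, the following Python sketch builds a small variant record and rejects database references whose prefix is not one of the three named in the text. The field names source and accession, the consequence key, the db_ref helper, and the example accessions are assumptions made for illustration only; consult the VarioML JSON schema for the actual structure and use it for real validation.

```python
import json

# Prefixes named in the text above; a real check would consult the MIRIAM registry.
KNOWN_PREFIXES = {"hgnc.symbol", "refseq", "obo.so"}


def db_ref(source, accession):
    """Build a database reference dict.

    The keys 'source' and 'accession' are assumed names, not taken
    from the VarioML schema.
    """
    if source not in KNOWN_PREFIXES:
        raise ValueError(f"unknown database prefix: {source!r}")
    return {"source": source, "accession": accession}


# Hypothetical record: the 'consequence' key and all accessions are illustrative.
variant = {
    "gene": db_ref("hgnc.symbol", "BRCA1"),
    "ref_seq": db_ref("refseq", "NM_007294.4"),
    "consequence": db_ref("obo.so", "SO:0001583"),
}

print(json.dumps(variant, indent=2))
```

Calling db_ref("HGNC", "BRCA1") raises a ValueError in this sketch, because only the lowercase MIRIAM-style prefix hgnc.symbol is accepted.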
Translation to EXI (Efficient XML Interchange) is supported using the EXIficient library. XML validation is supported using Schematron.
VarioML-XML is implemented in Café Variome. See the specification for the implementation.
View a collection of demo applications. Also see validation tools.
VarioML has received funding from the European Community's Seventh Framework Programme (FP7/2007-2013) under grant agreement number 200754 - the GEN2PHEN project.