YODA is an observational data encoding format using YAML.
We developed the YAML Observation Data Archive & Exchange (YODA) File Format as a specification for human-readable, machine-parseable, text-based data files that accommodate the full diversity of critical zone science data -- such as hydrological time series, soil profile geochemistry, and biodiversity transects -- that can be organized with the Observations Data Model v2 (ODM2). Specifically, we designed the YODA File Format to meet the following requirements:
A YODA File follows the data serialization and interchange format of YAML ("YAML Ain't Markup Language"), a superset of JSON (JavaScript Object Notation). Parsing libraries for YAML are available for most modern programming languages.
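Because YAML is a superset of JSON, a JSON document is also valid YAML. The minimal sketch below illustrates this with a made-up record: the keys `site`, `variable`, and `values` are hypothetical and are not taken from the YODA File Specification. It assumes the third-party PyYAML package may or may not be installed, and falls back to the standard-library JSON parser when it is not.

```python
import json

try:
    import yaml  # PyYAML, a third-party package; optional in this sketch
except ImportError:
    yaml = None

# Hypothetical record; the keys are illustrative only, not YODA field names.
JSON_TEXT = '{"site": "example-site", "variable": "discharge", "values": [1.5, 2.0, 2.75]}'


def load_record(text):
    """Parse text with a YAML parser if one is available, else with the JSON parser."""
    if yaml is not None:
        # safe_load builds only plain Python objects (dict, list, str, float, ...).
        return yaml.safe_load(text)
    return json.loads(text)


record = load_record(JSON_TEXT)
print(record["values"])
```

Either parser should yield the same dictionary for this input, which is what makes JSON tooling usable on data that originated as YAML.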
The key feature of a YODA File that distinguishes it from generic YAML is that a YODA File:
YODA Profiles have been developed for common dataset types to define expectations for the data array block and to facilitate data/metadata input forms/templates for the end-user.
A YODA File will be structurally validated against required and optional ODM2 fields and controlled vocabularies using JSON Schema, which provides a means for documenting the YODA File Schema and a set of software tools for validating any JSON file against our schema. This work in progress can be found in the YODA-Tools repository.
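To give a feel for what structural validation involves, here is a hand-rolled sketch in plain Python. It is not JSON Schema and not the YODA File Schema: it only imitates the ideas of required fields, optional fields, and controlled vocabularies, and every field name and vocabulary term in it is hypothetical.

```python
# Hypothetical miniature schema; field names and vocabulary terms are illustrative only.
SCHEMA = {
    "required": ["variable", "unit"],
    "optional": ["comment"],
    "vocabularies": {"unit": {"meter", "second", "degree celsius"}},
}


def validate(record, schema=SCHEMA):
    """Return a list of problems found in record; an empty list means it passed."""
    problems = []
    allowed = set(schema["required"]) | set(schema["optional"])

    # Every required field must be present.
    for field in schema["required"]:
        if field not in record:
            problems.append(f"missing required field: {field}")

    # Fields that are neither required nor optional are reported.
    for field in record:
        if field not in allowed:
            problems.append(f"unknown field: {field}")

    # Fields governed by a controlled vocabulary must use one of its terms.
    for field, terms in schema["vocabularies"].items():
        if field in record and record[field] not in terms:
            problems.append(f"{field} value {record[field]!r} is not in the controlled vocabulary")

    return problems


print(validate({"variable": "water level", "unit": "meter"}))
print(validate({"unit": "furlong", "extra": 1}))
```

A real JSON Schema expresses the same kinds of constraints declaratively (for example with its `required` and `enum` keywords), so the checks live in the schema document rather than in code.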
We are also developing the YODA Tools library, built upon the ODM2PythonAPI, to create YODA Files from our YODA Excel Templates or from an ODM2 database and to import YODA Files into an ODM2 database. YODA Files will thus serve as an interchange format between components of the ODM2 Software Ecosystem.
The draft YODA File Specification and other YODA File documentation provide many design and implementation details, but are presently a work in progress.
The YODA File Format grew out of an effort to substantially extend the CZO Display File specification. The original CZO Display File format was developed in 2010-2011 as a means for US Critical Zone Observatories to share data in a form that was both human-readable and machine-parseable. Its header provides structured metadata that allows the comma-separated data to be ingested into an Observations Data Model 1.1 (ODM1.1) database, such as a CUAHSI HydroServer.
There are many ways to contribute:
This work was supported by National Science Foundation Grants EAR-1224638, EAR-1332257, and ACI-1339834. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.