Binary Array Linked Data: bald

A Python library for managing binary array linked data files.


This library processes payloads of various encodings and produces metadata graphs, in RDF, of the contents.

Input payloads are provided with an identity. Elements within the payload are identified relative to that payload identity, unless they are otherwise identified by prefix or by alias.
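The identification rule above can be sketched roughly as follows. This is an illustration only, not the library's actual code: the function name `identify`, the `prefixes` and `aliases` dictionaries, the `__` separator default, the order in which alias and prefix are checked, and all example URIs are assumptions made for the sketch.

```python
def identify(name, payload_uri, prefixes=None, aliases=None, sep="__"):
    """Return a URI for `name` (hypothetical helper, not part of bald)."""
    prefixes = prefixes or {}
    aliases = aliases or {}
    # An alias maps a bare name directly onto a full URI.
    if name in aliases:
        return aliases[name]
    # A prefixed name is expanded using the prefix's namespace URI.
    # The separator is an assumption of this sketch.
    prefix, found, local = name.partition(sep)
    if found and prefix in prefixes:
        return prefixes[prefix] + local
    # Otherwise the element is identified relative to the payload identity.
    return payload_uri.rstrip("/") + "/" + name


# Hypothetical inputs, chosen only for illustration.
payload = "http://example.org/data/sample.nc"
prefixes = {"ex": "http://example.org/vocab/"}
aliases = {"units": "http://example.org/alias/units"}

for element in ("temperature", "ex__Thing", "units"):
    print(element, "->", identify(element, payload, prefixes, aliases))
```

Only `temperature` falls through to the payload identity here; the other two names are resolved by prefix and by alias respectively.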

Only Object values (attribute values) may be literals, consistent with the RDF interpretation. All Subjects and Predicates are identified with URIs.

Some format-specific implementation exists, because different encodings offer different capabilities.


For netCDF files, the API provides no variable-to-variable reference definitions.

For a variable-to-variable reference to be interpreted, an externally provided alias or predicate graph must explicitly state that a particular attribute has an rdfs:range of or a subclass of that class.
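A minimal sketch of that decision, assuming the set of reference-valued predicates has already been extracted from the external alias or predicate graph. The function name `interpret_attribute`, the variable names and the example URIs are hypothetical and are not taken from the library.

```python
def interpret_attribute(predicate, value, variable_names, reference_predicates):
    """Classify one attribute value as a reference or a literal (illustrative only)."""
    # A value is a reference only when the external graph has declared the
    # predicate as reference-valued AND the value names a variable in the file.
    if predicate in reference_predicates and value in variable_names:
        return ("reference", value)
    # Without an explicit declaration the value stays a plain literal,
    # even if it happens to match a variable name.
    return ("literal", value)


# Hypothetical file contents and externally declared predicates.
variable_names = {"temperature", "pressure"}
reference_predicates = {"http://example.org/vocab/references"}

print(interpret_attribute("http://example.org/vocab/references", "pressure",
                          variable_names, reference_predicates))
print(interpret_attribute("http://example.org/vocab/comment", "pressure",
                          variable_names, reference_predicates))
```

The second call shows the point of the rule: a string that matches a variable name is still a literal unless the predicate has been declared as reference-valued.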

netCDF parsing is working to support the emerging draft Open Geospatial Consortium approach to netCDF-Linked-Data:


HDF files and APIs have an object reference, indicating a reference from one element in the file to another.

These object references shall always be interpreted as references, regardless of the semantics of predicates (attribute names) provided by external graphs.


This library provides some limited validation capabilities.

A valid input payload is expected to be processable into a graph.

In this implementation, validation rules are limited to requiring that provided HTTP URIs resolve.
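One way such a resolution check could look, using only the Python standard library. This is a sketch under assumptions, not the library's validation code: the function name `uri_resolves`, the use of a HEAD request, the 10 second timeout, and the injectable `opener` parameter are all choices made for this example.

```python
import urllib.request


def uri_resolves(uri, opener=urllib.request.urlopen, timeout=10):
    """Return True if an HTTP(S) URI answers with a non-error status.

    Hypothetical helper for illustration; `opener` can be replaced in tests
    so that no network access is needed.
    """
    if not uri.startswith(("http://", "https://")):
        raise ValueError("not an HTTP URI: %r" % uri)
    # HEAD avoids downloading the body; this is an assumption of the sketch.
    request = urllib.request.Request(uri, method="HEAD")
    try:
        with opener(request, timeout=timeout) as response:
            return 200 <= response.status < 400
    except OSError:
        # urllib's URLError and HTTPError, as well as socket timeouts,
        # are all subclasses of OSError.
        return False
```

Calling `uri_resolves("http://example.org/")` performs a real request, so the result depends on network access. Some servers reject HEAD requests, in which case falling back to GET may be preferable.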


Suggested configuration steps

conda create -n ncld-bald
conda config --add channels conda-forge
conda config --add channels bioconda
conda config --set show_channel_urls True
source activate ncld-bald
$ conda install --quiet --file requirements.txt

Alternative steps using virtualenv and pip

virtualenv bald
source bald/bin/activate
pip install -r requirements.txt


# install the bald module
$ python --quiet install

# run the tests
$ python -m unittest discover -s bald.tests -v

Command line tools


An HTML 'hot-linked' version of the ncdump command-line utility. The output of ncldDump is similar to that of ncdump, but with links from the standard names and attribute definitions to their source documentation. See for example output HTML.

Getting started:

$ cd ncldDump
# install requirements
$ pip install -r ../requirements.txt

$ python -a aliases.json -o test.html


A command-line tool that takes a netCDF or CDL file and outputs an RDF or JSON-LD encoding of the content. The RDF can then be imported into RDF triple stores or used with RDF libraries for reasoning, SPARQL querying and the like.

The RDFLib package is used, so the available RDF serialisation options are: n3, nquads, nt, rdfxml, turtle or ttl.

$ cd nc2rdf

# turn CDL into RDF
$ python  test.cdl

# turn NC into RDF
$ python

# specify RDF formats/flavours 
$ python -o turtle