Python library providing tools to visualize CLDF datasets.
To install, run
pip install cldfviz
If you want to create maps in image formats (PNG, JPG, PDF), the cartopy package is needed,
which will be installed with
pip install cldfviz[cartopy]
Note: Since cartopy has quite a few system-level requirements, installation may be somewhat tricky.
Should problems arise, https://scitools.org.uk/cartopy/docs/v0.15/installing.html may help.
If you want to create "treemaps" (i.e. use the lingtreemaps package for CLDF data), you need to install via
pip install cldfviz[lingtreemaps]
cldfviz is implemented as a cldfbench plugin, i.e. it provides subcommands for the cldfbench command.
After installation you should see subcommands with a cldfviz. prefix listed when running
cldfbench -h
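If the help output is long, the check described above can be scripted. The following is a minimal sketch, not part of cldfviz: the helper names `cldfviz_subcommands` and `installed_cldfviz_subcommands` are made up for illustration, and the only assumption about the output of `cldfbench -h` is that the subcommand names appear somewhere in it with the `cldfviz.` prefix.

```python
import re
import shutil
import subprocess


def cldfviz_subcommands(help_text):
    """Return all names with a 'cldfviz.' prefix found in help_text, in order."""
    names = re.findall(r"cldfviz\.\w+", help_text)
    # dict.fromkeys removes duplicates while keeping the first occurrence order.
    return list(dict.fromkeys(names))


def installed_cldfviz_subcommands():
    """Run 'cldfbench -h' and extract the cldfviz subcommands it lists.

    Returns None if the cldfbench command is not available on PATH.
    """
    if shutil.which("cldfbench") is None:
        return None
    result = subprocess.run(["cldfbench", "-h"], capture_output=True, text=True)
    # argparse prints help to stdout; stderr is included in case of warnings.
    return cldfviz_subcommands(result.stdout + result.stderr)
```

Calling `installed_cldfviz_subcommands()` after installation should then return a non-empty list if the plugin was registered.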
Help provided by the CLI is sometimes extensive and can be consulted via
cldfbench <subcommand> -h
e.g.
$ cldfbench cldfviz.map -h
usage: cldfbench cldfviz.map [-h] [--download-dir DOWNLOAD_DIR] [--language-filters LANGUAGE_FILTERS]
[--glottolog GLOTTOLOG] [--glottolog-version GLOTTOLOG_VERSION]
...
A short description of the cldfviz subcommands can be found below; for most subcommands, more detailed
documentation is referenced in the respective section.
Examples in this documentation sometimes use CLDF data stored in the local filesystem. In particular, we'll use
the WALS, Glottolog and APiCS datasets.
If you download these datasets using the cldfbench plugin cldfzenodo
cldfbench zenodo.download 10.5281/zenodo.7385533 --full-deposit
cldfbench zenodo.download 10.5281/zenodo.7398887 --full-deposit
cldfbench zenodo.download 10.5281/zenodo.7139937 --full-deposit
you should have the respective data in local directories wals-2020.3/, glottolog-cldf-4.7/
and cldf-datasets-apics-4ed59b5/.
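Before running the examples, it can be useful to confirm that the downloads ended up where expected. The following is a minimal sketch using only the standard library. The directory names are the ones given above; the helper name `missing_datasets` is made up for illustration.

```python
from pathlib import Path

# Directory names as listed in this documentation.
EXPECTED_DIRS = [
    "wals-2020.3",
    "glottolog-cldf-4.7",
    "cldf-datasets-apics-4ed59b5",
]


def missing_datasets(base="."):
    """Return the expected dataset directories that do not exist below base."""
    base = Path(base)
    return [name for name in EXPECTED_DIRS if not (base / name).is_dir()]
```

An empty list returned by `missing_datasets()` means all three directories are present in the current working directory.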
cldfviz.map
A common way to visualize data from a CLDF StructureDataset is as "dots on a map",
i.e. as WALS-like geographic maps, displaying typological variation.
The cldfviz.map subcommand allows you to create such maps. For details see docs/map.md.
cldfviz.text
A rather traditional visualization of linguistic data is the practice of interspersing bits of data
in descriptive texts, most obviously perhaps as examples formatted as Interlinear Glossed Text.
The cldfviz.text subcommand allows you to "render" documents written in CLDF Markdown, i.e. to convert
such documents to plain markdown by inserting suitable representations of the referenced data.
For details see docs/text.md.
cldfviz.examples
While it is possible to (selectively) include IGT formatted examples in CLDF Markdown via cldfviz.text,
often it is useful to just look at an HTML formatted list of all examples from a dataset. This can
be done via cldfviz.examples. For details see docs/examples.md.
cldfviz.tree
Phylogenetic (or classification) trees of languages are a "proper" CLDF component since CLDF 1.2 - and an obvious candidate for visualization (because no one likes to look at Newick).
To provide a configurable visualization of trees in SVG format, the cldfviz.tree command renders
CLDF trees using the powerful toytree package. For details see docs/tree.md.
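To illustrate why raw Newick is hard to read, here is a minimal sketch that turns a simple Newick string into an indented outline. It is not how cldfviz.tree works (that command relies on toytree). The function names are made up for illustration, and the parser assumes well-formed input without quoted labels or comments; branch lengths, if present, simply stay part of the label.

```python
def parse_newick(text):
    """Parse a simple Newick string into nested (label, children) tuples."""
    text = text.strip().rstrip(";")
    pos = 0

    def node():
        nonlocal pos
        children = []
        if pos < len(text) and text[pos] == "(":
            pos += 1  # skip "("
            children.append(node())
            while text[pos] == ",":
                pos += 1  # skip ","
                children.append(node())
            pos += 1  # skip ")"
        # Everything up to the next structural character is the node label.
        start = pos
        while pos < len(text) and text[pos] not in "(),":
            pos += 1
        return text[start:pos].strip(), children

    return node()


def outline(tree, depth=0):
    """Return the tree as a list of lines, indented by depth."""
    label, children = tree
    lines = ["  " * depth + (label or "*")]  # "*" marks unlabeled nodes
    for child in children:
        lines.extend(outline(child, depth + 1))
    return lines
```

For example, `print("\n".join(outline(parse_newick("((a,b)ab,c)root;"))))` prints the root first and each descendant indented below its parent.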
cldfviz.treemap
Displaying maps and trees is nice, but visualizing how phylogeny relates to geography can also be done in a more integrated way as demonstrated by the lingtreemaps package. cldfviz.treemap provides a front-end for this package, making it possible to use its functionality with data and trees in CLDF datasets.
cldfviz.audiowordlist
Another case where it is often desirable to aggregate objects from different CLDF components for inspection is Wordlists with associated audio files. Displaying forms for a specified concept together with the audio as an HTML page can be done by running cldfviz.audiowordlist.
cldfviz.erd
CLDF datasets typically contain multiple, related tables. The most common visualization of such a data model
is an "entity-relationship diagram", i.e. a diagram of the entity-relationship model
of the dataset. Such a diagram can be created via cldfviz.erd (if a Java runtime is installed).
For details see docs/erd.md.
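The relations shown in such a diagram are declared as foreign keys in the dataset's CSVW metadata. The following minimal sketch lists them from an already loaded metadata dictionary. It is not the cldfviz.erd implementation; the function names are made up, and `EXAMPLE_METADATA` is a hand-written fragment for illustration, not taken from a real dataset.

```python
def as_list(value):
    """CSVW allows columnReference to be a single string or a list of strings."""
    return [value] if isinstance(value, str) else list(value)


def foreign_key_relations(metadata):
    """Return (source_table, source_columns, target_table, target_columns) tuples."""
    relations = []
    for table in metadata.get("tables", []):
        schema = table.get("tableSchema", {})
        for fk in schema.get("foreignKeys", []):
            ref = fk["reference"]
            relations.append((
                table["url"],
                as_list(fk["columnReference"]),
                ref.get("resource"),
                as_list(ref["columnReference"]),
            ))
    return relations


# Hand-written example metadata (illustrative only).
EXAMPLE_METADATA = {
    "tables": [
        {"url": "languages.csv", "tableSchema": {"columns": []}},
        {
            "url": "values.csv",
            "tableSchema": {
                "foreignKeys": [
                    {
                        "columnReference": ["Language_ID"],
                        "reference": {
                            "resource": "languages.csv",
                            "columnReference": ["ID"],
                        },
                    }
                ]
            },
        },
    ]
}
```

For a real dataset, the dictionary could be obtained by loading the dataset's metadata JSON file with `json.load`.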
cldfviz.network
A ParameterNetwork component was added to CLDF with version 1.3, acknowledging that in datasets like CLICS
a network of parameters (established through colexifications in CLICS) acted both as output of the
colexification algorithm and as input for various clustering methods. Since there are many tools
for network analysis available, the main task for a CLDF-based tool is to convert (filtered) parts
of a ParameterNetwork to a format that can serve as input for other tools. This is what cldfviz.network
does. Since Graphviz' DOT format is one of the target formats supported by cldfviz.network,
exploratory analysis is supported by just piping the output into the dot program to create
a network visualization.
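To give an idea of what DOT input for the dot program looks like, here is a minimal sketch that serializes a weighted edge list as an undirected DOT graph and optionally renders it. It does not reproduce the output of cldfviz.network; the function names are made up, and the edge list in the usage note is invented for illustration.

```python
import shutil
import subprocess


def edges_to_dot(edges, name="network"):
    """Serialize (source, target, weight) triples as an undirected DOT graph."""
    lines = [f"graph {name} {{"]
    for source, target, weight in edges:
        lines.append(f'  "{source}" -- "{target}" [weight={weight}];')
    lines.append("}")
    return "\n".join(lines)


def render_svg(dot_text):
    """Pipe DOT text into Graphviz' dot program and return SVG markup.

    Returns None if the dot executable is not available on PATH.
    """
    if shutil.which("dot") is None:
        return None
    result = subprocess.run(
        ["dot", "-Tsvg"], input=dot_text, capture_output=True, text=True, check=True
    )
    return result.stdout
```

With an invented edge list such as `[("HAND", "ARM", 5), ("TREE", "WOOD", 8)]`, `render_svg(edges_to_dot(...))` corresponds to piping DOT output into `dot -Tsvg` on the command line.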
Other tools to convert CLDF data to "human-readable" formats: