
A python library providing tools to visualize data from CLDF datasets.
Apache License 2.0

cldfviz


Python library providing tools to visualize CLDF datasets.

Install

Run

pip install cldfviz

If you want to create maps in image formats (PNG, JPG, PDF), the cartopy package is needed, which will be installed with

pip install cldfviz[cartopy]

Note: Since cartopy has quite a few system-level requirements, installation may be somewhat tricky. Should problems arise, https://scitools.org.uk/cartopy/docs/v0.15/installing.html may help.

If you want to create "treemaps" (i.e. use the lingtreemaps package for CLDF data), you need to install via

pip install cldfviz[lingtreemaps]

CLI

cldfviz is implemented as a cldfbench plugin, i.e. it provides subcommands for the cldfbench command.

After installation you should see subcommands with a cldfviz. prefix listed when running

cldfbench -h
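The same check can be sketched from Python. This is a minimal sketch, assuming that cldfbench discovers plugins through the `cldfbench.commands` entry-point group. The helper name `plugin_entry_points` is hypothetical and not part of cldfviz or cldfbench.

```python
from importlib import metadata

# Assumption: cldfbench plugins register under this entry-point group.
GROUP = "cldfbench.commands"


def plugin_entry_points(group=GROUP):
    """Return sorted (name, target) pairs for all entry points in `group`."""
    eps = metadata.entry_points()
    if hasattr(eps, "select"):
        # Python >= 3.10: selectable entry points
        selected = eps.select(group=group)
    else:
        # Python 3.8 / 3.9: mapping of group name -> entry points
        selected = eps.get(group, [])
    return sorted((ep.name, ep.value) for ep in selected)


if __name__ == "__main__":
    for name, target in plugin_entry_points():
        print(f"{name}: {target}")
```

If the assumption holds and cldfviz is installed, one of the printed entries should point at the cldfviz package. An empty result means no plugin is registered under that group in the current environment.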

Help provided by the CLI is sometimes extensive and can be consulted via

cldfbench <subcommand> -h

e.g.

$ cldfbench cldfviz.map -h
usage: cldfbench cldfviz.map [-h] [--download-dir DOWNLOAD_DIR] [--language-filters LANGUAGE_FILTERS]
                             [--glottolog GLOTTOLOG] [--glottolog-version GLOTTOLOG_VERSION]
...

Commands

A short description of the cldfviz subcommands can be found below; more documentation is available in the docs/ directory of the repository.

Example data

Examples in this documentation sometimes use CLDF data stored in the local filesystem. In particular, we'll use the CLDF datasets for WALS, Glottolog and APiCS.

If you download these datasets using the cldfbench plugin cldfzenodo

cldfbench zenodo.download 10.5281/zenodo.7385533 --full-deposit
cldfbench zenodo.download 10.5281/zenodo.7398887 --full-deposit
cldfbench zenodo.download 10.5281/zenodo.7139937 --full-deposit

you should have the respective data in local directories wals-2020.3/, glottolog-cldf-4.7/ and cldf-datasets-apics-4ed59b5/.
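Before running the examples, it can help to confirm that the three directories are where the documentation expects them. The following is a minimal sketch using only the standard library. The directory names are taken from the text above, while the helper name `missing_datasets` is hypothetical.

```python
from pathlib import Path

# Directory names as stated in this documentation.
EXPECTED_DIRS = (
    "wals-2020.3",
    "glottolog-cldf-4.7",
    "cldf-datasets-apics-4ed59b5",
)


def missing_datasets(base="."):
    """Return the expected dataset directories that do not exist under `base`."""
    base = Path(base)
    return [name for name in EXPECTED_DIRS if not (base / name).is_dir()]


if __name__ == "__main__":
    missing = missing_datasets()
    if missing:
        print("Missing example data:", ", ".join(missing))
    else:
        print("All example datasets found.")
```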

cldfviz.map

A common way to visualize data from a CLDF StructureDataset is as "dots on a map", i.e. as WALS-like geographic maps, displaying typological variation. The cldfviz.map subcommand allows you to create such maps. For details see docs/map.md.


cldfviz.text

A rather traditional visualization of linguistic data is the practice of interspersing bits of data in descriptive texts, most obviously perhaps as examples formatted as Interlinear Glossed Text. The cldfviz.text subcommand allows you to "render" documents written in CLDF markdown, i.e. to convert such documents to plain markdown by inserting suitable representations of the referenced data. For details see docs/text.md.


cldfviz.examples

While it is possible to (selectively) include IGT formatted examples in CLDF Markdown via cldfviz.text, often it is useful to just look at an HTML formatted list of all examples from a dataset. This can be done via cldfviz.examples. For details see docs/examples.md.


cldfviz.tree

Phylogenetic (or classification) trees of languages are a "proper" CLDF component since CLDF 1.2 - and an obvious candidate for visualization (because no one likes to look at Newick).

To provide a configurable visualization of trees in SVG format, the cldfviz.tree command renders CLDF trees using the powerful toytree package. For details see docs/tree.md.
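To illustrate why raw Newick is hard to read, here is a minimal sketch that parses a simple Newick string and prints it as an indented outline. It is not how cldfviz.tree or toytree work internally. It handles only nesting, node names and branch lengths (which are discarded), with no quoting or comments, and the node names in the example are placeholders.

```python
def parse_newick(text):
    """Parse a simple Newick string into nested (name, children) tuples."""
    text = text.strip().rstrip(";")
    pos = 0

    def node():
        nonlocal pos
        children = []
        if pos < len(text) and text[pos] == "(":
            pos += 1  # skip "("
            children.append(node())
            while pos < len(text) and text[pos] == ",":
                pos += 1  # skip ","
                children.append(node())
            if pos >= len(text) or text[pos] != ")":
                raise ValueError("unbalanced parentheses")
            pos += 1  # skip ")"
        start = pos
        while pos < len(text) and text[pos] not in ",()":
            pos += 1
        # Drop an optional ":<branch length>" suffix from the label.
        label = text[start:pos].split(":")[0].strip()
        return (label, children)

    tree = node()
    if pos != len(text):
        raise ValueError("unexpected trailing characters")
    return tree


def render(node, depth=0):
    """Return the tree as a list of lines, indented two spaces per level."""
    name, children = node
    lines = ["  " * depth + (name or "*")]
    for child in children:
        lines.extend(render(child, depth + 1))
    return lines


if __name__ == "__main__":
    print("\n".join(render(parse_newick("((a:1.0,b:2.0)c,d)root;"))))
```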


cldfviz.treemap

Displaying maps and trees is nice, but visualizing how phylogeny relates to geography can also be done in a more integrated way as demonstrated by the lingtreemaps package. cldfviz.treemap provides a front-end for this package, making it possible to use its functionality with data and trees in CLDF datasets.


cldfviz.audiowordlist

Another case where it is often desirable to aggregate objects from different CLDF components for inspection is that of Wordlists with associated audio files. Displaying forms for a specified concept together with the audio as an HTML page can be done by running cldfviz.audiowordlist.


cldfviz.erd

CLDF datasets typically contain multiple, related tables. The most common visualization of such a data model is an "entity-relationship diagram", i.e. a diagram of the entity-relationship model of the dataset. Such a diagram can be created via cldfviz.erd (if a Java runtime is installed). For details see docs/erd.md.


cldfviz.network

A ParameterNetwork component was added to CLDF with version 1.3, acknowledging that in datasets like CLICS a network of parameters (established through colexifications in CLICS) acted both as output of the colexification algorithm and as input for various clustering methods. Since many tools for network analysis are available, the main task for a CLDF-based tool is to convert (filtered) parts of a ParameterNetwork to a format that can serve as input for other tools. This is what cldfviz.network does. Since Graphviz's DOT format is one of the supported target formats, exploratory analysis is possible by simply piping the output into the dot program to create a network visualization.
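For readers unfamiliar with DOT, the following minimal sketch shows what a small undirected, weighted network looks like in that format. It does not reproduce the output of cldfviz.network. The function `to_dot` is hypothetical, and the edge list is made-up illustration data, not taken from CLICS.

```python
def to_dot(edges, name="network"):
    """Serialize undirected (source, target, weight) edges as Graphviz DOT text."""

    def quote(value):
        # DOT double-quoted IDs: escape backslashes and double quotes.
        return '"' + str(value).replace("\\", "\\\\").replace('"', '\\"') + '"'

    lines = [f"graph {quote(name)} {{"]
    for source, target, weight in edges:
        lines.append(f"  {quote(source)} -- {quote(target)} [weight={weight}];")
    lines.append("}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Made-up example edges for illustration only.
    example = [("HAND", "ARM", 3), ("ARM", "WING", 1)]
    print(to_dot(example), end="")
```

Saved as a script (say, `sketch.py`, a hypothetical file name), its output can be piped into Graphviz in the same way as described above: `python sketch.py | dot -Tsvg -o network.svg`.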


Related

Other tools to convert CLDF data to "human-readable" formats: