kravis - A {k}otlin {gra}mmar for data {vis}ualization
Visualizing tabular and relational data is at the core of data science. kravis implements a grammar to create a wide range of plots using a standardized set of verbs.
The grammar implemented by kravis is inspired by ggplot2. In fact, all it provides is a more typesafe wrapper around it. Internally, ggplot2 is used as the rendering engine. The API of kravis is kept highly similar, so that even the excellent ggplot2 cheatsheet can be reused.
R is required to use ggplot2. However, kravis works with various integration backends, such as Docker or remote web services.
This is an experimental API and is subject to breaking changes until a first major release.
An easy way to get started with kravis is Jupyter: you simply need to install the kotlin-jupyter kernel.
See here for a notebook example.
Add the following artifact to your build.gradle:
compile "com.github.holgerbrandl:kravis:0.8.5"
You can also use JitPack with Maven or Gradle to build the latest snapshot as a dependency in your project.
repositories {
maven { url 'https://jitpack.io' }
}
dependencies {
compile 'com.github.holgerbrandl:kravis:-SNAPSHOT'
}
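For projects that use the Gradle Kotlin DSL, the two Groovy snippets above might translate roughly as follows. This is a sketch under assumptions: the file name build.gradle.kts, the `implementation` configuration (the replacement for `compile` in current Gradle versions), and the availability of the release artifact on Maven Central are not stated in this README.

```kotlin
// build.gradle.kts (sketch, not taken from the kravis repository)
repositories {
    mavenCentral()                              // assumed home of the release artifact
    maven { url = uri("https://jitpack.io") }   // only needed for snapshot builds
}

dependencies {
    // release version as listed above
    implementation("com.github.holgerbrandl:kravis:0.8.5")
    // or, for the latest snapshot via JitPack:
    // implementation("com.github.holgerbrandl:kravis:-SNAPSHOT")
}
```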
To build and install it into your local maven cache, simply clone the repo and run
./gradlew install
Let's start by analyzing mammalian sleep patterns:
import kravis.*
import org.jetbrains.kotlinx.dataframe.datasets.sleepData
sleepData
.add("rem_proportion") { "sleep_rem"<Double>() / "sleep_total"<Double>() }
// Analyze correlation
.plot(x = "sleep_total", y = "rem_proportion", color = "vore", size = "brainwt")
.geomPoint(alpha = 0.7)
.guides(size = LegendType.none)
.title("Correlation between dream and total sleep time")
Find more examples in our gallery (coming soon).
ggplot2, and thus kravis, implements a grammar of graphics to build plots from
aesthetics
+ layers
+ coordinate system
+ transformations
+ facets
which reads as: map variables from data space to visual space
+ add one or more layers
+ configure the coordinate system
+ optionally apply statistical transformations
+ optionally add facets.
That's the way!
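To make the grammar concrete, here is a minimal, self-contained sketch of how such a specification could be assembled and rendered as a ggplot2 expression. It is a toy model, not kravis' implementation: `PlotSpec`, `addLayer`, `withCoord`, `withFacet`, and `render` are hypothetical names, and statistical transformations are simply treated as further layers.

```kotlin
// Toy model of a layered plot specification; these are NOT kravis' classes.
data class PlotSpec(
    val aesthetics: Map<String, String>,      // aesthetic name -> variable name
    val layers: List<String> = emptyList(),   // e.g. "geom_point()" or "stat_smooth()"
    val coord: String? = null,                // e.g. "coord_flip()"
    val facet: String? = null                 // e.g. "facet_wrap(~vore)"
) {
    fun addLayer(call: String) = copy(layers = layers + call)
    fun withCoord(call: String) = copy(coord = call)
    fun withFacet(call: String) = copy(facet = call)

    // Join all components with "+", mirroring how ggplot2 composes a plot.
    fun render(dataName: String): String {
        val aes = aesthetics.entries.joinToString(", ") { "${it.key} = ${it.value}" }
        val parts = listOf("ggplot($dataName, aes($aes))") + layers + listOfNotNull(coord, facet)
        return parts.joinToString(" + ")
    }
}

fun main() {
    val spec = PlotSpec(mapOf("x" to "sleep_total", "y" to "sleep_rem"))
        .addLayer("geom_point()")
        .withFacet("facet_wrap(~vore)")
    // prints: ggplot(msleep, aes(x = sleep_total, y = sleep_rem)) + geom_point() + facet_wrap(~vore)
    println(spec.render("msleep"))
}
```

Each builder call returns a new specification, so the order of calls mirrors the order of the grammar components listed above.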
Every Iterable<T> is a valid data source for kravis, which allows you to create plots using a type-safe builder DSL. Essentially, we first digest it into a table and then use that table as the data source for visualization. Here's an example:
// deparse records using property references (which allows variable names to be inferred via reflection)
val basePlot = sleepPatterns.plot(
x = SleepPattern::sleep_rem,
y = SleepPattern::sleep_total,
color = SleepPattern::vore,
size = SleepPattern::brainwt
)
basePlot
.geomPoint()
.title("Correlation of total sleep and REM sleep by food preference")
.show()
In the previous example we used property references. kravis also supports an extractor lambda syntax, which allows for on-the-fly data transformations when deparsing an Iterable<T>. The (not yet solved) disadvantage is that axis labels need to be assigned manually:
sleepPatterns
.plot(x = { sleep_total/60 })
.geomHistogram()
.xLabel("sleep[h]")
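The difference between the two styles can be illustrated with a small standalone sketch of digesting an Iterable<T> into named columns. It does not reproduce kravis' internals: `Sleep`, `Column`, and the two `column` helpers are hypothetical. It shows why a property reference can label a variable automatically while a lambda cannot.

```kotlin
import kotlin.reflect.KProperty1

// Hypothetical record type, for illustration only.
data class Sleep(val species: String, val sleepTotal: Double)

// A column is a variable name plus one value per record.
data class Column(val name: String, val values: List<Any?>)

// A property reference carries its own name, so the column can be labeled automatically.
fun <T> Iterable<T>.column(prop: KProperty1<T, *>): Column =
    Column(prop.name, map { prop.get(it) })

// A lambda is opaque: the caller has to supply a name for the resulting column.
fun <T> Iterable<T>.column(name: String, extract: T.() -> Any?): Column =
    Column(name, map { it.extract() })

fun main() {
    val records = listOf(Sleep("cat", 750.0), Sleep("dog", 600.0))
    println(records.column(Sleep::sleepTotal))               // name taken from the property
    println(records.column("sleep[h]") { sleepTotal / 60 })  // name given by hand
}
```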
And here's another example using a custom data class:
enum class Gender { male, female }
data class Person(val name: String, val gender: Gender, val heightCm: Int, val weightKg: Double)
// define some persons
val persons = listOf(
Person("Max", Gender.male, 192, 80.3),
Person("Anna", Gender.female, 162, 56.3),
Person("Maria", Gender.female, 172, 66.3)
)
// visualize sizes by gender
persons.plot(x = {name}, y = { weightKg }, fill = { gender.toString() })
.geomCol()
.xLabel("name")
.yLabel("weight [kg]")
.title("Body Size Distribution")
kravis can handle any kind of tabular data via data frames:
import kravis.*
import org.jetbrains.kotlinx.dataframe.datasets.irisData
irisData.plot(x="Species" , y="Petal.Length" )
.geomBoxplot()
.geomPoint(position = PositionJitter(width = 0.1), alpha = 0.3)
.title("Petal Length by Species")
kravis auto-detects the environment and tries to guess the most reasonable output device to show your plots. The following output devices are available.
By default, kravis renders as png on all devices, but it also supports vector rendering using svg as the output format.
The preferred output device can be configured using the SessionPrefs object:
SessionPrefs.OUTPUT_DEVICE = SwingPlottingDevice()
Currently, kravis provides three options to bind an R engine, which is required to render plots.
A local R installation is the default mode, which can be configured using
SessionPrefs.RENDER_BACKEND = LocalR()
Alternatively, R can run inside a Docker container:
SessionPrefs.RENDER_BACKEND = Docker()
By default, this pulls and uses the container rocker/tidyverse:3.5.1, but it can be configured to use custom images as needed.
An (optionally) remote backend based on Rserve.
Simply install the corresponding R package and start the daemon with
R -e "install.packages('Rserve',,'http://rforge.net/',type='source')"
R CMD Rserve
For configuration details see https://www.rforge.net/Rserve/doc.html
Alternatively, in case you don't have or don't want a local R installation, you can also run it dockerized, locally or remotely, with
# docker run -p <public_port>:<private_port> -d <image>
docker run -dp 6311:6311 holgerbrandl/kravis_rserve
See Dockerfile for the spec of this image.
To use the Rserve backend, configure the kravis SessionPrefs accordingly by pointing to the correct host and port:
SessionPrefs.RENDER_BACKEND = RserveEngine(host="localhost", port=6302)
Plots, like dataframe data frames, are immutable.
val basePlot = mpgData.plot("displ" to x, "hwy" to y).geomPoint()
// create one version with adjusted axis text size
basePlot.theme(axisText = ElementText(size = 20.0, color = RColor.red))
// create another version with unchanged axis labels but using a log scale instead
basePlot.scaleXLog10()
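What immutability means in practice can be shown with a minimal stand-in class. `Plot`, `withLayer`, and `withScale` below are hypothetical and far simpler than the real plot type in kravis. Every modification returns a new object, so several variants can be derived from one base plot without affecting it.

```kotlin
// Minimal stand-in for an immutable plot; the real kravis plot class is richer.
data class Plot(
    val layers: List<String> = emptyList(),
    val scales: List<String> = emptyList()
) {
    // Both methods return a modified copy and leave the receiver untouched.
    fun withLayer(layer: String) = copy(layers = layers + layer)
    fun withScale(scale: String) = copy(scales = scales + scale)
}

fun main() {
    val base = Plot().withLayer("geom_point")
    val logScaled = base.withScale("scale_x_log10")
    val withBoxes = base.withLayer("geom_boxplot")

    println(base)       // still only the point layer and no scales
    println(logScaled)  // point layer plus a log scale
    println(withBoxes)  // point and boxplot layers, no scales
}
```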
Currently, we map just a subset of the ggplot2 API.
Feel welcome to submit a ticket or PR if an important use case is missing.
Since kravis just mimics some parts of ggplot2, and because users may want to create more custom plots, we support preambles (e.g. to define new geoms) and custom layer specs.
Example:
irisData.plot(x = "Species", y = "Sepal.Length", fill = "Species")
.addPreamble("""devtools::source_url("https://git.io/fAiQN")""")
.addCustom("""geom_flat_violin(scale = "count", trim = FALSE)""")
.geomDotplot(binaxis = "y", dotsize = 0.5, stackdir = "down", binwidth = 0.1, position = PositionNudge(-0.025))
.theme(legendPosition = "none")
.labs(x = "Species", y = "Sepal length (cm)")
Run the following commands.
cd misc/docker/kravis_test/ && docker build --progress=plain -t kravis_test .
./gradlew test
You don't like it? Here are some other projects which may better suit your purpose. Before you leave, consider dropping us a ticket with some comments about what's missing, badly designed, or simply broken in kravis.
GGplot Wrappers
Other JVM visualization libraries, ordered by (personally biased) usefulness
Other
Vega-lite based
Thanks to the vega-lite team for making this project possible.
Thanks to the ggplot2 team for providing the best data vis API to date.