plot.data is an R package for creating client-ready data for various plots and visualizations. Data can be returned as either a data.table or a JSON file. The JSON file also includes additional information useful for rendering plot widgets (e.g., the recommended range and step for a bin width slider that accompanies a histogram).
Use the R package remotes to install plot.data. From the R command prompt:

```R
remotes::install_github('VEuPathDB/plot.data')
# or to install a specific version
remotes::install_github('VEuPathDB/plot.data', 'v1.2.3')
```
All `plot.data` functions require at least the following arguments: a data object (`data`) and a `VariableMetadataList` object (`variables`).
### Example 1: Histogram

```R
library(data.table)
library(plot.data)

# Data object is a data.table of raw values to bin and count
df <- data.table('entity.xvar' = rnorm(100))

# VariableMetadataList object
variables <- new("VariableMetadataList",
  new("VariableMetadata",
    variableClass = new("VariableClass", value = 'native'),
    variableSpec = new("VariableSpec", variableId = 'xvar', entityId = 'entity'),
    plotReference = new("PlotReference", value = 'xAxis'),
    dataType = new("DataType", value = 'NUMBER'),
    dataShape = new("DataShape", value = 'CONTINUOUS')
  )
)

# Pass the data.table created above as the data argument
histogram(df, variables, value='count', binWidth=NULL, binReportValue='binWidth', viewport=NULL)
```
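To make "bin and count" concrete, here is a minimal base R sketch of the kind of summary a histogram needs. It is not how `plot.data` computes its bins: the helper name `binAndCount` and its output columns are assumptions made for illustration only.

```R
# Hypothetical helper (not part of plot.data): assign each value to a
# fixed-width, left-closed bin and count the values per bin.
binAndCount <- function(x, binWidth) {
  stopifnot(is.numeric(x), length(x) > 0, !anyNA(x), binWidth > 0)
  # Align the first bin edge to a multiple of binWidth at or below min(x)
  lower <- floor(min(x) / binWidth) * binWidth
  # 1-based bin index for every value
  idx <- floor((x - lower) / binWidth) + 1
  # Guard against floating-point rounding at the lower edge
  idx <- pmax(idx, 1)
  nBins <- max(idx)
  data.frame(
    binStart = lower + (seq_len(nBins) - 1) * binWidth,
    binEnd   = lower + seq_len(nBins) * binWidth,
    count    = tabulate(idx, nbins = nBins)
  )
}

set.seed(1)
x <- rnorm(100)
binned <- binAndCount(x, binWidth = 0.5)
print(binned)
```

Every input value lands in exactly one bin, so the counts sum to the number of input values. A client-side bin width slider would trigger the same kind of recomputation with a new `binWidth`.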
### Example 2: Scatter with overlay
```R
# Example dataset
df <- data.table('entity.xvar' = rnorm(100),
                 'entity.yvar' = rnorm(100),
                 'entity.overlay' = sample(c('red','green','blue'), 100, replace=TRUE))

# VariableMetadataList object
variables <- new("VariableMetadataList",
  new("VariableMetadata",
    variableClass = new("VariableClass", value = 'native'),
    variableSpec = new("VariableSpec", variableId = 'xvar', entityId = 'entity'),
    plotReference = new("PlotReference", value = 'xAxis'),
    dataType = new("DataType", value = 'NUMBER'),
    dataShape = new("DataShape", value = 'CONTINUOUS')
  ),
  new("VariableMetadata",
    variableClass = new("VariableClass", value = 'native'),
    variableSpec = new("VariableSpec", variableId = 'overlay', entityId = 'entity'),
    plotReference = new("PlotReference", value = 'overlay'),
    dataType = new("DataType", value = 'STRING'),
    dataShape = new("DataShape", value = 'CATEGORICAL')
  ),
  new("VariableMetadata",
    variableClass = new("VariableClass", value = 'native'),
    variableSpec = new("VariableSpec", variableId = 'yvar', entityId = 'entity'),
    plotReference = new("PlotReference", value = 'yAxis'),
    dataType = new("DataType", value = 'NUMBER'),
    dataShape = new("DataShape", value = 'CONTINUOUS')
  )
)

# Returns the name of a json file where scatterplot-ready plotting data can be found.
scattergl(df,
          variables,
          value='bestFitLineWithRaw')
```
### Example 3: Box

```R
# Example dataset
df <- data.table('entity.xvar' = sample(letters[1:5], 100, replace=TRUE),
                 'entity.yvar' = rnorm(100),
                 'entity.overlay' = sample(c('facet1','facet2','facet3'), 100, replace=TRUE))

# VariableMetadataList object
variables <- new("VariableMetadataList",
  new("VariableMetadata",
    variableClass = new("VariableClass", value = 'native'),
    variableSpec = new("VariableSpec", variableId = 'xvar', entityId = 'entity'),
    plotReference = new("PlotReference", value = 'xAxis'),
    dataType = new("DataType", value = 'STRING'),
    dataShape = new("DataShape", value = 'CATEGORICAL')
  ),
  new("VariableMetadata",
    variableClass = new("VariableClass", value = 'native'),
    variableSpec = new("VariableSpec", variableId = 'overlay', entityId = 'entity'),
    plotReference = new("PlotReference", value = 'overlay'),
    dataType = new("DataType", value = 'STRING'),
    dataShape = new("DataShape", value = 'CATEGORICAL')
  ),
  new("VariableMetadata",
    variableClass = new("VariableClass", value = 'native'),
    variableSpec = new("VariableSpec", variableId = 'yvar', entityId = 'entity'),
    plotReference = new("PlotReference", value = 'yAxis'),
    dataType = new("DataType", value = 'NUMBER'),
    dataShape = new("DataShape", value = 'CONTINUOUS')
  )
)

# Returns the name of a json file where boxplot-ready plotting data can be found.
box(df,
    variables,
    points='outliers',
    mean=FALSE,
    computeStats=FALSE)
```
Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.
Please make sure to update tests as appropriate.
Before we begin, a few definitions:

See `dataType` and `dataShape` for more about variable types used in this package.

If our goal is to add a new plot, we first ask whether the addition should be an entirely new plot class or an add-on to an existing plot class. We try to follow this rule when deciding how and where to add new functionality:
Rule: Plot classes correspond to abstract plot types.
Let's take the beeswarm plot as an illustrative example. Is a beeswarm a plot type distinct enough from both box and scatter to deserve its own class? The beeswarm is similar to box in that it is meant to show a distribution of a continuous variable split across a categorical variable. However, the beeswarm in itself does not require summary points such as median, quartiles, etc. Since a beeswarm maps samples to points, perhaps it should instead be an option in the scatter class? While true, note that the variable constraints for a beeswarm and a scatterplot differ: a beeswarm takes categorical variables on the independent axis while a scatterplot does not. Therefore, let's give the beeswarm its own class.
### plot.data class files

Each `plot.data` class has a similar setup within its "class-plotdata-{plot name}.R" file:

- A class constructor (e.g., `newBeeswarmPD`). Each constructor begins by creating a `plotdata` object (`newPlotdata`).
- A function that returns a data table (e.g., `beeswarm.dt`). This function calls the class constructor. The resulting data table has columns corresponding to plottable elements and rows corresponding to groups. For example, the output of `box.dt` with one facet variable will have as many rows as unique facet variable values, and columns such as "labels", "min", "median", and so on.
- A function that writes the plot data to a JSON file (e.g., `beeswarm`).
- A validation function (e.g., `validateBeeswarmPD`).

### Testing

This package uses the `testthat` package for testing. Each plotdata class should have a corresponding test context, i.e., a file called "test-{plot name}.R" in the tests/testthat directory. Tests written in this file should be basic unit tests, for example checking that the created object is of the appropriate class and size. See `test-beeswarm.R` for an example.

The tests should follow the general organization below:

- `getJSON` output structure.

Use `devtools::test()` to run all unit tests in this package. See the devtools documentation for more details.
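The class-file layout and the "appropriate class and size" checks described above can be sketched as follows. This is a simplified, hypothetical skeleton in base R, not `plot.data`'s actual source: the argument lists are assumptions, and `newPlotdataStub` is a stand-in for the real `newPlotdata` so that the sketch runs on its own.

```R
# Hypothetical, simplified skeleton of "class-plotdata-beeswarm.R".

# Stand-in for plot.data's newPlotdata(); the real function does much more.
newPlotdataStub <- function(data) {
  structure(data, class = c("plotdata", class(data)))
}

# Class constructor: start from a plotdata object, then add the subclass.
newBeeswarmPD <- function(data) {
  pd <- newPlotdataStub(data)
  class(pd) <- c("beeswarm", class(pd))
  validateBeeswarmPD(pd)
}

# Function that returns the data table by calling the constructor.
beeswarm.dt <- function(data) {
  newBeeswarmPD(data)
}

# Function that builds the object and returns the name of the file written.
beeswarm <- function(data) {
  pd <- beeswarm.dt(data)
  outFile <- tempfile(fileext = ".json")
  writeLines("{}", outFile)  # placeholder; real code would serialize pd
  outFile
}

# Validator: stop on malformed objects, otherwise return the object.
validateBeeswarmPD <- function(pd) {
  stopifnot(inherits(pd, "plotdata"), is.data.frame(pd))
  pd
}

# Basic unit checks: is the object of the expected class and size?
# In the package these would be testthat expectations such as
# expect_s3_class() and expect_equal() inside a test_that() block.
df <- data.frame(entity.xvar = sample(letters[1:3], 30, replace = TRUE),
                 entity.yvar = rnorm(30))
dt <- beeswarm.dt(df)
stopifnot(inherits(dt, "beeswarm"), inherits(dt, "plotdata"), nrow(dt) == 30)
```

The validator is called at the end of the constructor so that every object leaving `newBeeswarmPD` has already been checked. That is one possible design, not necessarily the one the package uses.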
### Helpers

Helper functions are organized into those that compute values per group (`group.R`), per panel (`panel.R`), handle binning (`bin.R`), or fall into various other categories (see `utils` and `utils-*.R`). Using the beeswarm as an example, we can add `groupMedian` to `group.R`, which computes the median of the dataset per group (overlay, panel).
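As an illustration of what a per-group helper computes, here is a minimal base R sketch of a median-per-group function. The signature (column names passed as strings, optional `group` and `panel`) is an assumption, and the real `groupMedian` in `group.R` may look quite different.

```R
# Hypothetical per-group helper, not plot.data's actual groupMedian().
# y, group, and panel are column names given as strings.
groupMedian <- function(data, y, group = NULL, panel = NULL) {
  byCols <- c(group, panel)
  if (length(byCols) == 0) {
    # No grouping variables: a single median for the whole dataset
    return(data.frame(median = median(data[[y]], na.rm = TRUE)))
  }
  # One row per unique combination of the grouping columns
  out <- aggregate(data[[y]], by = data[byCols], FUN = median, na.rm = TRUE)
  names(out)[ncol(out)] <- "median"
  out
}

df <- data.frame(entity.overlay = c("a", "a", "b", "b", "b"),
                 entity.yvar = c(1, 3, 2, 4, 6))
res <- groupMedian(df, y = "entity.yvar", group = "entity.overlay")
print(res)
# medians: a -> 2, b -> 4
```

The result has one row per group, which matches the "rows corresponding to groups" shape described for the `.dt` functions.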
### Exporting functions

Now that we've created a new plot, we'd like to use it! Add the relevant functions to `NAMESPACE` with `devtools::document()` so that the new functions are properly exported and can be used when someone loads `plot.data`.
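`devtools::document()` runs roxygen2, which regenerates `NAMESPACE` from `@export` tags in the comment block above each function. Assuming the package follows that convention, exporting a new function might look like the sketch below. The title, parameter description, and function body are placeholders, not the package's actual source.

```R
#' Beeswarm plot data as JSON
#'
#' Hypothetical documentation block for illustration only.
#'
#' @param data a data.table of values to plot
#' @return the name of a json file
#' @export
beeswarm <- function(data) {
  # Placeholder body; the real function would build and write the plot data
  invisible(NULL)
}
```

After `devtools::document()` is run on a package containing this block, `NAMESPACE` should contain the line `export(beeswarm)`.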