=head1 TALK
I just gave a talk about this at L<SCaLE 17x|https://www.socallinuxexpo.org/scale/17x>. Here are the L<video of the talk|https://www.youtube.com/watch?v=Qvb_uNkFGNQ&t=12830s> and the L<"slides"|https://github.com/dkogan/talk-feedgnuplot-vnlog/blob/master/feedgnuplot-vnlog.org>.
=head1 NAME
feedgnuplot - General purpose pipe-oriented plotting tool
=head1 SYNOPSIS
Simple plotting of piped data:
 $ seq 5 | awk '{print 2*$1, $1*$1}'
 2 1
 4 4
 6 9
 8 16
 10 25

 $ seq 5 | awk '{print 2*$1, $1*$1}' |
   feedgnuplot \
     --lines \
     --points \
     --title "Test plot" \
     --y2 1 \
     --unset key \
     --unset grid
Simple real-time plotting example: plot how much data is received on the wlan0 network interface in bytes/second. This plot updates at 1 Hz and shows the last 10 seconds of history. The plot shown here is the final state of a sample run.
 $ while true; do sleep 1; cat /proc/net/dev; done \
   | gawk '/wlan0/ {if(b) {print $2-b; N++; fflush()} b=$2} N==15 {exit}' \
   | feedgnuplot \
       --lines \
       --title "wlan0 throughput" \
       --stream \
       --xlen 10 \
       --ylabel 'Bytes/sec' \
       --xlabel seconds \
       --unset key \
       --unset grid
=head1 DESCRIPTION
This is a flexible, command-line-oriented frontend to Gnuplot. It creates plots from data coming in on STDIN or given in a filename passed on the command line. Various data representations are supported, as is hardcopy output and streaming display of live data. For a tutorial and a gallery please see the guide at L<https://github.com/dkogan/feedgnuplot/blob/master/guide/guide.org>
A simple example:
 $ seq 5 | awk '{print 2*$1, $1*$1}' | feedgnuplot
You should see a plot with two curves. The C
The most commonly used functionality of gnuplot is supported directly by the script. Anything not directly supported can still be done with options such as C<--set>, C<--cmds>, C<--style>, etc. Arbitrary gnuplot commands can be passed in with C<--cmds>. For example, to turn off the grid, you can pass in C<--cmds 'unset grid'>. The C<--set> and C<--unset> options exist to provide nicer syntax, so this is equivalent to passing C<--unset grid>. As many of these options as needed can be passed in. To add arbitrary curve styles, use C<--style curveID extrastyle>. Pass these more than once to affect more than one curve.
To apply an extra style to I
=head2 Data formats
By default, each value present in the incoming data represents a distinct data point, as demonstrated in the original example above (we had 10 numbers in the input and 10 points in the plot). If requested, the script supports more sophisticated interpretations of the input data.
=head3 Domain selection
If C<--domain> is passed in, the first value on each line of input is
interpreted as the I
 $ seq 5 | awk '{print 2*$1, $1*$1}' | feedgnuplot --domain
we get only 1 curve, with B<2,4,6,8,10> as the I
=head3 Curve indexing
We index the curves in one of 3 ways: sequentially, explicitly with a C<--dataid> or by C<--vnlog> headers.
By default, each column represents a separate curve. The first column (after any domain) is curve C<0>. The next one is curve C<1> and so on. This is fine unless sparse data is to be plotted. With the C<--dataid> option, each point is represented by 2 values: a string identifying the curve, and the value itself. If we add C<--dataid> to the original example:
 $ seq 5 | awk '{print 2*$1, $1*$1}' | feedgnuplot --dataid --autolegend
we get 5 different curves with one point in each. The first column, as produced
by C
If we're plotting C
The C<--autolegend> option adds a legend using the given IDs to label the curves. The IDs need not be numbers; generic strings are accepted. As many points as desired can appear on a single line. C<--domain> can be used in conjunction with C<--dataid> or C<--vnlog>.
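As a rough sketch of the sparse-data case, the snippet below feeds two curves that do not both report on every line. The curve IDs C<temp> and C<load>, the sample values and the helper name C<sparse_data> are invented for illustration; only the options themselves come from this document. The plot command is attempted only if feedgnuplot is installed, and it uses C<--dump> so that no plot window is needed.

```shell
# Hypothetical sample data: each line is "x id value [id value ...]".
# The IDs "temp" and "load" are made up for this sketch.
sparse_data() {
  printf '%s\n' \
    '1 temp 20.5 load 0.3' \
    '2 temp 21.0' \
    '3 load 0.7' \
    '4 temp 21.4 load 0.5'
}

# Show the raw input
sparse_data

# Plot only if feedgnuplot is available. --dump prints what would be sent
# to gnuplot instead of opening a plot window.
if command -v feedgnuplot >/dev/null 2>&1; then
  sparse_data |
    feedgnuplot --domain --dataid --autolegend --lines --points --dump || true
fi
```

Dropping C<--dump> should bring up the plot itself, with one legend entry per ID.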
=head3 Multi-value style support
Depending on how gnuplot is plotting the data, more than one value may be needed
to represent the range of a single point. Basic 2D plots have 2 numbers
representing each point: 1 domain and 1 range. But if plotting with
C<--circles>, for instance, then there's an extra range value: the radius. Many
other gnuplot styles require more data: errorbars, variable colors (C<with
points palette>), variable sizes (C
Specific example: if making a 2d plot of y error bars, the exact format can be
queried by running C
 $ echo '1 2 0.3
         2 3 0.4
         3 4 0.5' | feedgnuplot --domain --rangesizeall 2 --with 'yerrorbars'

 $ echo '1 2 0.3
         2 3 0.4
         3 4 0.5' | feedgnuplot --domain --tuplesizeall 3 --with 'yerrorbars'

 $ echo '1 2 1.7 2.3
         2 3 2.6 3.4
         3 4 3.5 4.5' | feedgnuplot --domain --rangesizeall 3 --with 'yerrorbars'
=head3 3D data
To plot 3D data, pass in C<--3d>. C<--domain> MUST be given when plotting 3D
data to avoid domain ambiguity. If 3D data is being plotted, there are by
definition 2 domain values instead of one (I
=head3 Time/date data
If the input data domain is a time/date, this can be interpreted with
C<--timefmt>. This option takes a single argument: the format to use to parse
the data. The format is documented in 'set timefmt' in gnuplot, although the
common flags that C
=over
=item
C<--xlen> and C<--binwidth> are I
=item
C<--xmin> and C<--xmax> I
=back
Using this option changes both the way the input is parsed I
$ sar 1 -1 | awk '$1 ~ /..:..:../ && $8 ~/^[0-9.]*$/ {print $1,$8; fflush()}' | feedgnuplot --stream --domain --lines --timefmt '%H:%M:%S' --set 'format x "%H:%M:%S"'
This plots the 'idle' CPU consumption against time.
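For experimenting without a live data source, a minimal sketch along the same lines can use a few hard-coded samples. The sample values and the helper name C<timed_data> are hypothetical; the options are the same ones used in the C<sar> example. The plot command runs only if feedgnuplot is installed, and C<--dump> is used so that the gnuplot input is printed rather than displayed.

```shell
# Hypothetical samples: "HH:MM:SS value", one point per line
timed_data() {
  printf '%s\n' \
    '12:00:00 97.5' \
    '12:00:01 93.1' \
    '12:00:02 98.8'
}

# Show the raw input
timed_data

# The first column is parsed with --timefmt; the x-axis tics are formatted
# with the matching gnuplot 'set format x' command.
if command -v feedgnuplot >/dev/null 2>&1; then
  timed_data |
    feedgnuplot --domain --lines --points \
                --timefmt '%H:%M:%S' \
                --set 'format x "%H:%M:%S"' \
                --dump || true
fi
```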
Note that while gnuplot supports the time/date on any axis, I
=head3 'using' expressions
We just described how feedgnuplot parses its input data. When passing this data to gnuplot, each curve is sent independently. The domain appears in the leading columns followed by C<--rangesize> columns to complete each row. Without C<--domain>, feedgnuplot explicitly writes out sequential integers. gnuplot then knows how many values it has for each point, and it knows which style we're using, so it's able to interpret the data appropriately, and to make the correct plot.
As an example, if gnuplot is passed 2 columns of data, and it is plotting C<with
points>, it will use column 1 for the x coordinate and column 2 for the y
coordinate. This is the default behavior, but the meaning of each column can be
controlled via a C
That's how I
seq 100 | feedgnuplot --lines --usingall '1:($2*$2)'
This is powerful, but there are some things to keep in mind:
=over
=item
C<--using> overrides whatever C
=item
The C<--tuplesize> controls the data passed to feedgnuplot and the data then
passed to gnuplot. It does I
seq 10 | feedgnuplot --with 'points pt 7 palette' --usingall '1:2:2'
Here feedgnuplot read 1 column of data. It defaulted to C<--tuplesize 2>, so it
passed 2 columns of data to gnuplot. gnuplot then produced 3 values for each
point, and plotted them as indicated with the C
=item
You I
 seq 100 | \
 awk '{print $1,$1}' | \
 feedgnuplot \
   --cmds 'sum=0' \
   --cmds 'accum(x) = (sum=sum+x)' \
   --using 1 '1:(accum($2))' \
   --lines --y2 1
=back
=head2 Real-time streaming data
To plot real-time data, pass in the C<--stream [refreshperiod]> option. Data
will then be plotted as it is received. The plot will be updated every
C
To plot only the most recent data (instead of I
=head3 Special data commands
If we are reading streaming data, the input stream can contain special commands in addition to the raw data. Feedgnuplot looks for these at the start of every input line. If a command is detected, the rest of the line is discarded. These commands are:
=over
=item C
This command refreshes the plot right now, instead of waiting for the next refresh time indicated by the timer. This command works in addition to the timed refresh, as indicated by C<--stream [refreshperiod]>.
=item C
This command clears out the current data in the plot. The plotting process
continues, however, to any data following the C
=item C
This command causes feedgnuplot to exit.
=back
=head2 Hardcopy output
The script is able to produce hardcopy output with C<--hardcopy outputfile>. The
output type can be inferred from the filename, if B<.ps>, B<.eps>, B<.pdf>,
B<.svg>, B<.png> or B<.gp> is requested. If any other file type is requested,
C<--terminal> I
The B<.gp> output is special. Instead of asking gnuplot to plot to a particular terminal, writing to a B<.gp> simply dumps a self-executable gnuplot script into the given file. This is similar to what C<--dump> does, but writes to a file, and makes sure that the file can be self-executing.
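A minimal sketch of both kinds of hardcopy output follows. The helper name C<gen_data> and the temporary output directory are assumptions made for this example; the options are the ones described above. The feedgnuplot calls run only if feedgnuplot is installed.

```shell
# Hypothetical helper producing the same data as the SYNOPSIS example
gen_data() { seq 5 | awk '{print 2*$1, $1*$1}'; }

# Scratch directory for the output files (an arbitrary choice for this sketch)
out_dir=$(mktemp -d)

if command -v feedgnuplot >/dev/null 2>&1; then
  # The output type is inferred from the .png extension
  gen_data | feedgnuplot --lines --points --hardcopy "$out_dir/plot.png" || true

  # A .gp filename asks for a self-executable gnuplot script instead
  gen_data | feedgnuplot --lines --points --hardcopy "$out_dir/plot.gp" || true

  ls -l "$out_dir"
else
  echo "feedgnuplot not found; this is the data that would have been plotted:"
  gen_data
fi
```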
=head2 Self-plotting data files
This script can be used to enable self-plotting data files. There are several ways of doing this: with a shebang (#!), with a self-executable gnuplot script, or with inline perl data.
=head3 Self-plotting data with a #!
A self-plotting, executable data file C<data> is formatted as
 $ cat data
 #!/usr/bin/feedgnuplot --lines --points
 2 1
 4 4
 6 9
 8 16
 10 25
 12 36
 14 49
 16 64
 18 81
 20 100
 22 121
 24 144
 26 169
 28 196
 30 225
This is the shebang (#!) line followed by the data, formatted as before. The data file can be plotted simply with
$ ./data
The caveats here are that on Linux the whole #! line is limited to 127 characters and that the full path to feedgnuplot must be given. The 127 character limit is a serious limitation, but this can likely be resolved with a kernel patch. I have only tried on Linux 2.6.
=head3 Self-plotting data with gnuplot
Running C<feedgnuplot --hardcopy plotdata.gp ....> will create a self-executable
gnuplot script in C
=head3 Self-plotting data with perl inline data
Perl supports storing data and code in the same file. This can also be used to create self-plotting files:
 $ cat plotdata.pl
 use strict;
 use warnings;
 # Pipe everything we print to PLOT into a feedgnuplot process
 open PLOT, "| feedgnuplot --lines --points" or die "Couldn't open plotting pipe";
 # Read the data stored after __DATA__, one line at a time
 while( <DATA> )
 {
   my @xy = split;
   print PLOT "@xy\n";
 }
 __DATA__
 2 1
 4 4
 6 9
 8 16
 10 25
 12 36
 14 49
 16 64
 18 81
 20 100
 22 121
 24 144
 26 169
 28 196
 30 225
This is especially useful if the logged data is not in a format directly supported by feedgnuplot. Raw data can be stored after the C<__DATA__> directive, with a small perl script to manipulate the data into a usable format and send it to the plotter.
=head1 ARGUMENTS
=over
=item
C<--[no]domain>
If enabled, the first element of each line is the domain variable. If not, the point index is used
=item
C<--[no]dataid>
If enabled, each data point is preceded by the ID of the data set that point corresponds to. This ID is interpreted as a string, NOT as just a number. If not enabled, the order of the point is used.
As an example, if line 3 of the input is "0 9 1 20" then
=over
=item
C<--nodomain --nodataid> would parse the 4 numbers as points in 4 different curves at x=3
=item
C<--domain --nodataid> would parse the 4 numbers as points in 3 different curves at x=0. Here, 0 is the x-variable and 9,1,20 are the data values
=item
C<--nodomain --dataid> would parse the 4 numbers as points in 2 different curves at x=3. Here 0 and 1 are the data IDs and 9 and 20 are the data values
=item
C<--domain --dataid> would parse the 4 numbers as a single point at x=0. Here 9 is the data ID and 1 is the data value. 20 is an extra value, so it is ignored. If another value followed 20, we'd get another point in curve ID 20
=back
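The four cases above can be emulated with a short awk sketch. This is an illustration of the rules as described here, not feedgnuplot's actual implementation. The function name C<interpret> and its arguments (line number, then 0/1 flags for C<--domain> and C<--dataid>) are made up for this example.

```shell
# Hypothetical helper: emulate how ONE input line is split into points.
# usage: echo LINE | interpret LINE_NUMBER DOMAIN(0|1) DATAID(0|1)
interpret() {
  awk -v n="$1" -v domain="$2" -v dataid="$3" '
  {
    i = 1
    x = n                                  # --nodomain: x is the line number
    if (domain == 1) { x = $1; i = 2 }     # --domain: the first field is x
    curve = 0
    while (i <= NF) {
      if (dataid == 1) {
        if (i + 1 > NF) break              # trailing ID without a value: ignored
        printf "curve=%s x=%s y=%s\n", $i, x, $(i+1)
        i += 2
      } else {
        printf "curve=%d x=%s y=%s\n", curve++, x, $i
        i += 1
      }
    }
  }'
}

# Line 3 of the input is "0 9 1 20"; try all four combinations
echo "--nodomain --nodataid:"; echo '0 9 1 20' | interpret 3 0 0
echo "--domain --nodataid:";   echo '0 9 1 20' | interpret 3 1 0
echo "--nodomain --dataid:";   echo '0 9 1 20' | interpret 3 0 1
echo "--domain --dataid:";     echo '0 9 1 20' | interpret 3 1 1
```

Each output line is one point, so the number of lines printed per case matches the number of points described above: 4, 3, 2 and 1.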
=item
C<--vnlog>
Vnlog is a trivial data format where lines starting with C<#> are comments and
the first comment contains column labels. Some tools for working with such data
are available from the C
=item
C<--[no]3d>
Do [not] plot in 3D. This only makes sense with C<--domain>. Each domain here is an (x,y) tuple
=item
C<--timefmt [format]>
Interpret the X data as a time/date, parsed with the given format
=item
C<--colormap>
This is a legacy option used to show a colormapped xy plot. It does:
Adds C
Adds 1 to the default C<--tuplesize> (if C<--tuplesizeall> is not given)
Uses C<--zmin>, C<--zmax> to set the colorbar range
It's clearer to set the relevant options explicitly, but C<--colormap> still exists for compatibility
=item
C<--stream [period]>
Plot the data as it comes in, in realtime. If period is given, replot every
period seconds. If no period is given, replot at 1Hz. If the period is given as
0 or 'trigger', replot I
=item
C<--[no]lines>
Do [not] draw lines to connect consecutive points
=item
C<--[no]points>
Do [not] draw points
=item
C<--circles>
Plot with circles. This requires a radius be specified for each point.
Automatically sets the C<--rangesize>/C<--tuplesize>. C
=item
C<--title xxx>
Set the title of the plot
=item
C<--legend curveID legend>
Set the label for a curve plot. Use this option multiple times for multiple curves. With C<--dataid>, curveID is the ID. Otherwise, it's the index of the curve, starting at 0
=item
C<--autolegend>
Use the curve IDs for the legend. Titles given with C<--legend> override these
=item
C<--xlen xxx>
When using C<--stream>, sets the size of the x-window to plot. Omit this or set
it to 0 to plot ALL the data. Does not make sense with 3d plots. Implies
C<--monotonic>. If we're plotting a histogram, then C<--xlen> causes a histogram
over a moving window to be computed. The subtlety here is that with a histogram
you don't actually I
=item
C<--xmin/xmax/x2min/x2max/ymin/ymax/y2min/y2max/zmin/zmax xxx>
Set the range for the given axis. These x-axis bounds are ignored in a streaming
plot. The x2/y2-axis bounds do not apply in 3d plots. The z-axis bounds apply
I
=item
C<--xlabel/x2label/ylabel/y2label/zlabel/cblabel xxx>
Label the given axis. The x2/y2-axis labels do not apply to 3d plots while the
z-axis label applies I
=item
C<--x2/--y2/--x1y2/--x2y1/--x2y2 xxx>
By default data is plotted against the x1 and y1 axes (the bottom and left ones
respectively). If we want a particular curve plotted against a different axis,
we can specify that with these options. You pass C<--AXIS ID> where C
--y2 curveid --style curveid 'linewidth 3'
=item
C<--histogram curveID>
Set up this specific curve to plot a histogram. The bin width is given with
the C<--binwidth> option (assumed 1.0 if omitted). If a drawing style is not
specified for this curve (C<--curvestyle>) or all curves (C<--with>,
C<--curvestyleall>) then the default histogram style is set: filled boxes with
borders. This is what the user generally wants. This works with C<--domain>
and/or C<--stream>, but in those cases the x-value is used I
=item
C<--xticlabels>
If given, the x-axis tic labels are not numerical, but are read from the data.
This changes the interpretation of the input data: with C<--domain>, each line
begins with C<x label ....>. Without C<--domain>, each line begins with C<label
...>. Clearly, the labels may not contain whitespace. This does I
=item
C<--binwidth width>
The width of bins when making histograms. This setting applies to ALL histograms in the plot. Defaults to 1.0 if not given.
=item
C<--histstyle style>
Normally, histograms are generated with the 'smooth frequency' gnuplot style.
C<--histstyle> can be used to select different C
=item
C<--style curveID style>
Additional styles per curve. With C<--dataid>, curveID is the ID. Otherwise,
it's the index of the curve, starting at 0. curveID can be a comma-separated
list of IDs to which the given style should apply. Use this option multiple
times for multiple curves. C<--styleall> does I
=item
C<--curvestyle curveID>
Synonym for C<--style>
=item
C<--styleall xxx>
Additional styles for all curves that have no C<--style>. This is overridden by any applicable C<--style>. Exclusive with C<--with>.
=item
C<--curvestyleall xxx>
Synonym for C<--styleall>
=item
C<--with xxx>
Same as C<--styleall>, but prefixed with "with". Thus
--with boxes
is equivalent to
--styleall 'with boxes'
Exclusive with C<--styleall>.
=item
C<--every curveID factor>
Decimates the input. Instead of plotting every point in the given curve, plot one point per factor. This is useful to quickly process huge datasets. For instance, to plot 1% of the data, pass a factor of 100.
=item
C<--everyall factor>
Decimates the input. This works exactly like C<--every>, except it applies to
I
=item
C<--using curveID expression>
Specifies a C
=item
C<--usingall expression>
Global "using" expressions. This works exactly like C<--using>, except it
applies to I
=item
C<--cmds xxx>
Additional commands to pass on to gnuplot verbatim. These could contain extra global styles for instance. Can be passed multiple times.
=item
C<--extracmds xxx>
Synonym for C<--cmds xxx>
=item
C<--set xxx>
Additional 'set' commands to pass on to gnuplot verbatim. C<--set 'a b c'> will
result in gnuplot seeing a C
=item
C<--unset xxx>
Additional 'unset' commands to pass on to gnuplot verbatim. C<--unset 'a b c'>
will result in gnuplot seeing a C
=item
C<--image filename>
Overlays the data on top of a raster image given in C
=item
C<--equation xxx>
Gnuplot can plot both data and symbolic equations. C
seq 100 | awk '{print $1/10, $1/100}' | feedgnuplot --with 'lines lw 3' --domain --ymax 1 --equation 'sin(x)/x' --equation 'cos(x)/x with lines lw 4'
Here I plot the incoming data (points along a line) with the given style (a line
with thickness 3), I
 seq 360 | perl -nE '$th=$_/360 * 3.14*2; $c=cos($th); $s=sin($th); say "$c $s"' | feedgnuplot --domain --square --set parametric --set "trange [0:2*3.14]" --equation "sin(t),cos(t)"
Here the data I generate is points along the unit circle. I plot these as
points, and I I
=item
C<--equation-below xxx>
Synonym for C<--equation>. These are rendered I
=item
C<--equation-above xxx>
Like C<--equation>, but is rendered I
=item
C<--square>
Plot data with aspect ratio 1. For 3D plots, this controls the aspect ratio for all 3 axes
=item
C<--square-xy>
For 3D plots, set square aspect ratio for ONLY the x,y axes
=item
C<--hardcopy xxx>
If not streaming, output to a file specified here. Format inferred from
filename, unless specified by C<--terminal>. If C<--terminal> is given,
C<--hardcopy> sets I
=item
C<--terminal xxx>
String passed to 'set terminal'. No attempts are made to validate this. C<--hardcopy> sets this to some sensible defaults if C<--hardcopy> is set to a filename ending in C<.png>, C<.pdf>, C<.ps>, C<.eps> or C<.svg>. If any other file type is desired, use both C<--hardcopy> and C<--terminal>
=item
C<--maxcurves N>
The maximum allowed number of curves. This is 100 by default, but can be reset with this option. This exists purely to prevent perl from allocating all of the system's memory when reading bogus data
=item
C<--monotonic>
If C<--domain> is given, checks to make sure that the x-coordinate in the input data is monotonically increasing. If a given x-variable is in the past, all data currently cached for this curve is purged. Without C<--monotonic>, all data is kept. Does not make sense with 3d plots. No C<--monotonic> by default. The data is replotted before being purged. This is useful in streaming plots where the incoming data represents multiple iterations of the same process (repeated simulations of the same period in time, for instance).
=item
C<--rangesize curveID N>
The options C<--rangesizeall> and C<--rangesize> set the number of values
needed to represent each point being plotted (see L</"Multi-value style
support"> above). These options are I
C<--rangesize> is used to set how many values are needed to represent the range of a point for a particular curve. This overrides any defaults that may exist for this curve only.
With C<--dataid>, curveID is the ID. Otherwise, it's the index of the curve, starting at 0. curveID can be a comma-separated list of IDs to which the given rangesize should apply.
=item
C<--tuplesize curveID N>
Very similar to C<--rangesize>, but instead of specifying the I
=item
C<--rangesizeall N>
Like C<--rangesize>, but applies to I
=item
C<--tuplesizeall N>
Like C<--tuplesize>, but applies to I
=item
C<--dump>
Instead of printing to gnuplot, print to STDOUT. Very useful for debugging. It is possible to send the output produced this way to gnuplot directly.
=item
C<--exit>
This controls what happens when the input data is exhausted, or when some part
of the C
With interactive gnuplot terminals (qt, x11, wxt), the plot windows live in a
separate process from the main C
=over
=item Alive: C
=item Half-alive: C
=item Dead: C
=back
The possibilities are:
=over
=item No C<--stream>, all data read in
=over
=item no C<--exit> (default)
Alive. Need to Ctrl-C to get back into the shell
=item C<--exit>
Half-alive. A non-interactive plot stays up, and the shell accepts new commands. Without C<--stream> the goal is to show a plot, so a Dead state would not be useful.
=back
=item C<--stream>, all data read in or the C
=over
=item no C<--exit> (default)
Alive. Need to Ctrl-C to get back into the shell. This means that when making live plots, the first Ctrl-C kills the data feeding process, but leaves the final plot up for inspection. A second Ctrl-C kills feedgnuplot as well.
=item C<--exit>
Dead. No plot is shown, and the shell accepts new commands. With C<--stream> the goal is to show a plot as the data comes in, which we have been doing. Now that we're done, we can clean up everything.
=back
=back
Note that one usually invokes C
$ write_data | feedgnuplot
If the user terminates this pipeline with ^C, then I
=item
C<--geometry>
Specifies the size, position of the plot window. This applies I
=item
C<--version>
Print the version and exit
=back
=head1 RECIPES
For a tutorial and a gallery please see the guide at L<https://github.com/dkogan/feedgnuplot/blob/master/guide/guide.org>
=head2 Basic plotting of piped data
 $ seq 5 | awk '{print 2*$1, $1*$1}'
 2 1
 4 4
 6 9
 8 16
 10 25

 $ seq 5 | awk '{print 2*$1, $1*$1}' | feedgnuplot --lines --points --legend 0 "data 0" --title "Test plot" --y2 1
=head2 Realtime plot of network throughput
Looks at wlan0 on Linux.
$ while true; do sleep 1; cat /proc/net/dev; done | gawk '/wlan0/ {if(b) {print $2-b; fflush()} b=$2}' | feedgnuplot --lines --stream --xlen 10 --ylabel 'Bytes/sec' --xlabel seconds
=head2 Realtime plot of battery charge with respect to time
Uses the result of the C
$ while true; do acpi; sleep 15; done | perl -nE 'BEGIN{ $| = 1; } /([0-9]*)%/; say join(" ", time(), $1);' | feedgnuplot --stream --ymin 0 --ymax 100 --lines --domain --xlabel 'Time' --timefmt '%s' --ylabel "Battery charge (%)"
=head2 Realtime plot of temperatures in an IBM Thinkpad
Uses C</proc/acpi/ibm/thermal>, which reports temperatures at various locations in a Thinkpad.
$ while true; do cat /proc/acpi/ibm/thermal | awk '{$1=""; print}' ; sleep 1; done | feedgnuplot --stream --xlen 100 --lines --autolegend --ymax 100 --ymin 20 --ylabel 'Temperature (deg C)'
=head2 Plotting a histogram of file sizes in a directory, granular to 10MB
$ ls -l | awk '{print $5/1e6}' | feedgnuplot --histogram 0 --binwidth 10 --ymin 0 --xlabel 'File size (MB)' --ylabel Frequency
=head2 Plotting a live histogram of the ping round-trip times for the past 20 seconds
 $ ping -D 8.8.8.8 |
   perl -anE 'BEGIN { $| = 1; }
              $F[0] =~ s/[\[\]]//g or next;
              $F[7] =~ s/.*=//g    or next;
              say "$F[0] $F[7]"' |
   feedgnuplot --stream --domain --histogram 0 --binwidth 10 \
               --xlabel 'Ping round-trip time (ms)' \
               --ylabel Frequency --xlen 20
=head2 Plotting points on top of an existing image
This can be done with C<--image>:
$ < features_xy.data feedgnuplot --points --domain --image "image.png"
or with C<--equation>:
$ < features_xy.data feedgnuplot --points --domain --equation '"image.png" binary filetype=auto flipy with rgbimage' --set 'yrange [:] reverse'
The C<--image> invocation is a convenience wrapper for the C<--equation> version. Finer control is available with C<--equation>.
Here an existing image is given to gnuplot verbatim, and data to plot on top of
it is interpreted by feedgnuplot as usual. C
=head1 ACKNOWLEDGEMENT
This program is originally based on the driveGnuPlots.pl script from Thanassis Tsiodras. It is available from his site at L<http://users.softlab.ece.ntua.gr/~ttsiod/gnuplotStreaming.html>
=head1 REPOSITORY
L<https://github.com/dkogan/feedgnuplot>
=head1 AUTHOR
Dima Kogan, C<< dima@secretsauce.net >>
=head1 LICENSE AND COPYRIGHT
Copyright 2011-2021 Dima Kogan.
This program is free software; you can redistribute it and/or modify it under the terms of either: the GNU General Public License as published by the Free Software Foundation; or the Artistic License.
See http://dev.perl.org/licenses/ for more information.
=cut