Closed AdamBanham closed 2 years ago
With a bit of messing around, something like the code below is possible. It returns a matplotlib Figure, which allows for some user customisation if needed, e.g. dpi settings, labels or titles.
from pm4py.objects.log.importer.xes import importer as xes_importer
from pm4py.objects.log.obj import EventLog,Trace
import matplotlib.pyplot as plt
from matplotlib.figure import Figure
from typing import List, Tuple
from os.path import join
BPIC_LOG = join(".","BPI_Challenge_2012.xes.gz")
TIME_ATTR = "time:timestamp"
def get_log() -> EventLog:
    return xes_importer.apply(BPIC_LOG)
def convert_trace(trace:Trace, startingTime:float) -> List[float]:
    # seconds elapsed between startingTime and each event of the trace
    timepoints = []
    for event in trace:
        timepoints.append(event[TIME_ATTR].timestamp() - startingTime)
    return timepoints
def convert_log(log:EventLog) -> List[List[float]]:
    log_sequences = []
    # all offsets are relative to the first event of the first trace
    startingTime = log[0][0][TIME_ATTR].timestamp()
    for trace in log:
        log_sequences.append(convert_trace(trace,startingTime))
    return log_sequences
def find_scale(seconds:float) -> Tuple[str,float]:
    # returns a unit suffix and the number of seconds in that unit
    if seconds < (60 * 3):
        return ("min" , 60)
    elif seconds < ( 60 * 60 * 20):
        return ("hr", ( 60 * 60))
    elif seconds < ( 60 * 60 * 24 * 100):
        return ("d", ( 60 * 60 * 24))
    else:
        return ("yr", ( 60 * 60 * 24 * 365))
def matplotlib_dotted_chart(log:EventLog,dpi=300,figsize=(10,10)) -> Figure:
    fig = plt.figure(figsize=figsize,dpi=dpi)
    ax = fig.subplots(1,1)
    # plt.get_cmap is used because matplotlib.cm.get_cmap was removed in Matplotlib 3.9
    colormap = plt.get_cmap("Accent")
    sequences = convert_log(log)
    # one row of dots per trace, cycling through the colormap
    for y,sequence in enumerate(sequences):
        color = colormap(y % len(colormap.colors) / len(colormap.colors))
        ax.plot(
            sequence,
            [ y for _ in range(len(sequence)) ],
            "o",
            color=color,
            markerfacecolor="None",
            markersize = 1,
        )
    #clean up plot
    ax.set_ylim([0,y])
    min_x = min([min(s) for s in sequences])
    max_x = max([max(s) for s in sequences])
    ax.set_xlim([min_x, max_x ])
    ax.set_yticks([])
    # add suitable xticks
    diff_x = max_x - min_x
    tickers = [ min_x ] + \
        [ min_x + (portion/100) * diff_x
          for portion in range(10,100,10) ] + \
        [ max_x ]
    suffix, scale = find_scale(diff_x)
    ax.set_xticks(
        tickers
    )
    ax.set_xticklabels(
        [
            f"{(tick - min_x) / scale:.2f}{suffix}"
            for tick
            in ax.get_xticks()
        ],
        rotation=-90
    )
    #add labels
    ax.set_ylabel("Trace")
    ax.set_xlabel("Time")
    ax.set_title(f"Dotted Chart of\n {log.attributes['concept:name']}")
    ax.grid(True,color="grey",alpha=0.33)
    return fig
def run():
    log = get_log()
    fig = matplotlib_dotted_chart(log,dpi=300,figsize=(10,10))
    fig.tight_layout()
    fig.savefig("demo.png")

if __name__ == "__main__":
    run()
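As a rough illustration of the customisation a returned Figure allows, the sketch below changes the title, labels, size, dpi and legend placement after the fact, then saves to two formats. The `make_stand_in_figure` helper and its toy data are hypothetical; they only stand in for the figure that `matplotlib_dotted_chart` would return, so the sketch runs without a log file.

```python
import io

import matplotlib
matplotlib.use("Agg")  # headless backend, so no display is needed
import matplotlib.pyplot as plt
from matplotlib.figure import Figure


def make_stand_in_figure() -> Figure:
    # Hypothetical stand-in for matplotlib_dotted_chart(log): three toy "traces".
    fig = plt.figure(figsize=(4, 4), dpi=100)
    ax = fig.subplots(1, 1)
    for y, sequence in enumerate([[0, 5, 9], [2, 3], [4, 8, 12]]):
        ax.plot(sequence, [y] * len(sequence), "o", label=f"trace {y}")
    return fig


fig = make_stand_in_figure()
ax = fig.axes[0]

# Relabel and resize the figure after it has been built.
ax.set_title("My dotted chart")
ax.set_xlabel("Seconds since first event")
fig.set_size_inches(8, 3)
fig.set_dpi(150)

# Place the legend outside the plotting area.
ax.legend(loc="center left", bbox_to_anchor=(1.0, 0.5))
fig.tight_layout()

# Save in more than one format; in-memory buffers avoid writing files here.
png_buffer, svg_buffer = io.BytesIO(), io.BytesIO()
fig.savefig(png_buffer, format="png")
fig.savefig(svg_buffer, format="svg")
```

Swapping the buffers for file names (e.g. `fig.savefig("demo.svg")`) would write the files to disk instead.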
Dear Adam,
Thanks for signaling. We could consider it as a "new" dotted chart visualizer in the future (because it's completely different from the Neato version currently available).
Have a nice day
Dear Adam, after reviewing the existing code base, the problem with "neato" was the automatic layout, which was running even though the .dot file already provided the coordinates of all the nodes. In the next release of PM4Py, the dotted chart and performance spectrum will see a significant increase in performance, due to the removal of this automatic layout step.
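To illustrate the idea of supplying coordinates up front, here is a minimal sketch that writes a DOT graph in which every node carries a pinned `pos` attribute. It is not necessarily how PM4Py generates its .dot file or how the fix is implemented; the function name, node names and toy points are hypothetical.

```python
from typing import List, Tuple


def dot_with_fixed_positions(points: List[Tuple[float, float]]) -> str:
    # Build a DOT graph with one point-shaped node per (x, y) pair.
    lines = [
        "graph dotted_chart {",
        '    node [shape=point, width=0.05, label=""];',
    ]
    for index, (x, y) in enumerate(points):
        # The trailing "!" asks neato to keep the node at these coordinates.
        lines.append(f'    p{index} [pos="{x},{y}!"];')
    lines.append("}")
    return "\n".join(lines)


dot_source = dot_with_fixed_positions([(0.0, 0.0), (1.5, 0.0), (0.5, 1.0)])
print(dot_source)
```

Graphviz documents a `-n` option for neato that assumes nodes have already been positioned, so a command along the lines of `neato -n -Tpng chart.dot -o chart.png` should skip the node placement step (with `-n`, positions are interpreted in points rather than inches).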
Thanks for the quick update,
Alongside these performance improvements, does the dotted chart function produce a viewable chart, though? In particular, for the BPIC2012 log?
Cheers
Apart from the enormous number of points, yes.
If you need to visualize huge logs, consider using PMTk (https://pmtk.fit.fraunhofer.de/), which implements the dotted chart with sampling and GPU acceleration.
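For a sense of what sampling could look like before plotting, here is a minimal sketch that keeps a random subset of traces while preserving their original order. It is not PMTk's method, which is not described here; `sample_traces` and the toy log are hypothetical, and the function only assumes its input supports `len()` and integer indexing.

```python
import random
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def sample_traces(traces: Sequence[T], max_traces: int, seed: int = 42) -> List[T]:
    # Return everything when the log is already small enough.
    if len(traces) <= max_traces:
        return list(traces)
    # Draw distinct indices, then sort them so the trace order is preserved.
    rng = random.Random(seed)
    kept = sorted(rng.sample(range(len(traces)), max_traces))
    return [traces[i] for i in kept]


# Toy stand-in for a log with 13087 traces of two time offsets each.
toy_log = [[float(i), float(i) + 1.0] for i in range(13087)]
subset = sample_traces(toy_log, max_traces=500)
print(len(subset))
```

Fixing the seed keeps the chart reproducible between runs; uniform sampling like this may hide rare behaviour, so it is only a starting point.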
If I were to use PMTk, are there any issues with research ethics or copyright? My understanding is that it is a closed-source implementation from FIT. Although the idea of GPU acceleration and sampling does sound interesting and useful, would I be able to export the visualisation, or would I need to screen capture from the application?
I was hoping for customisability over efficiency for my use case, which may not be useful for a general implementation.
I am hoping to make some reactive GIFs/animations for event logs and be a bit over the top. For example, the animation below (with plans to add more) was made with just matplotlib in Python (it took 15 minutes to render, but nonetheless).
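A minimal sketch of how such an animation might be built with matplotlib's `FuncAnimation` follows: each frame reveals the events up to a moving time cutoff. It is not the code behind the animation mentioned above; the toy sequences, `visible_points` and `update` are hypothetical.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend, so no display is needed
import matplotlib.pyplot as plt
from matplotlib.animation import FuncAnimation

# Toy time offsets per trace, in the same shape convert_log would produce.
sequences = [[0.0, 4.0, 9.0], [1.0, 2.0, 7.0], [3.0, 5.0, 10.0]]
max_time = max(max(s) for s in sequences)
n_frames = 11

fig, ax = plt.subplots(figsize=(4, 3), dpi=100)
ax.set_xlim(-0.5, max_time + 0.5)
ax.set_ylim(-0.5, len(sequences) - 0.5)
(dots,) = ax.plot([], [], "o", markersize=3)


def visible_points(cutoff):
    # Collect the (time, trace index) pairs that occurred up to the cutoff.
    xs, ys = [], []
    for y, sequence in enumerate(sequences):
        for t in sequence:
            if t <= cutoff:
                xs.append(t)
                ys.append(y)
    return xs, ys


def update(frame):
    # Move the cutoff linearly from 0 to max_time over the frames.
    cutoff = max_time * frame / (n_frames - 1)
    xs, ys = visible_points(cutoff)
    dots.set_data(xs, ys)
    return (dots,)


animation = FuncAnimation(fig, update, frames=n_frames, interval=100, blit=False)
# Writing a GIF needs Pillow installed:
# animation.save("dotted_chart.gif", writer="pillow")
```

Updating a single line artist per frame, rather than replotting every trace, is one way to keep render times down on larger logs.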
However, it seems that the issue is resolved with the current implementation. Should I close the issue?
In the current version of PMTk, you can export the dotted chart representation as SVG or PNG. Still, it is not an animation like the one that you show in your post.
I checked out the visualisation in PMTk; it does look very nice, a big improvement on the neato version imo.
But it seems like the initial reason for the issue will be resolved in the next version. So I will close the issue.
However, if you wanted some native Matplotlib visualisations, I would be happy to help.
Hi,
I tried running the following snippet in a Jupyter notebook and noticed that the output from neato (outside of pm4py) was not viewable.
I am using the BPIC 2012 log, with 13087 traces. However, once finished, neato is unable to create a viewable PNG (perhaps because it is too large).
Reducing the log down to 50 traces, I can view a chart, but it is still hard to read (in Jupyter). Is there any chance that this function could be moved to a Matplotlib interface or implementation, where legend placement, the file format for saving and sizing could be customised by the user?