rasbt / mlxtend

A library of extension and helper modules for Python's data analysis and machine learning libraries.
https://rasbt.github.io/mlxtend/

Plotting Apriori output with arulesViz #657

Closed archienorman11 closed 4 years ago

archienorman11 commented 4 years ago

Hi-

I was wondering whether anyone has managed to plot the output of the apriori algorithm with arulesViz:

https://github.com/mhahsler/arulesViz

I am looking to use rpy2 to achieve this within a JupyterLab notebook.

Thanks,

rasbt commented 4 years ago

Haven't tried it personally. The 2D versions look like something that could be readily done in matplotlib. The interactive aspects sound interesting, though. Regarding plotting experience and tips for association rules, maybe @dbarbier has some pointers.
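As a rough, untested-against-arulesViz sketch of what the static 2D version might look like in matplotlib: the helper below draws support against confidence and colors each point by lift. The function name `plot_rules` and the three sample rules are made up for illustration; in practice the three sequences would be the `support`, `confidence` and `lift` columns of the DataFrame returned by `association_rules`.

```python
import matplotlib.pyplot as plt


def plot_rules(support, confidence, lift):
    """Scatter plot of rules: support (x) vs. confidence (y), colored by lift.

    `plot_rules` is a hypothetical helper, not part of mlxtend or matplotlib.
    """
    fig, ax = plt.subplots()
    points = ax.scatter(support, confidence, c=lift, cmap="viridis", alpha=0.6)
    ax.set_title("Association rules")
    ax.set_xlabel("Support")
    ax.set_ylabel("Confidence")
    # The colorbar maps point colors back to lift values.
    fig.colorbar(points, ax=ax, label="lift")
    return fig, ax


# Made-up metrics for three rules, only to make the sketch runnable.
fig, ax = plot_rules(
    support=[0.05, 0.02, 0.10],
    confidence=[0.80, 0.95, 0.72],
    lift=[1.9, 3.4, 1.2],
)
```

With a real result this would be called as `plot_rules(rules["support"], rules["confidence"], rules["lift"])`.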

dbarbier commented 4 years ago

Hello, there are several packages for interactive plots with Python, for instance bokeh, plotly, holoviews.

Here is how to reproduce the example from arulesViz with Bokeh:

# assume that you called apriori like this:
#   frequent_itemsets = apriori(df, min_support=0.01, use_colnames=True)
# and already imported association_rules:
#   from mlxtend.frequent_patterns import association_rules

# 1st cell: compute association rules
rules = association_rules(frequent_itemsets, metric="confidence", min_threshold=0.7)

# 2nd cell: frozensets cause trouble for plots, so replace them with tuples
rules['antecedents'] = rules['antecedents'].apply(tuple)
rules['consequents'] = rules['consequents'].apply(tuple)

# 3rd cell: imports for bokeh and call output_notebook
from bokeh.io import output_notebook
from bokeh.plotting import figure, show
from bokeh.models import ColumnDataSource, HoverTool, ColorBar
from bokeh.models.mappers import LinearColorMapper
from bokeh.transform import transform
from bokeh.palettes import Viridis256
from bokeh.resources import INLINE
output_notebook(resources=INLINE)

# 4th cell: enjoy ;-)
p = figure(title = "Association rules")
p.xaxis.axis_label = 'Support'
p.yaxis.axis_label = 'Confidence'

source = ColumnDataSource(data=rules)

mapper = LinearColorMapper(palette=Viridis256,
                           low=rules["lift"].min(),
                           high=rules["lift"].max())

p.circle(x="support", y="confidence", source=source,
         line_color=None,
         fill_color=transform("lift", mapper),
         fill_alpha=0.2, size=8)

p.add_tools(HoverTool(tooltips=[("index", "$index"),
                                ("antecedent", "@antecedents"),
                                ("consequent", "@consequents"),
                                ("support", "@support"),
                                ("confidence", "@confidence"),
                                ("lift", "@lift")]))

bar = ColorBar(color_mapper=mapper, location=(0,0), title="lift")
p.add_layout(bar, "left")

show(p)

Everything is customizable. Annotations are displayed when the cursor hovers over a circle, and the plot tools appear at the right of the plot (in the Jupyter notebook, not in the screenshot).

screenshot
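One possible customization, as a small sketch: instead of converting each frozenset to a tuple, the itemsets could be turned into plain strings, which may read better in the hover annotations. The helper name `itemset_label` is hypothetical (it is not part of mlxtend or Bokeh), and sorting the items is an assumption made here so that the label does not depend on set iteration order.

```python
def itemset_label(itemset):
    """Return the items of a set-like itemset as a sorted, comma-separated string.

    Hypothetical helper for tooltip text; not part of mlxtend or Bokeh.
    """
    return ", ".join(sorted(str(item) for item in itemset))


# Example with a made-up itemset.
label = itemset_label(frozenset({"milk", "bread"}))
print(label)  # bread, milk
```

In the 2nd cell this would replace `.apply(tuple)`, for example `rules['antecedents'] = rules['antecedents'].apply(itemset_label)`, and likewise for `consequents`.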

archienorman11 commented 4 years ago

much appreciated @dbarbier

rasbt commented 4 years ago

Thanks a lot @dbarbier ! I will leave this open to remind me to add this to the docs, because it's a nice addition that some people may find useful for reference. I am planning to add this to the apriori docs and then link to it from the fpgrowth and fpmax docs as well. Thanks!