Closed mjpieters closed 4 years ago
I attempted to get CircleCI running under my own account, but failed. I'm not going to spend any further time on that.
For what it's worth, I can run the tests locally and they pass. That's not that revelatory, because the graph module has no test coverage and I didn't touch anything else.
However, I constructed a simple manual test file, based on one of the examples in the documentation, which renders the same graph on master and with this pull request:
import pandas
from CHAID import Tree

tree = Tree.from_pandas_df(
    pandas.read_csv("tests/data/titanic.csv"),
    dict.fromkeys({"sex", "embarked"}, "nominal"),
    "survived",
    max_depth=4,
    alpha_merge=0.05,
    min_parent_node_size=2,
)
tree.render("rendered.dot")
This is essentially the same thing as python -m CHAID tests/data/titanic.csv survived sex embarked --max-depth 4 --min-parent-node-size 2 --alpha-merge 0.05 --export-path rendered.dot
but a little easier to tweak and play with.
I note that this reveals a separate problem, as dot issues this warning (both with and without my changes):
Warning: Orthogonal edges do not currently handle edge labels. Try using xlabels.
That's an easy fix so I'll just include that in this pull request next: replacing label with xlabel in the g.edge() call.
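To make the label/xlabel difference concrete, here is a minimal sketch that writes the DOT text by hand, so it runs without Graphviz installed. The helper names `edge_statement` and `build_dot` and the edge texts are hypothetical and are not part of CHAID or the graphviz package; the only thing taken from the thread is that edges in a `splines=ortho` graph carry their text in `xlabel` rather than `label`.

```python
def edge_statement(tail, head, text, attr="xlabel"):
    """Return one DOT edge statement carrying `text` in the attribute `attr`."""
    escaped = text.replace('"', '\\"')  # keep embedded quotes valid in DOT
    return '{} -> {} [{}="{}"]'.format(tail, head, attr, escaped)


def build_dot(edges, attr="xlabel"):
    """Build a small digraph with orthogonal edges from (tail, head, text) tuples."""
    lines = ["digraph {", "    graph [splines=ortho]"]
    lines += ["    " + edge_statement(t, h, text, attr) for t, h, text in edges]
    lines.append("}")
    return "\n".join(lines)


# Hypothetical edges, loosely modelled on the titanic example above.
dot_source = build_dot([(0, 1, "(female)"), (0, 2, "(male)")])
print(dot_source)
```

Passing `attr="label"` instead reproduces the kind of input that triggers the warning quoted above when it is fed to dot.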
With that last fix, my little test script outputs:
Thank you so much for the code. I ran the code and I'm getting the error TypeError: sequence item 0: expected str instance, int found in edge_label = " ({}) \n ".format(', '.join(node.choices)). I was not able to find the solution; please help me fix this.
The code I used is below:
from CHAID import Tree
import pandas as pd
import numpy as np
import os
df=pd.read_csv('C:\\Users\\ps\\chaid_pro.csv')
independent_variable_columns = ['gender', 'grade', 'no_renewals', 'complaint_count']
dep_variable = 'switch'
tree = Tree.from_pandas_df(
    df,
    dict(zip(independent_variable_columns, ['nominal'] * 38)),
    dep_variable,
    max_depth=2
)
import os
from datetime import datetime
import plotly.graph_objs as go
import plotly.io as pio
import colorlover as cl
from graphviz import Digraph
import tempfile
try:
    # Python 3.2 and newer
    from tempfile import TemporaryDirectory
except ImportError:
    # minimal backport of TemporaryDirectory for Python 2.7, sufficient
    # for use with this module.
    import shutil
    from tempfile import mkdtemp

    class TemporaryDirectory(object):
        def __init__(self):
            self.name = mkdtemp()

        def __enter__(self):
            return self.name

        def __exit__(self, *args):
            shutil.rmtree(self.name, ignore_errors=True)

FIG_BASE = {
    "layout": {
        "margin_t": 50,
        "annotations": [{"font_size": 18, "x": 0.5, "y": 0.5}, {"y": [0, 0.2]}],
    },
}
FIG_BASE_DATA = {
    "domain": {"x": [0, 1], "y": [0.4, 1.0]},
    "hole": 0.4,
    "type": "pie",
    "marker_colors": cl.scales["5"]["qual"]["Set1"],
}
TABLE_HEADER = ["<i>p</i>", "score", "splitting on"]
TABLE_CONFIG = {
    "domain": {"x": [0.3, 0.7], "y": [0, 0.37]},
    "header": {"fill_color": "#FFF"},
}
TABLE_CELLS_CONFIG = {
    "line_color": "#FFF",
    "align": "left",
    "font_color": "#282828",
    "height": 27,
    "fill_color": ["#EBC1EE", "#EDEAFB"],
}

class Graph(object):
    def __init__(self, tree):
        self.tree = tree

    def render(self, path, view):
        if path is None:
            path = os.path.join("trees", "{:%Y-%m-%d %H:%M:%S}.gv".format(datetime.now()))
        with TemporaryDirectory() as self.tempdir:
            g = Digraph(
                format="png",
                graph_attr={"splines": "ortho"},
                node_attr={"shape": "plaintext", "labelloc": "b"},
            )
            for node in self.tree:
                image = self.bar_chart(node)
                g.node(str(node.node_id), image=image)
                if node.parent is not None:
                    edge_label = " ({}) \n ".format(', '.join(node.choices))
                    g.edge(str(node.parent), str(node.node_id), xlabel=edge_label)
            g.render(path, view=view)

    def bar_chart(self, node):
        fig = dict(
            data=[
                dict(
                    values=list(node.members.values()),
                    labels=list(node.members),
                    showlegend=(node.node_id == 0),
                    **FIG_BASE_DATA
                )
            ],
            **FIG_BASE
        )
        if not node.is_terminal:
            fig["data"].append(self._table(node))
        filename = os.path.join(self.tempdir, "node-{}.png".format(node.node_id))
        pio.write_image(fig, file=filename, format="png")
        return filename

    def _table(self, node):
        p = None if node.p is None else format(node.p, ".5f")
        score = None if node.score is None else format(node.score, ".2f")
        values = [p, score, node.split.column]
        return go.Table(
            cells=dict(values=[TABLE_HEADER, values], **TABLE_CELLS_CONFIG),
            **TABLE_CONFIG
        )

tree.render("rendered.dot")
Thank you so much for the code. I ran the code and now I'm getting these warnings:
Warning: No such file or directory while opening C:\Users\RANJIT~1.PS\AppData\Local\Temp\tmpv70rowwm\node-0.png
Warning: No or improper image="C:\Users\RANJIT~1.PS\AppData\Local\Temp\tmpv70rowwm\node-0.png" for node "0"
(the same pair of warnings repeats for node-1.png through node-14.png, that is, for nodes "1" to "14")
However, I was able to get the dot file by using the following commands:
treeG=tree.to_tree()
treeG.to_graphviz()
But how do I get the nice chart you showed above (with the proportions of "0" and "1")? Is there any way I can pass this dot file and get that tree chart? It would be a great help to me. Attached the dot file for your reference: digraph tree.docx
Right, so the independent variable values can be anything, not just strings. I’ll fix this later, but the quick fix is to add conversion to strings for the edge labels. Replace ', '.join(node.choices) with ', '.join(map(str, node.choices))
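A minimal sketch of why the quick fix works, using a plain list as a hypothetical stand-in for `node.choices` when the independent variable holds integers: `str.join` only accepts strings, so each value is converted with `str` first.

```python
# Hypothetical stand-in for node.choices with a numeric independent variable.
choices = [0, 1]

try:
    ", ".join(choices)  # str.join rejects non-string items
except TypeError as exc:
    error_message = str(exc)  # same kind of TypeError as reported in this thread

# The quick fix: convert every choice to a string before joining.
edge_label = " ({}) \n ".format(", ".join(map(str, choices)))
print(repr(edge_label))
```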
Yes, I fixed that, but now I'm getting the same old error: Warning: No such file or directory while opening C:\Users\RANJIT1.PS\AppData\Local\Temp\tmpv70rowwm\node-0.png Warning: No or improper image="C:\Users\RANJIT1.PS\AppData\Local\Temp\tmpv70rowwm\node-0.png" for node "0"
However, I was able to get the dot file by using the following commands:
treeG=tree.to_tree()
treeG.to_graphviz()
But how do I get the nice chart you showed above (with the proportions of "0" and "1")? Is there any way I can pass this dot file and get that tree chart? It would be a great help to me. Attached the dot file for your reference: digraph tree.docx
That's a GraphViz graph created from the treelib tree, and contains just a subset of the information in CHAID. Your CHAID data has been turned into strings in the labels (([], {0: 2142076.0, 1: 68348.0}, (no_renewal, p=0.0, score=14911.92458184376, groups=[[0], [1]]), dof=1)) for the root, and ([0], {0: 2078360.0, 1: 60578.0}, (compliant_count, p=0.0, score=6385.601378408456, groups=[[0], [1]]), dof=1)) for the first child node, etc.) and is just not very interesting or useful.
You can use the dot command from Graphviz to render that into an SVG or PNG but it won't be nearly as informative as the one that CHAID generates. I'd not bother with it.
Yes sir, you're absolutely right. I am unable to get the tree chart you showed above (with proportions and a nice representation of each node) by passing the dot file to Graphviz. I don't know where I went wrong, and I can't get the output you showed even though you sent the full code to produce it. If possible, kindly look at the code I pasted above so that I can keep trying to get the output. Sorry to trouble you so much.
I tried many things but still get the same error: Warning: No such file or directory while opening C:\Users\AppData\Local\Temp\tmpibr_jqhz\node-0.png Warning: No or improper image="C:\Users\AppData\Local\Temp\tmpibr_jqhz\node-0.png" for node "0"
and I get the tree chart attached below.
Is there any way to run the code with a different directory instead of the temp directory, so that all the related files are saved in one place? Would that solve my issue here?
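On the narrow question of changing the temp directory: the standard library lets you redirect where `tempfile` creates things by setting the module-level `tempfile.tempdir`, which `TemporaryDirectory()` consults when called without a `dir` argument. The sketch below shows only that mechanism; the helper `use_temp_root` and the `chaid_tmp` folder name are hypothetical, and nothing in this thread establishes that redirecting the directory makes the warnings go away.

```python
import os
import tempfile


def use_temp_root(path):
    """Make `path` the default parent for everything `tempfile` creates."""
    os.makedirs(path, exist_ok=True)
    tempfile.tempdir = path


# Hypothetical location; any writable directory would do.
root = os.path.join(os.getcwd(), "chaid_tmp")
previous = tempfile.tempdir
use_temp_root(root)

# Code that calls TemporaryDirectory() without arguments now lands under `root`.
with tempfile.TemporaryDirectory() as tempdir:
    parent = os.path.dirname(tempdir)
    print("temporary files go to:", tempdir)

tempfile.tempdir = previous  # restore the earlier setting
os.rmdir(root)               # the folder is empty again once the block exits
```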
Sorry, this is getting a little too hard for me to debug and troubleshoot. You can try and install the project as a whole, directly from GitHub:
pip install git+https://github.com/mjpieters/CHAID@patch-1#egg=CHAID
That installs the version of the library found in this pull request.
Thank you for your response.
Thank you very much for this.
Have tested the output locally and as you said it's devoid of any spec coverage anyway so it would be very hard for Circle CI to fail.
I've switched the Circle CI flag on to permit forked PRs to run CI tests so this shouldn't be an issue anymore.
Again, thanks.
Have tested the output locally and as you said it's devoid of any spec coverage anyway so it would be very hard for Circle CI to fail.
A test that checks that a PNG was generated would already go a long way towards detecting regressions across platforms. :-) Many CI platforms let you record files as assets to be kept for later inspection, so you can then at least do a manual spot check as well. Perhaps something to consider! :-)
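One way such a check could look, as a sketch only: a small helper that tests for the 8-byte PNG file signature, which is cheap and works the same on every platform. The name `is_png` is hypothetical and not part of CHAID; the demo files are stand-ins, not real rendered trees.

```python
import os
import tempfile

# Every PNG file starts with these eight bytes.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def is_png(path):
    """Return True if `path` exists and starts with the PNG signature."""
    if not os.path.isfile(path):
        return False
    with open(path, "rb") as handle:
        return handle.read(len(PNG_SIGNATURE)) == PNG_SIGNATURE


# Demo with stand-in files: a fake PNG, a non-image, and a missing path.
with tempfile.TemporaryDirectory() as tmp:
    good = os.path.join(tmp, "good.png")
    bad = os.path.join(tmp, "bad.png")
    with open(good, "wb") as handle:
        handle.write(PNG_SIGNATURE + b"rest of file")
    with open(bad, "wb") as handle:
        handle.write(b"not an image")
    results = (is_png(good), is_png(bad), is_png(os.path.join(tmp, "missing.png")))

print(results)
```

In a real test the helper would be pointed at whatever file the render call writes; with the graphviz package and format="png" that is expected to be the given path with ".png" appended, but treat that as an assumption to verify.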
This uses:
- tempfile.TemporaryDirectory() to handle cleaning up temporary files for the images. This isn't available in Python 2.7, but a simple backport is included (the standard-library version goes a little further in cleaning up still, which should not be needed here).
- tempfile.mkstemp() to generate image filenames; those are guaranteed to be unique without needing to generate filenames based on the current timestamp.
This should fix #97.
Caveat: I haven't actually run this code, it's pure dead reckoning and linters. :-)
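The two standard-library pieces named in this description can be sketched together as follows. This is an illustration of the pattern, not the code in the pull request: the function `write_node_images` and the byte payloads are hypothetical, and the real module writes chart images rather than raw bytes.

```python
import os
import tempfile


def write_node_images(payloads):
    """Write each payload to a unique .png path inside a temporary directory.

    Returns the paths, whether they all existed while the directory was
    open, and whether any still exists after it has been cleaned up.
    """
    with tempfile.TemporaryDirectory() as tempdir:
        paths = []
        for payload in payloads:
            # mkstemp creates the file and guarantees a unique name.
            fd, path = tempfile.mkstemp(suffix=".png", dir=tempdir)
            with os.fdopen(fd, "wb") as handle:
                handle.write(payload)
            paths.append(path)
        existed_inside = all(os.path.exists(p) for p in paths)
    # Leaving the with-block removes the directory and everything in it.
    exists_after = any(os.path.exists(p) for p in paths)
    return paths, existed_inside, exists_after


paths, existed_inside, exists_after = write_node_images([b"a", b"b", b"c"])
print(len(set(paths)), existed_inside, exists_after)
```

The consequence for rendering is that anything reading the image files has to run before the with-block exits.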