jupyter / notebook

Jupyter Interactive Notebook
https://jupyter-notebook.readthedocs.io/
BSD 3-Clause "New" or "Revised" License

decision tree #3666

Open janejp01 opened 6 years ago

janejp01 commented 6 years ago

import pandas as pd
from sklearn.cross_validation import train_test_split
from sklearn import tree
from sklearn.externals.six import StringIO
from sklearn.ensemble import RandomForestClassifier
from IPython.display import Image
import os
import graphviz
import pydotplus

import matplotlib.pyplot as plt

# Prepare data set

data = pd.read_csv('newdata.csv')

X, y = data.iloc[:, 1:], data.iloc[:, 0]
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

# decision tree

clf = tree.DecisionTreeClassifier(max_depth=3)
clf.fit(X_train, y_train)
accuracy1 = clf.score(X_train, y_train)
accuracy2 = clf.score(X_test, y_test)

# analyzing decision tree, tree plot

feature_names = ['familyno', 'familytype', 'outcome', 'roomspace', 'medicalfee', 'gender', 'age', 'insurancetype', 'worktime', 'professiontype', 'obesity', 'hyperlipidemia', 'highbloodpressure', 'tress', 'tabacono']

dot_data = StringIO()

dot_data=tree.export_graphviz(clf,feature_names=feature_names)

# ,out_file="mytree.dot"

graph= graphviz.Source(dot_data)

graph.write_pdf('tree.pdf')

graph = pydotplus.graph_from_dot_data(dot_data)

Image(graph.create_png())

graph[0].write_pdf('tree.pdf')

# random forest

forest = RandomForestClassifier(n_estimators=5)
forest.fit(X_train, y_train)

accuracy3 = forest.score(X_test, y_test)

takluyver commented 6 years ago

Did you have an issue? This is just some code.

janejp01 commented 6 years ago

Why can't I get the tree graph when I run this code on Windows? Thank you for your suggestions and comments.
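
One thing that may be worth checking first: `pydotplus` and the `graphviz` Python package only wrap the Graphviz command-line programs, so `graph.create_png()` cannot produce an image unless the `dot` executable is installed and reachable through PATH. On Windows the Graphviz installer does not always add its `bin` folder to PATH, which is a common (though not the only possible) reason for a missing tree plot. The sketch below uses only the standard library; the function names `find_dot_executable` and `describe_graphviz_status` are made up for illustration and are not part of any library.

```python
import shutil


def find_dot_executable():
    """Return the full path to Graphviz's `dot` program, or None if it is not on PATH."""
    # shutil.which searches the directories listed in PATH, the same way the
    # shell does; on Windows it also tries the extensions in PATHEXT (e.g. .exe).
    return shutil.which("dot")


def describe_graphviz_status():
    """Build a short, human-readable message about whether `dot` can be found."""
    path = find_dot_executable()
    if path is None:
        return (
            "Graphviz 'dot' was not found on PATH. Install Graphviz and add its "
            "bin directory to PATH, then restart the notebook server."
        )
    return "Graphviz 'dot' found at: " + path


if __name__ == "__main__":
    print(describe_graphviz_status())
```

If `dot` is found and the plot still does not appear, another possibility is the return value of `tree.export_graphviz`: in scikit-learn releases before 0.20 it wrote to `tree.dot` by default and returned `None` unless `out_file=None` was passed, so `dot_data` may be empty in the code above.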