Chapter 5: problem in the classification decision tree code

zzh1161 commented 2 years ago

In Chapter 5, inside the train() function of the classification decision tree:

sub_train_df = train_data.loc[train_data[max_feature_name] ==
                                          f].drop([max_feature_name], axis=1)

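This drops the column of the current best feature (axis=1 removes a column, not a row), so looking features up by positional index afterwards goes wrong: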

# class Node
def predict(self, features):
    if self.root is True:
        return self.label
    return self.tree[features[self.feature]].predict(features)
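A minimal sketch of why the positional lookup breaks, assuming each node stores a position computed from the reduced frame while the test sample still has the full set of features. The toy data and column names are hypothetical and are not taken from the repository.

```python
import pandas as pd

# Hypothetical toy data; the column names are illustrative only.
train_data = pd.DataFrame({
    "age": ["young", "young", "old"],
    "has_job": ["no", "yes", "no"],
    "owns_house": ["no", "no", "yes"],
    "label": ["no", "yes", "yes"],
})

# Position of "owns_house" in the full column layout.
full_pos = list(train_data.columns).index("owns_house")

# Same step as in train(): keep the rows for one feature value,
# then drop the column that was just split on.
sub_train_df = train_data.loc[train_data["age"] == "young"].drop(["age"], axis=1)

# Position of the same feature inside the reduced frame.
sub_pos = list(sub_train_df.columns).index("owns_house")

# A test sample still carries every feature, in the original order.
test_row = ["old", "no", "yes"]

# Indexing with the reduced-frame position reads the wrong feature ("has_job").
value_by_sub_pos = test_row[sub_pos]

# Looking the value up by name is unaffected by the dropped column.
value_by_name = dict(zip(train_data.columns[:-1], test_row))["owns_house"]

print(full_pos, sub_pos, value_by_sub_pos, value_by_name)
```

Here the position of "owns_house" shifts from 2 to 1 once "age" is dropped, so the positional lookup silently returns the "has_job" value instead of raising an error.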

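I suggest changing the argument of predict() in the DTree class to a DataFrame. That way there is no need to index by the position of the current feature; the value is looked up directly by feature name: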

# class DTree
def predict(self, X):
    # X is a DataFrame; each row is passed down the tree as a Series
    m, n = np.shape(X)
    pred_res = []
    for i in range(m):
        temp = X.iloc[i, :]
        pred_res.append(self.tree.predict(temp))
    return pred_res

# class Node
def predict(self, test):
    if self.root is True:
        return self.label
    return self.tree[test[self.feature_name]].predict(test)
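For reference, a self-contained sketch of the name-based lookup. The Node class below is a simplified stand-in written for illustration, not the repository's class, and the tree is built by hand rather than by train(). Rows are plain dicts here; a pandas Series obtained from X.iloc[i, :] supports the same row[name] lookup.

```python
class Node:
    """Simplified stand-in for a tree node, keyed by feature name."""

    def __init__(self, root=True, label=None, feature_name=None):
        self.root = root                  # True for a leaf that holds a label
        self.label = label
        self.feature_name = feature_name  # name of the feature to split on
        self.tree = {}                    # feature value -> child Node

    def add_node(self, val, node):
        self.tree[val] = node

    def predict(self, row):
        # A leaf returns its label; otherwise descend by the named feature.
        if self.root:
            return self.label
        return self.tree[row[self.feature_name]].predict(row)


# Hand-built two-level tree (hypothetical feature names and values).
root = Node(root=False, feature_name="owns_house")
root.add_node("yes", Node(label="yes"))
job = Node(root=False, feature_name="has_job")
job.add_node("yes", Node(label="yes"))
job.add_node("no", Node(label="no"))
root.add_node("no", job)

rows = [
    {"age": "young", "has_job": "no", "owns_house": "no"},
    {"age": "old", "has_job": "yes", "owns_house": "no"},
    {"age": "old", "has_job": "no", "owns_house": "yes"},
]
predictions = [root.predict(r) for r in rows]
print(predictions)
```

Because every lookup goes through the feature name, dropping columns during training no longer matters at prediction time. A feature value that never appeared in training would still raise a KeyError in this sketch.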

AaronYin0514 commented 6 months ago

I ran into this problem too.

fengdu78 / lihang-code

Chapter 5: problem in the classification decision tree code #60