serengil / chefboost

A Lightweight Decision Tree Framework supporting regular algorithms: ID3, C4.5, CART, CHAID and Regression Trees; some advanced techniques: Gradient Boosting, Random Forest and Adaboost w/categorical features support for Python
https://www.youtube.com/watch?v=Z93qE5eb6eg&list=PLsS_1RYmYQQHp_xZObt76dpacY543GrJD&index=3
MIT License
456 stars, 101 forks

UnboundLocalError: local variable 'subdataset' referenced before assignment #30

Closed rahulramesh3321 closed 10 months ago

rahulramesh3321 commented 2 years ago

Hi,

I am getting the error below when running CHAID. With ID3, however, it runs successfully.

[INFO]: 4 CPU cores will be allocated in parallel running
CHAID tree is going to be built...

RemoteTraceback                           Traceback (most recent call last)
RemoteTraceback:
"""
Traceback (most recent call last):
  File "C:\Users\Rahul.Chandel\Anaconda3\lib\multiprocessing\pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
  File "C:\Users\Rahul.Chandel\Anaconda3\lib\site-packages\chefboost\training\Training.py", line 209, in createBranchWrapper
    return func(*args)
  File "C:\Users\Rahul.Chandel\Anaconda3\lib\site-packages\chefboost\training\Training.py", line 330, in createBranch
    results = buildDecisionTree(subdataset, root, file, config, dataset_features
  File "C:\Users\Rahul.Chandel\Anaconda3\lib\site-packages\chefboost\training\Training.py", line 533, in buildDecisionTree
    sub_results = createBranchWrapper(createBranch, input_param)
  File "C:\Users\Rahul.Chandel\Anaconda3\lib\site-packages\chefboost\training\Training.py", line 209, in createBranchWrapper
    return func(*args)
  File "C:\Users\Rahul.Chandel\Anaconda3\lib\site-packages\chefboost\training\Training.py", line 330, in createBranch
    results = buildDecisionTree(subdataset, root, file, config, dataset_features
  File "C:\Users\Rahul.Chandel\Anaconda3\lib\site-packages\chefboost\training\Training.py", line 533, in buildDecisionTree
    sub_results = createBranchWrapper(createBranch, input_param)
  File "C:\Users\Rahul.Chandel\Anaconda3\lib\site-packages\chefboost\training\Training.py", line 209, in createBranchWrapper
    return func(*args)
  File "C:\Users\Rahul.Chandel\Anaconda3\lib\site-packages\chefboost\training\Training.py", line 330, in createBranch
    results = buildDecisionTree(subdataset, root, file, config, dataset_features
  File "C:\Users\Rahul.Chandel\Anaconda3\lib\site-packages\chefboost\training\Training.py", line 432, in buildDecisionTree
    pivot = pd.DataFrame(subdataset.Decision.value_counts()).reset_index()
UnboundLocalError: local variable 'subdataset' referenced before assignment
"""

The above exception was the direct cause of the following exception:

UnboundLocalError Traceback (most recent call last)

in
      7
      8 df = df.drop('fill_ratio', axis =1)
----> 9 model = cb.fit(df, config = config)

~\Anaconda3\lib\site-packages\chefboost\Chefboost.py in fit(df, config, target_label, validation_df)
    211 functions.createFile(json_file, "[\n")
    212
--> 213 trees = Training.buildDecisionTree(df, root = root, file = file, config = config
    214     , dataset_features = dataset_features
    215     , parent_level = 0, leaf_id = 0, parents = 'root', validation_df = validation_df, main_process_id = process_id)

~\Anaconda3\lib\site-packages\chefboost\training\Training.py in buildDecisionTree(df, root, file, config, dataset_features, parent_level, leaf_id, parents, tree_id, validation_df, main_process_id)
    531 else: #serial
    532     for input_param in input_params:
--> 533         sub_results = createBranchWrapper(createBranch, input_param)
    534         for sub_result in sub_results:
    535             results.append(sub_result)

~\Anaconda3\lib\site-packages\chefboost\training\Training.py in createBranchWrapper(func, args)
    207
    208 def createBranchWrapper(func, args):
--> 209     return func(*args)
    210
    211 def createBranch(config, current_class, subdataset, numericColumn, branch_index, winner_name, winner_index, root, parents, file, dataset_features, num_of_instances, metric, tree_id = 0, main_process_id = None):

~\Anaconda3\lib\site-packages\chefboost\training\Training.py in createBranch(config, current_class, subdataset, numericColumn, branch_index, winner_name, winner_index, root, parents, file, dataset_features, num_of_instances, metric, tree_id, main_process_id)
    328 parents = copy.copy(leaf_id)
    329
--> 330 results = buildDecisionTree(subdataset, root, file, config, dataset_features
    331     , root-1, leaf_id, parents, tree_id = tree_id, main_process_id = main_process_id)
    332

~\Anaconda3\lib\site-packages\chefboost\training\Training.py in buildDecisionTree(df, root, file, config, dataset_features, parent_level, leaf_id, parents, tree_id, validation_df, main_process_id)
    531 else: #serial
    532     for input_param in input_params:
--> 533         sub_results = createBranchWrapper(createBranch, input_param)
    534         for sub_result in sub_results:
    535             results.append(sub_result)

~\Anaconda3\lib\site-packages\chefboost\training\Training.py in createBranchWrapper(func, args)
    207
    208 def createBranchWrapper(func, args):
--> 209     return func(*args)
    210
    211 def createBranch(config, current_class, subdataset, numericColumn, branch_index, winner_name, winner_index, root, parents, file, dataset_features, num_of_instances, metric, tree_id = 0, main_process_id = None):

~\Anaconda3\lib\site-packages\chefboost\training\Training.py in createBranch(config, current_class, subdataset, numericColumn, branch_index, winner_name, winner_index, root, parents, file, dataset_features, num_of_instances, metric, tree_id, main_process_id)
    328 parents = copy.copy(leaf_id)
    329
--> 330 results = buildDecisionTree(subdataset, root, file, config, dataset_features
    331     , root-1, leaf_id, parents, tree_id = tree_id, main_process_id = main_process_id)
    332

~\Anaconda3\lib\site-packages\chefboost\training\Training.py in buildDecisionTree(df, root, file, config, dataset_features, parent_level, leaf_id, parents, tree_id, validation_df, main_process_id)
    531 else: #serial
    532     for input_param in input_params:
--> 533         sub_results = createBranchWrapper(createBranch, input_param)
    534         for sub_result in sub_results:
    535             results.append(sub_result)

~\Anaconda3\lib\site-packages\chefboost\training\Training.py in createBranchWrapper(func, args)
    207
    208 def createBranchWrapper(func, args):
--> 209     return func(*args)
    210
    211 def createBranch(config, current_class, subdataset, numericColumn, branch_index, winner_name, winner_index, root, parents, file, dataset_features, num_of_instances, metric, tree_id = 0, main_process_id = None):

~\Anaconda3\lib\site-packages\chefboost\training\Training.py in createBranch(config, current_class, subdataset, numericColumn, branch_index, winner_name, winner_index, root, parents, file, dataset_features, num_of_instances, metric, tree_id, main_process_id)
    328 parents = copy.copy(leaf_id)
    329
--> 330 results = buildDecisionTree(subdataset, root, file, config, dataset_features
    331     , root-1, leaf_id, parents, tree_id = tree_id, main_process_id = main_process_id)
    332

~\Anaconda3\lib\site-packages\chefboost\training\Training.py in buildDecisionTree(df, root, file, config, dataset_features, parent_level, leaf_id, parents, tree_id, validation_df, main_process_id)
    531 else: #serial
    532     for input_param in input_params:
--> 533         sub_results = createBranchWrapper(createBranch, input_param)
    534         for sub_result in sub_results:
    535             results.append(sub_result)

~\Anaconda3\lib\site-packages\chefboost\training\Training.py in createBranchWrapper(func, args)
    207
    208 def createBranchWrapper(func, args):
--> 209     return func(*args)
    210
    211 def createBranch(config, current_class, subdataset, numericColumn, branch_index, winner_name, winner_index, root, parents, file, dataset_features, num_of_instances, metric, tree_id = 0, main_process_id = None):

~\Anaconda3\lib\site-packages\chefboost\training\Training.py in createBranch(config, current_class, subdataset, numericColumn, branch_index, winner_name, winner_index, root, parents, file, dataset_features, num_of_instances, metric, tree_id, main_process_id)
    328 parents = copy.copy(leaf_id)
    329
--> 330 results = buildDecisionTree(subdataset, root, file, config, dataset_features
    331     , root-1, leaf_id, parents, tree_id = tree_id, main_process_id = main_process_id)
    332

~\Anaconda3\lib\site-packages\chefboost\training\Training.py in buildDecisionTree(df, root, file, config, dataset_features, parent_level, leaf_id, parents, tree_id, validation_df, main_process_id)
    531 else: #serial
    532     for input_param in input_params:
--> 533         sub_results = createBranchWrapper(createBranch, input_param)
    534         for sub_result in sub_results:
    535             results.append(sub_result)

~\Anaconda3\lib\site-packages\chefboost\training\Training.py in createBranchWrapper(func, args)
    207
    208 def createBranchWrapper(func, args):
--> 209     return func(*args)
    210
    211 def createBranch(config, current_class, subdataset, numericColumn, branch_index, winner_name, winner_index, root, parents, file, dataset_features, num_of_instances, metric, tree_id = 0, main_process_id = None):

~\Anaconda3\lib\site-packages\chefboost\training\Training.py in createBranch(config, current_class, subdataset, numericColumn, branch_index, winner_name, winner_index, root, parents, file, dataset_features, num_of_instances, metric, tree_id, main_process_id)
    328 parents = copy.copy(leaf_id)
    329
--> 330 results = buildDecisionTree(subdataset, root, file, config, dataset_features
    331     , root-1, leaf_id, parents, tree_id = tree_id, main_process_id = main_process_id)
    332

~\Anaconda3\lib\site-packages\chefboost\training\Training.py in buildDecisionTree(df, root, file, config, dataset_features, parent_level, leaf_id, parents, tree_id, validation_df, main_process_id)
    519
    520 for f in funclist:
--> 521     branch_results = f.get(timeout = 100000)
    522
    523     for branch_result in branch_results:

~\Anaconda3\lib\multiprocessing\pool.py in get(self, timeout)
    769             return self._value
    770         else:
--> 771             raise self._value
    772
    773     def _set(self, i, obj):

UnboundLocalError: local variable 'subdataset' referenced before assignment

serengil commented 2 years ago

I need your dataset and configuration to understand the problem.

rahulramesh3321 commented 2 years ago

Please find the encoded data attached. The values are categorical, and almost all the variables contain similar kinds of data. Thanks in advance.

sample_data_for_chaid.xlsx

serengil commented 10 months ago

Closed with PR - https://github.com/serengil/chefboost/pull/46
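
For readers who land here with the same message: the thread does not spell out the root cause, so the following is only a minimal sketch of how Python raises `UnboundLocalError` in general, not chefboost's actual code. The function names, the `algorithm` argument and the sample labels are all hypothetical. The pattern is a local variable that is assigned on only some code paths and then read on every path.

```python
def majority_label(rows, algorithm):
    """Hypothetical helper (not from chefboost) that reproduces the error pattern."""
    if algorithm == "ID3":
        # `subset` is only assigned on this path.
        subset = [r for r in rows if r is not None]
    # For any other algorithm `subset` was never bound, so reading it
    # here raises UnboundLocalError.
    return max(set(subset), key=subset.count)


def majority_label_fixed(rows, algorithm):
    """Same logic, but `subset` is bound before any branch can skip it."""
    subset = list(rows)
    if algorithm == "ID3":
        subset = [r for r in subset if r is not None]
    return max(set(subset), key=subset.count)


if __name__ == "__main__":
    labels = ["yes", "no", "yes"]
    print(majority_label(labels, "ID3"))
    print(majority_label_fixed(labels, "CHAID"))
    try:
        majority_label(labels, "CHAID")
    except UnboundLocalError as err:
        print("UnboundLocalError:", err)
```

In this sketch the fix is to bind the variable unconditionally before the branch. Whether the linked PR takes the same approach is not stated in this thread.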