After following the instructions to set up GENESIM, I was trying to run it on just one dataset (the wine dataset), but I get exceptions when the CART algorithm tries to build the trees. Following is the snippet that I get:

Before CART, it built trees with xgboost (although with exceptions related to Bayesian Optimization). What could be the possible reason for this?
Hello, could you try adapting https://github.com/IBCNServices/GENESIM/blob/master/constructors/treeconstructor.py#L318 to `min_samples_splits = np.arange(2, 20, 1)` to see if the problem is fixed? It appears that the grid search tries the value 1, which now raises an exception in newer versions of sklearn.
Alternatively, you can pass `param_opt=False` in the `construct_classifier` call.
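For reference, a minimal sketch of both workarounds (the surrounding code and the `construct_classifier` arguments below are assumptions for illustration, not the actual code from the repository):

```python
import numpy as np

# Workaround 1: start the hyperparameter grid at 2, since newer sklearn
# versions raise an exception for min_samples_split=1.
min_samples_splits = np.arange(2, 20, 1)

# Workaround 2: skip the parameter optimization altogether. Only param_opt
# is taken from this thread; the other arguments are placeholders.
# clf = constructor.construct_classifier(train_df, features, label_col, param_opt=False)
```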
About the exceptions from xgboost: were those warnings or real exceptions?
Moreover, if specific algorithms fail, you can just remove them from the `algorithms` dictionary at the top of the example script (https://github.com/IBCNServices/GENESIM/blob/master/example.py#L29) (this is a very easy solution ;) )
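Concretely, that removal might look like the following (the key names and stand-in values are illustrative only; the real dictionary lives in example.py):

```python
# Illustrative sketch only; check example.py#L29 for the actual entries
# (the constructor objects are replaced by None stand-ins here).
algorithms = {
    'cart': None,
    'xgboost': None,
    'quest': None,
}
del algorithms['quest']  # drop any algorithm that fails on your setup
```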
> could you try adapting https://github.com/IBCNServices/GENESIM/blob/master/constructors/treeconstructor.py#L318 to `min_samples_splits = np.arange(2, 20, 1)` to see if the problem is fixed?
I changed what you suggested, along with https://github.com/IBCNServices/GENESIM/blob/master/constructors/treeconstructor.py#L247, where the argument `min_samples_split` was being set to `self.min_samples_leaf`. I changed it to:

```python
# min_samples_split now gets its own attribute instead of self.min_samples_leaf
self.dt = DecisionTreeClassifier(criterion=self.criterion, min_samples_leaf=self.min_samples_leaf,
                                 min_samples_split=self.min_samples_split, max_depth=self.max_depth)
```
It worked after this.
> About the exceptions from xgboost: were those warnings or real exceptions?
A set of warnings like the following:

```
UserWarning: fmin_l_bfgs_b terminated abnormally with the state: {'warnflag': 2, 'task': 'ABNORMAL_TERMINATION_IN_LNSRCH', 'grad': array([ 1.16287611e-05]), 'nit': 5, 'funcalls': 50}
```
> Moreover, if specific algorithms fail, you can just remove them from the `algorithms` dictionary at the top of the example script (https://github.com/IBCNServices/GENESIM/blob/master/example.py#L29) (this is a very easy solution ;) )
I tried that already :). I was actually running it on my Windows machine and ran into trouble with the QUEST trees, where arguments are passed via `subprocess`. I will try running it on an Ubuntu machine.
I get the same warnings (they come from the https://github.com/fmfn/BayesianOptimization library).
I have never run it on a Windows machine before; I hope you can get it to work! Can I ask what you are trying to use my library for?
I have created an ensemble of decision trees with a GBM algorithm. Now I want to combine them into a single tree so that I can visualize my model's output and present it. But I am wondering how I would integrate my trees with your algorithm. I guess I will either have to convert them to the format of your decisiontree class, or implement the mutation and crossover parts on my own to get the final tree. Any suggestions are most welcome on this :)

As for running it on Windows: it definitely is a pain and I wouldn't recommend it, but I somehow got it set up. Still, the subprocess issue seems to be a roadblock.
Hello, implementing the interface for GBM is one possibility (if you do, definitely create a pull request for it). Alternatively, it can be done with minimal adaptation of the `genetic_algorithm` function (https://github.com/IBCNServices/GENESIM/blob/master/constructors/genesim.py#L378). At the beginning of that function, an ensemble is constructed (`tree_list`); this construction can be removed from the function and replaced by a new parameter. You would then just have to extract the decision trees from your GBM, convert them to my decisiontree object, and pass them along as that parameter; a sketch of the idea follows below.

There's an issue open for removing the ensemble construction from the `genetic_algorithm` function, so feel free to create a pull request for that as well if you choose this route :)
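A minimal sketch of that idea, assuming an sklearn `GradientBoostingClassifier` as the GBM; `convert_to_genesim_tree` and the `tree_list` parameter of `genetic_algorithm` are hypothetical, since neither exists in the repository yet:

```python
from sklearn.datasets import load_wine
from sklearn.ensemble import GradientBoostingClassifier

# Fit a GBM on the wine dataset mentioned above.
X, y = load_wine(return_X_y=True)
gbm = GradientBoostingClassifier(n_estimators=20, max_depth=3).fit(X, y)

# gbm.estimators_ is an (n_stages, n_classes) array of DecisionTreeRegressor
# objects; flattening it yields the individual trees of the ensemble.
sk_trees = [tree for stage in gbm.estimators_ for tree in stage]

# Hypothetical conversion step: each sklearn tree would still have to be
# translated into GENESIM's decisiontree object, e.g.:
#   tree_list = [convert_to_genesim_tree(t) for t in sk_trees]
# and then handed to genetic_algorithm via the proposed tree_list parameter.
```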
Any updates, @naveenkaushik2504? Otherwise I'm closing this issue.
No updates, actually. I moved on to another project and will explore this further later. Closing the issue.