jr2021 opened 2 years ago
Ideally, yes. But there are about 19 predictors in the original ensemble. Let's focus for now on extending the ZC case to the tree-based predictors, which are `LGBoost`, `NGBoost`, and `RandomForestPredictor`.
The other predictors must remain available, of course, but without the option of using a ZC predictor. This also means that the `get_ensemble` method must be modified to return the set of predictors based on `self.zc`.
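That branching could look roughly like the following minimal sketch. The predictor names and the `OTHER` set here are placeholders, not the actual 19 predictors in the repository; only the idea of filtering on `self.zc` comes from the discussion above:

```python
# Hypothetical predictor name sets; only the tree-based ones get the ZC extension.
TREE_BASED = {"xgb", "lgb", "ngb", "rf"}   # support the zero-cost case
OTHER = {"gp", "bananas", "mlp"}           # available, but not with a ZC predictor

class Ensemble:
    def __init__(self, zc=False):
        self.zc = zc

    def get_ensemble(self):
        # With zc=True, only the tree-based predictors are valid choices;
        # otherwise, every predictor remains available.
        if self.zc:
            return sorted(TREE_BASED)
        return sorted(TREE_BASED | OTHER)
```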
Sounds good.
We found that in the zerocost branch, the `XGBoost` class contains three functions specific to the zero-cost case, `set_pre_computations`, `_verify_zc_info`, and `_set_zc_names`, which are also applicable to the other tree-based predictors. In order not to duplicate these functions, we placed them in the `BaseTree` class, which is the parent class of all tree-based predictors.
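Structurally, that refactor looks something like this sketch. The method names come from the thread, but the bodies are purely illustrative stand-ins for the real zero-cost logic:

```python
class BaseTree:
    """Parent of the tree-based predictors. The zero-cost helpers live here
    so that XGBoost, LGBoost, NGBoost, and RandomForestPredictor inherit
    them instead of each defining their own copies."""

    def set_pre_computations(self, zc_info=None):
        # Shared zero-cost setup (body hypothetical).
        self._verify_zc_info(zc_info)
        self._set_zc_names(zc_info)
        self.zc_info = zc_info

    def _verify_zc_info(self, zc_info):
        if zc_info is None:
            raise ValueError("zero-cost info is required when zc=True")

    def _set_zc_names(self, zc_info):
        self.zc_names = sorted(zc_info)

class XGBoost(BaseTree):
    pass  # inherits the zero-cost helpers unchanged
```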
One small remaining issue is a discrepancy between the zerocost and Develop implementations of `fit` in `XGBoost`.
In the Develop branch, it is possible for the user to load custom hyperparameters from a config file:

```python
def fit(self, xtrain, ytrain, train_info=None, params=None, **kwargs):
    if self.hparams_from_file and self.hparams_from_file not in ['False', 'None'] \
            and os.path.exists(self.hparams_from_file):
        self.hyperparams = json.load(open(self.hparams_from_file, 'rb'))['xgb']
        print('loaded hyperparams from', self.hparams_from_file)
    elif self.hyperparams is None:
        self.hyperparams = self.default_hyperparams.copy()
    return super(XGBoost, self).fit(xtrain, ytrain, train_info, params, **kwargs)
```
while in the zerocost branch, this is not an option:

```python
def fit(self, xtrain, ytrain, train_info=None, params=None, **kwargs):
    if self.hyperparams is None:
        self.hyperparams = self.default_hyperparams.copy()
    return super(XGBoost, self).fit(xtrain, ytrain, train_info, params, **kwargs)
```
Which functionality should be adopted in the Develop_copy branch? Is this a case where the code in the zerocost branch should be taken as the more updated version?
Best to be able to read from a config file, too.
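Under that decision, the merged `fit` would essentially be the Develop version. A minimal, self-contained sketch (the `BaseTree` stub below is only there to make the example runnable; the real class does the actual training):

```python
import json
import os

class BaseTree:
    # Stub standing in for the real parent class.
    hparams_from_file = None
    hyperparams = None
    default_hyperparams = {'max_depth': 6}

    def fit(self, xtrain, ytrain, train_info=None, params=None, **kwargs):
        return self.hyperparams  # stand-in for the real training logic

class XGBoost(BaseTree):
    def fit(self, xtrain, ytrain, train_info=None, params=None, **kwargs):
        # Keep the Develop behaviour: load hyperparameters from a config
        # file when one is given, otherwise fall back to the defaults.
        if self.hparams_from_file and self.hparams_from_file not in ['False', 'None'] \
                and os.path.exists(self.hparams_from_file):
            with open(self.hparams_from_file) as f:
                self.hyperparams = json.load(f)['xgb']
        elif self.hyperparams is None:
            self.hyperparams = self.default_hyperparams.copy()
        return super().fit(xtrain, ytrain, train_info, params, **kwargs)
```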
In the zerocost branch, `Ensemble` has been extended to the zero-cost case and contains a single option for its base predictor, `XGBoost`. The `XGBoost` predictor has been adapted to the zero-cost case by implementing a `set_pre_computations` function, as well as by modifying the `BaseTree` class. Currently, the `Ensemble` class supports only this single predictor. Should all other predictors be available in the merged `Ensemble` class and extended to the zero-cost case?