Open drewoldag opened 1 year ago
So, a lot of the estimators have native representations of ensembles. How would you propose to handle this in those cases?
In those cases, the default value of the configuration parameter for that stage would just be the (known, for that stage) native parameterization, no?
A couple of thoughts. 1) I think we should do this in a way that only touches the base class code, not any of the subclasses, since touching the subclasses would be rather disruptive. This is going to be kind of tricky because we don't just write the ensemble at the end; rather, we allocate the memory at the beginning of run() and then fill it in from the parallel processes. I.e., we will have to modify the _run() and _do_chunk_output() methods to do this.
2) I think a better solution than requiring parameters for the output representation would be to use parameters that default to None but that allow you to force the qp representation to a particular type.
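To make point 1 concrete, here is a toy sketch of the allocate-then-fill pattern and why it complicates the choice of output representation. It does not use RAIL or qp: ToyStage, write_chunk, and the list-of-lists storage are all hypothetical stand-ins for illustration, not the actual _run() / _do_chunk_output() code.

```python
class ToyStage:
    """Hypothetical stand-in for a stage that preallocates its output."""

    def __init__(self, n_rows, n_grid):
        # The storage layout is fixed up front, so the representation
        # (here: n_grid values per row) must be known before any chunk arrives.
        self.n_grid = n_grid
        self.output = [[0.0] * n_grid for _ in range(n_rows)]

    def write_chunk(self, start, chunk):
        """Copy one chunk of per-row grid values into the preallocated output."""
        for offset, row in enumerate(chunk):
            if len(row) != self.n_grid:
                raise ValueError("chunk does not match the allocated layout")
            self.output[start + offset] = list(row)


stage = ToyStage(n_rows=4, n_grid=3)
# Chunks may arrive from different workers; each one fills its own row range.
stage.write_chunk(0, [[0.1, 0.8, 0.1], [0.3, 0.4, 0.3]])
stage.write_chunk(2, [[0.5, 0.5, 0.0], [0.0, 0.2, 0.8]])
```

In this toy version the layout is decided at allocation time, so a different output representation would have to be chosen before allocation or applied to each chunk before it is written.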
The function
qp.factory.convert(in_dist, class_name, **kwds)
used as
new_ensemble = qp.factory.convert(orig_ensemble, self.config.qp_output_classname, **self.config.qp_output_class_pars)
or
qp.Ensemble.convert_to(self, to_class, **kwargs)
used as
new_ensemble = orig_ensemble.convert_to(qp.factory.stats[self.config.qp_output_classname], **self.config.qp_output_class_pars)
would allow you to convert from one representation to another.
So, this could be something like:
if self.config.qp_output_classname is not None:
    new_ensemble = orig_ensemble.convert_to(qp.factory.stats[self.config.qp_output_classname], **self.config.qp_output_class_pars)
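As a minimal, self-contained sketch of the "default to None, convert only when forced" idea, the snippet below uses a hypothetical ToyEnsemble in place of qp.Ensemble so that it runs without qp installed. The helper name maybe_convert and the choice to pass the target as a plain string are assumptions for illustration; the real conversion would go through qp as in the lines above.

```python
class ToyEnsemble:
    """Hypothetical stand-in for qp.Ensemble; it only records its representation."""

    def __init__(self, rep_name, **pars):
        self.rep_name = rep_name
        self.pars = pars

    def convert_to(self, to_class, **kwargs):
        # A real conversion would re-parameterize the PDFs; this one only relabels.
        return ToyEnsemble(to_class, **kwargs)


def maybe_convert(ensemble, output_classname=None, output_class_pars=None):
    """Return the ensemble unchanged unless a target representation is forced."""
    if output_classname is None:
        return ensemble  # keep the estimator's native representation
    return ensemble.convert_to(output_classname, **(output_class_pars or {}))


native = ToyEnsemble("interp")
same = maybe_convert(native)  # no override configured
forced = maybe_convert(native, "hist", {"bins": [0.0, 1.0, 2.0]})
```

With both parameters left at None, estimators that already have a native representation are unaffected, which matches the suggestion that the override be optional rather than required.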
Currently almost all subclasses of rail_base.estimator.CatEstimator store the resulting qp.Ensembles using a qp.interp gridded representation. We should add a new configuration parameter to allow users to select which qp representation is preferred (e.g. qp.hist, qp.spline, qp.packed_interp, etc.).
The work here is similar to issue #11 in that the work in this repository (rail_base) is relatively small, but the work to respect the new configuration parameter in all of the subclasses of CatEstimator will be substantial.
Also note that several Jupyter notebooks will likely need updates as well, but we do not currently have an exhaustive list of which notebooks will be affected.