Open fsaforo1 opened 12 months ago
Hi @fsaforo1 -- This is very interesting. To be honest, I don't fully understand the implications of this cooperative objective vs training all the features in a single model. If you're interested in meeting with us and discussing it, send us an email at interpret@microsoft.com
Just a couple of quick thoughts that come to mind:

1) Unless you are using an identity link function, you probably want to apply the link function to the predictions returned from self.model.predict, then find the optimized weights in the additive domain. During predict you'll want to apply the link function again, re-weight the predictions, then apply the inverse link function.

2) merge_ebms does indeed currently require identical feature sets, but it does not require identical additive terms. One quick hack to make this work would be to build the two EBMs using a superset of the features. You can use the "exclude" parameter of the __init__ function to exclude the B terms from the A model that you build, and vice versa. You'll also need to exclude all the possible interaction terms between the features that you don't want to cross-contaminate. There's another tricky aspect: when you merge EBMs where some terms are missing from the other model, merge_ebms currently treats those term values in the other EBM as essentially zero, which means averaging will decrease their contribution, whereas for this merge you want them to maintain their full strength. You can fix this by scaling the models by a factor of 2.0 prior to merging, given that they share the same 'y' in this example (see: https://interpret.ml/docs/ExplainableBoostingClassifier.html#interpret.glassbox.ExplainableBoostingClassifier.scale). A rough sketch follows at the end of this comment.

3) We also expose a "measure_interactions" function that allows you to customize interaction detection. This might be useful if you want to allow pairs across the A/B feature separation. You can then re-train your models while specifying the interactions explicitly: https://interpret.ml/docs/measure_interactions.html
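A minimal sketch of the hack in (2), assuming a regression setting; the column names and data are made up, and the exact exclude/interactions/scale usage reflects my reading of the docs rather than a tested recipe:

```python
from itertools import combinations
import numpy as np
import pandas as pd
from interpret.glassbox import ExplainableBoostingRegressor, merge_ebms

# Hypothetical view columns; X holds the superset of both views.
A_feats = ["temp", "wind", "humidity"]
B_feats = ["pm25", "no2", "o3"]

rng = np.random.default_rng(0)
X = pd.DataFrame(rng.normal(size=(500, 6)), columns=A_feats + B_feats)
y = X["temp"] + X["pm25"] ** 2 + rng.normal(scale=0.1, size=500)

# Model A: exclude all B terms, and only allow within-A pairs so that
# no interaction term cross-contaminates the views.
ebm_a = ExplainableBoostingRegressor(
    exclude=B_feats, interactions=list(combinations(A_feats, 2))
)
ebm_a.fit(X, y)

# Model B: the mirror image.
ebm_b = ExplainableBoostingRegressor(
    exclude=A_feats, interactions=list(combinations(B_feats, 2))
)
ebm_b.fit(X, y)

# merge_ebms treats terms absent from one model as zero in that model,
# so plain averaging would halve each view's contribution; pre-scaling
# by 2.0 compensates (valid here because both models share the same y).
ebm_a.scale(2.0)
ebm_b.scale(2.0)

cooperative_ebm = merge_ebms([ebm_a, ebm_b])
```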
@fsaforo1, you might find this other thread regarding reweighing terms interesting https://github.com/interpretml/interpret/issues/460
@fsaforo1 You might be interested in this package for multi-view/multi-modal data: https://mvlearn.github.io/ . Maybe you can use an EBM as the model for each of the 3 views you mentioned, then train those 3 EBM models with mvlearn so they are trained in a way that accounts for complementary views with differing statistical properties.
@interpret-ml @paulbkochms
First off, thanks for building this amazing tool!
The Request
I am interested in exploring the implementation of cooperative learning in EBMs through a specialized loss objective. This objective would train an ensemble of additive models, each corresponding to a different feature set (view), and encourage these models to work in a cooperative manner.
Practical Example: Air Quality and Public Health Modeling
Scenario: Environmental scientists are tasked with assessing the influence of air pollution on public health within urban settings. They collate data from diverse streams:

- Meteorological data (view A)
- Air pollutant concentration measurements (view B)
- Socioeconomic indicators (view C)

Meteorological patterns are known to modulate pollutant dispersal and concentrations, which in turn have direct consequences for health outcomes. Socioeconomic factors further modulate a population's exposure and susceptibility to pollution-related health risks.
Proposed Objective Function
I propose optimizing the following cooperative loss objective, considering only the first two views for simplicity:
$$ \min_{f, g} \; \frac{1}{2} \sum_i \left( y_i - \sum_j f_j(A_{ij}) - \sum_k g_k(B_{ik}) \right)^2 + \frac{\rho}{2} \sum_i \left( \sum_j f_j(A_{ij}) - \sum_k g_k(B_{ik}) \right)^2 $$
Where:

- $y_i$ is the response for observation $i$;
- $A_{ij}$ and $B_{ik}$ are the features of views A and B, respectively;
- $f_j$ and $g_k$ are the per-feature shape functions learned for each view;
- $\rho \ge 0$ controls the strength of the agreement penalty between the two views.
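To see how $\rho$ acts, write $f_i = \sum_j f_j(A_{ij})$ and $g_i = \sum_k g_k(B_{ik})$. Holding $g$ fixed and setting the derivative of the objective with respect to $f_i$ to zero gives:

$$ -(y_i - f_i - g_i) + \rho\,(f_i - g_i) = 0 \quad\Longrightarrow\quad f_i = \frac{y_i - (1-\rho)\,g_i}{1+\rho} $$

with the symmetric update for $g_i$. At $\rho = 0$ this is ordinary backfitting on the residual $y_i - g_i$; at $\rho = 1$ the update reduces to $f_i = y_i/2$, so each view fits half the response independently.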
Implication of the $\rho$ Parameter: The parameter $\rho$ tunes the degree of cooperation between the different data views. As the update above shows, $\rho = 0$ recovers ordinary backfitting of a single additive model over all features (early fusion), $\rho = 1$ decouples the views so each independently fits half the response (akin to late fusion by averaging), and intermediate values trade off individual fit against cross-view agreement.
What the $\rho$ parameter could be doing during training: $\rho$ essentially controls the point in training at which what has been learned from the different views is combined, ranging from fully joint fitting to fully independent fitting. A sketch of the alternating scheme this suggests follows below.
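Here is a minimal sketch of the alternating ("one-at-a-time") scheme the per-view update above suggests, with two EBMs refit from scratch each round purely for clarity; a native implementation would presumably interleave the boosting rounds instead. The function and data are illustrative, not an existing interpret API:

```python
import numpy as np
import pandas as pd
from interpret.glassbox import ExplainableBoostingRegressor

def fit_cooperative_ebms(X_a, X_b, y, rho=0.5, n_rounds=5):
    """Alternate the closed-form per-view updates:
    fit f to (y - (1 - rho) * g) / (1 + rho), then the mirror image for g."""
    pred_b = np.zeros(len(y))
    ebm_a = ebm_b = None
    for _ in range(n_rounds):
        # Update view A against the rho-adjusted residual of view B.
        ebm_a = ExplainableBoostingRegressor()
        ebm_a.fit(X_a, (y - (1 - rho) * pred_b) / (1 + rho))
        pred_a = ebm_a.predict(X_a)

        # Update view B against the rho-adjusted residual of view A.
        ebm_b = ExplainableBoostingRegressor()
        ebm_b.fit(X_b, (y - (1 - rho) * pred_a) / (1 + rho))
        pred_b = ebm_b.predict(X_b)
    return ebm_a, ebm_b

# Made-up two-view data for illustration.
rng = np.random.default_rng(0)
X_a = pd.DataFrame(rng.normal(size=(300, 2)), columns=["temp", "wind"])
X_b = pd.DataFrame(rng.normal(size=(300, 2)), columns=["pm25", "no2"])
y = X_a["temp"].to_numpy() + X_b["pm25"].to_numpy() ** 2

ebm_a, ebm_b = fit_cooperative_ebms(X_a, X_b, y, rho=0.5)
y_hat = ebm_a.predict(X_a) + ebm_b.predict(X_b)
```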
Some Thoughts on Potential Implementation
- group interaction pairs: Is it possible to automatically assess systematic pairs of interactions, where a set of features is interacted with a single feature (or with another set of features)? Example of explicitly defining such a systematic pair of interactions: `ExplainableBoostingRegressor(interactions=[([X_feats], [B_feats]), ([C_feats], 'feat_12')])`. Ideally, the strength of such group pairs could be assessed automatically during training; a sketch of approximating this with the current API follows this list.
- merge_ebms: This could be possible with a method similar to `merge_ebms` if:
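For reference, something in the spirit of the group-pair idea can be approximated with the current API by enumerating cross-view pairs and ranking them with measure_interactions. A sketch with made-up data, assuming measure_interactions accepts an explicit candidate list and returns (term, strength) pairs sorted strongest-first, per my reading of its docs:

```python
from itertools import product
import numpy as np
import pandas as pd
from interpret.glassbox import ExplainableBoostingRegressor
from interpret.utils import measure_interactions

A_feats = ["temp", "wind"]   # hypothetical view A columns
B_feats = ["pm25", "no2"]    # hypothetical view B columns

rng = np.random.default_rng(0)
X = pd.DataFrame(rng.normal(size=(500, 4)), columns=A_feats + B_feats)
y = X["temp"] * X["pm25"] + rng.normal(scale=0.1, size=500)

# Enumerate every cross-view pair (by column index) in lieu of a true
# "group" interaction term.
cross_pairs = [
    (X.columns.get_loc(a), X.columns.get_loc(b))
    for a, b in product(A_feats, B_feats)
]

# Rank only those candidates by measured interaction strength.
ranked = measure_interactions(X, y, interactions=cross_pairs)

# Re-train with the strongest cross-view pairs specified explicitly.
top_pairs = [term for term, _strength in ranked[:2]]
ebm = ExplainableBoostingRegressor(interactions=top_pairs)
ebm.fit(X, y)
```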
Some Practical Rationale for Cooperative Learning
Utilizing cooperative learning, researchers could harness the various data views for enhanced predictions and insights. In the context of the air quality problem above, the aim would be to combine the meteorological, pollutant, and socioeconomic views so that each view reinforces, rather than merely supplements, the others.