New style:
Thanks for this, these plots look like a very useful improvement. I haven't reviewed the code yet, but I'm noticing in the Gains plot that the Leaves stats are cropped?
Sorry, I was just not handling zero values. Now it is fixed.
Just a comment: right now I added a dictionary like `{feature_name: feature_id}` to the ensembleDict created by the xgboost converter. This should be added to all the converters.
I have not done it yet, but as a workaround, if this dictionary is not present in the ensembleDict, the plot just shows feat_0, feat_1, ... instead of the actual feature names.
I don't know if all the other libraries save the feature names like xgboost does. In that case, a solution could be to allow passing the feature names as an argument of the profile method.
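A minimal sketch of the fallback described above (the `feature_map` key, helper name, and `n_features` argument are illustrative, not the actual conifer internals):

```python
def get_feature_labels(ensemble_dict, n_features):
    """Return plot labels: real names if the converter stored a
    {feature_name: feature_id} map, otherwise generic feat_<i> labels."""
    feature_map = ensemble_dict.get('feature_map')
    if feature_map is None:
        # Converter did not store names -> fall back to feat_0, feat_1, ...
        return [f'feat_{i}' for i in range(n_features)]
    # Sort by feature id so the labels line up with the feature indices
    return [name for name, _ in sorted(feature_map.items(), key=lambda kv: kv[1])]
```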
Nice, the new plot looks great 👍
I think the addition of the `feature_map` is good. I would suggest integrating it a bit more deeply, by that I mean:

- add `feature_map` to the `ensemble_fields` of `ModelBase` so that it's always expected
- for converters other than `xgboost`, set it to a dictionary like `{str(i) : i for i in range(n_features)}`
Right now it's only used in the profiling obviously, but if it's available I could imagine using it in more places in the backends, so having it always set to something would potentially help avoid lots of `if model.feature_map is None` later.
Then we/I can set it correctly for the other converters that support it in follow-up PRs later.
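Roughly what that integration could look like, as a sketch only: the field list and constructor below are hypothetical, not the actual `ModelBase` implementation; only `ensemble_fields`, `feature_map`, `n_features`, and the default dictionary come from the discussion above.

```python
class ModelBase:
    # 'feature_map' listed alongside the other expected ensemble fields
    # (the rest of this list is illustrative)
    ensemble_fields = ['max_depth', 'n_trees', 'n_features', 'feature_map']

    def __init__(self, ensembleDict):
        # Copy each expected field from the converter's ensembleDict
        for field in self.ensemble_fields:
            setattr(self, field, ensembleDict.get(field))
        if self.feature_map is None:
            # Default for converters that don't provide feature names yet,
            # so backends never need `if model.feature_map is None` checks
            self.feature_map = {str(i): i for i in range(self.n_features)}

# e.g. ModelBase({'max_depth': 3, 'n_trees': 10, 'n_features': 4}).feature_map
# -> {'0': 0, '1': 1, '2': 2, '3': 3}
```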
I have applied all the suggestions.
This is what the "both" option looks like. Also, I made "both" the default argument.
Thanks, this looks good now, merging!
For each feature: