DataDriftProfileSection Filtered Results

There should be an option to choose which fields are included in the data drift profile result. At present, we get the following fields for each feature:

"current_small_hist": feature_metrics.current_small_hist,
"ref_small_hist": feature_metrics.ref_small_hist,
"feature_type": feature_metrics.feature_type,
"stattest_name": feature_metrics.stattest_name,
"drift_score": feature_metrics.p_value,
"drift_detected": feature_metrics.drift_detected,

Suppose I don't want current_small_hist and ref_small_hist in my output. I should have an option to skip the calculations for these fields, because they make the data drift output take longer to generate. Consider the following code snippet:

https://github.com/evidentlyai/evidently/blob/0f279d3d908b20d6df47b88bff8800bbdf3d516e/src/evidently/calculations/data_drift.py#L281

ref_counts = feature_ref_data.value_counts(sort=False)
cur_counts = feature_cur_data.value_counts(sort=False)
keys = set(ref_counts.keys()).union(set(cur_counts.keys()))
for key in keys:
    if key not in ref_counts:
        ref_counts.loc[key] = 0
    if key not in cur_counts:
        cur_counts.loc[key] = 0

For a high-cardinality categorical feature with thousands of categories (e.g. zip code, IP address), this loop takes a long time, because it fills in a zero count for every key that is missing from the reference or the current data, one key at a time. That delays the whole process. The result is only used to calculate current_small_hist and ref_small_hist in the data drift profile output. Therefore, we should have an option to skip this calculation.
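To make the request concrete, here is a minimal sketch of what I have in mind. It is not evidently's actual code: the function names `aligned_counts` and `drift_fields` and the `include_histograms` flag are hypothetical, and the histogram values are plain dicts only for illustration. It shows two ideas: aligning the counts with a single `reindex` call instead of the per-key loop, and skipping the histogram work entirely when the caller does not need those fields.

```python
import pandas as pd


def aligned_counts(ref: pd.Series, cur: pd.Series):
    """Count categories in both series over the union of their keys.

    Hypothetical helper: reindex fills missing keys with 0 in one
    vectorized step instead of assigning them one by one in a loop.
    """
    ref_counts = ref.value_counts(sort=False)
    cur_counts = cur.value_counts(sort=False)
    keys = ref_counts.index.union(cur_counts.index)
    return (
        ref_counts.reindex(keys, fill_value=0),
        cur_counts.reindex(keys, fill_value=0),
    )


def drift_fields(ref: pd.Series, cur: pd.Series, include_histograms: bool = True) -> dict:
    """Build a per-feature result, optionally without the histogram fields.

    Hypothetical function: include_histograms is the kind of option this
    issue asks for, not an existing evidently parameter.
    """
    result = {"feature_type": "cat"}
    if include_histograms:
        ref_counts, cur_counts = aligned_counts(ref, cur)
        result["ref_small_hist"] = ref_counts.to_dict()
        result["current_small_hist"] = cur_counts.to_dict()
    return result
```

With `include_histograms=False` the value counts are never computed, so the cost for a high-cardinality feature disappears from that code path. The drift test itself may still need its own counts, so this only covers the histogram fields.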

DataDriftProfileSection Filtered Results #317