autogluon / tabrepo

Apache License 2.0

Ensemble order not invariant with configurations order #16

Closed: geoalgo closed this issue 11 months ago

geoalgo commented 1 year ago

Currently, evaluating an ensemble gives different results depending on the order in which the configurations are presented. This is unexpected to me: ensemble performance should be invariant to the order of the input configurations, since they form a set.

For instance, the following code evaluates two ensembles built from the same configurations, presented in different orders. It outputs:

[[0.00642857]]
[[0.00714286]]

from autogluon_zeroshot.repository.evaluation_repository import load
from autogluon_zeroshot.utils.cache import cache_function
repo = cache_function(lambda: load(version="BAG_D244_F10_C608_FULL"), cache_name="repo")

configs = [
    'ExtraTrees_r19_BAG_L1',
    'LightGBM_r158_BAG_L1',
    'RandomForest_r5_BAG_L1',
    'LightGBM_r118_BAG_L1',
    'LightGBM_r97_BAG_L1',
    'LightGBM_r111_BAG_L1',
    'LightGBM_r71_BAG_L1',
    'NeuralNetFastAI_r82_BAG_L1',
    'NeuralNetFastAI_r25_BAG_L1',
    'NeuralNetFastAI_r145_BAG_L1',
    'NeuralNetFastAI_r128_BAG_L1',
    'NeuralNetFastAI_r121_BAG_L1',
    'NeuralNetFastAI_r173_BAG_L1',
    'CatBoost_r16_BAG_L1',
    'NeuralNetFastAI_r169_BAG_L1',
    'CatBoost_r42_BAG_L1',
    'CatBoost_r93_BAG_L1',
    'CatBoost_r2_BAG_L1',
    'CatBoost_r79_BAG_L1',
    'CatBoost_r57_BAG_L1'
]
# Same task, fold, and ensemble size for both calls; only the config order differs.
common_kwargs = dict(tids=[3704], folds=[8], ensemble_size=50, rank=False)
print(repo.evaluate_ensemble(config_names=configs, **common_kwargs))
print(repo.evaluate_ensemble(config_names=sorted(configs), **common_kwargs))

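The check above compares only two orderings. A minimal sketch of a more general order-invariance test follows. Everything in it is illustrative: `is_order_invariant`, the toy error tables, and the two toy evaluators are hypothetical names, not part of the repository API. The second toy evaluator shows one possible source of order dependence, first-index tie-breaking among equally good candidates. The issue does not state that this is the actual cause.

```python
import random
from typing import Callable, Sequence


def is_order_invariant(
    evaluate: Callable[[Sequence[str]], float],
    configs: Sequence[str],
    n_shuffles: int = 20,
    seed: int = 0,
    tol: float = 1e-12,
) -> bool:
    """Return True if `evaluate` scores every tested ordering the same."""
    rng = random.Random(seed)
    ordered = sorted(configs)
    reference = evaluate(ordered)
    # Always test the reversed order, then a few random permutations.
    candidates = [list(reversed(ordered))]
    for _ in range(n_shuffles):
        shuffled = list(ordered)
        rng.shuffle(shuffled)
        candidates.append(shuffled)
    return all(abs(evaluate(c) - reference) <= tol for c in candidates)


# Toy data (hypothetical): "b" and "c" tie on validation error
# but differ on test error.
TOY_VAL_ERROR = {"a": 0.30, "b": 0.10, "c": 0.10, "d": 0.25}
TOY_TEST_ERROR = {"a": 0.31, "b": 0.12, "c": 0.15, "d": 0.27}


def best_validation_error(configs: Sequence[str]) -> float:
    # Order-invariant: the minimum over a collection ignores ordering.
    return min(TOY_VAL_ERROR[c] for c in configs)


def first_best_test_error(configs: Sequence[str]) -> float:
    # Order-dependent: min() keeps the first of several tied candidates,
    # so the reported test error depends on which tied config comes first.
    best = min(configs, key=lambda c: TOY_VAL_ERROR[c])
    return TOY_TEST_ERROR[best]


print(is_order_invariant(best_validation_error, list(TOY_VAL_ERROR)))
print(is_order_invariant(first_best_test_error, list(TOY_VAL_ERROR)))
```

To apply this to the reproduction script, `evaluate` could wrap `repo.evaluate_ensemble(config_names=list(c), **common_kwargs)` and reduce the returned value to a single float. That wrapper is an assumption about the return type, based only on the printed output shown above.
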
geoalgo commented 11 months ago

Solved with https://github.com/Innixma/autogluon-zeroshot-private/pull/18