automl / neps

Neural Pipeline Search (NePS): Helps deep learning experts find the best neural pipeline.
https://automl.github.io/neps/
Apache License 2.0

Treating Ordinals as purely Categorical may make optimizers weaker than they should be #3

Open eddiebergman opened 1 year ago

eddiebergman commented 1 year ago

See these lines, which convert ConfigSpace spaces to a NePS space: https://github.com/automl/neps/blob/8b2d49887364d49e05c21f9f3899f064e7e4fafa/neps/search_spaces/search_space.py#L33-L42

Treating small ordinals like ["small", "medium", "large"] as plain categoricals may be fine, but for tabular benchmarks it may be an issue.

For example, consider a benchmark that only has tabular entries for the hyperparameters x and y, i.e.

x = Ordinal([1, 1.5, 4.5, 16, 32.354, ..., 100])
y = Ordinal(["small", "medium", "large"])

An optimizer that takes their order into account, e.g. SMAC, should in theory outperform one that doesn't.

One hacky solution is to convert each ordinal to an integer representation whose values act as indices into the ordered list, so that the order information is preserved?
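A minimal sketch of that index idea, in plain Python and independent of the NePS or ConfigSpace APIs. `OrdinalIndexEncoder` is a hypothetical helper name, not something that exists in either library, and the `x` values are the ones from the example above with the elided entries left out. An optimizer would search over the integer range given by `bounds` and decode the index back to the original value before evaluating.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Any, Sequence


@dataclass(frozen=True)
class OrdinalIndexEncoder:
    """Hypothetical helper: maps ordered values to integer indices and back."""

    values: Sequence[Any]  # must be listed in their intended order

    @property
    def bounds(self) -> tuple[int, int]:
        # Inclusive integer range an optimizer could search over.
        return 0, len(self.values) - 1

    def encode(self, value: Any) -> int:
        # Position in the ordered list; raises ValueError for unknown values.
        return list(self.values).index(value)

    def decode(self, index: int) -> Any:
        lower, upper = self.bounds
        if not lower <= index <= upper:
            raise IndexError(f"index {index} outside [{lower}, {upper}]")
        return self.values[index]


# Values from the example above; the elided entries are simply omitted here.
x = OrdinalIndexEncoder([1, 1.5, 4.5, 16, 32.354, 100])
y = OrdinalIndexEncoder(["small", "medium", "large"])
```

Note that this keeps only the order, not the spacing: 16 and 32.354 end up one step apart, just like 1 and 1.5. Whether that is acceptable probably depends on the benchmark.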