automl / ConfigSpace

Domain specific language for configuration spaces in Python. Useful for hyperparameter optimization and algorithm configuration.
https://automl.github.io/ConfigSpace/

Create easy way to remove hyperparameters #267

Open eddiebergman opened 2 years ago

eddiebergman commented 2 years ago

YAHPOBench has the task id in its configuration as a hyperparameter, which is not something you would want any optimizer to know about. There should be an easy way to remove a hyperparameter by name.

If this violates some conditional where the removed hyperparameter is what is being conditioned on, i.e. a is active when b is x and we remove b from the space, then we need to remove this condition too. Likewise for forbidden clauses.

Currently I am just removing the task id manually and updating the cache, as it's not a dependent of anything.
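
To make the cascade described above concrete, here is a minimal sketch on a toy model. `ToySpace`, `Condition` and `Forbidden` are hypothetical stand-ins invented for illustration, not ConfigSpace classes; the sketch only shows that removing a hyperparameter should also drop every condition and forbidden clause that mentions it.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Condition:
    """Hypothetical: `child` is active only when `parent` takes `value`."""
    child: str
    parent: str
    value: object


@dataclass(frozen=True)
class Forbidden:
    """Hypothetical: the listed (name, value) pairs may not occur together."""
    assignments: tuple


@dataclass
class ToySpace:
    """Hypothetical stand-in for a configuration space."""
    hyperparameters: dict = field(default_factory=dict)
    conditions: list = field(default_factory=list)
    forbiddens: list = field(default_factory=list)


def remove_hyperparameter(space: ToySpace, name: str) -> ToySpace:
    """Return a new space without `name` and without any clause mentioning it."""
    if name not in space.hyperparameters:
        raise ValueError(f"{name} not in space")
    return ToySpace(
        hyperparameters={k: v for k, v in space.hyperparameters.items() if k != name},
        # Drop conditions where `name` is either the parent or the child
        conditions=[c for c in space.conditions if name not in (c.child, c.parent)],
        # Drop forbidden clauses that constrain `name`
        forbiddens=[
            f for f in space.forbiddens
            if all(n != name for n, _ in f.assignments)
        ],
    )


space = ToySpace(
    hyperparameters={"a": [0, 1], "b": ["x", "y"], "task_id": [3, 7]},
    conditions=[Condition(child="a", parent="b", value="x")],
    forbiddens=[Forbidden(assignments=(("a", 1), ("task_id", 7)))],
)
without_b = remove_hyperparameter(space, "b")
without_task_id = remove_hyperparameter(space, "task_id")
```

Note that after removing `b`, the hyperparameter `a` loses its only condition and becomes unconditionally active, which changes the meaning of the space; whether that is acceptable depends on the use case.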

eddiebergman commented 2 years ago

Current hack:

from ConfigSpace import ConfigurationSpace


def remove_hyperparameter(name: str, space: ConfigurationSpace) -> None:
    """Removes a hyperparameter from a configuration space

    Essentially undoes the operations done by adding a hyperparameter
    and then runs the same validation checks that are done in ConfigSpace

    NOTE
    ----
    * Doesn't account for conditionals

    Parameters
    ----------
    name : str
        The name of the hyperparameter to remove

    space : ConfigurationSpace
        The space to remove it from
    """
    if name not in space._hyperparameters:
        raise ValueError(f"{name} not in {space}")

    assert name not in space._conditionals, "Can't handle conditionals"

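    # Fail if any forbidden clause refers to this hyperparameter
    # (assumes simple clauses that expose `.hyperparameter`)
    assert not any(
        name == f.hyperparameter.name for f in space.get_forbiddens()
    ), "Can't handle forbiddens"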

    # No idea what this is really for
    root = "__HPOlib_configuration_space_root__"

    # Remove it from children
    if root in space._children and name in space._children[root]:
        del space._children[root][name]

    # Remove it from parents
    if root in space._parents and name in space._parents[root]:
        del space._parents[root][name]

    # Remove it from indices
    if name in space._hyperparameter_idx:
        del space._hyperparameter_idx[name]

        # We re-enumerate the dict
        space._hyperparameter_idx = {
            name: idx for idx, name in enumerate(space._hyperparameter_idx)
        }

    # Finally, remove it from the known hyperparameters
    del space._hyperparameters[name]

    # Update the same way `add_hyperparameter()` does after adding
    space._update_cache()
    space._check_default_configuration()  # TODO: Get sporadic failures here?
    space._sort_hyperparameters()

    return
mfeurer commented 2 years ago

Hey, this would also require removing it from conditions and forbiddens, or removing the conditions and forbiddens that reference the hyperparameter.
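
The two options mentioned here behave differently for a forbidden clause that spans several hyperparameters. The following is a rough sketch on plain dicts (each dict is a hypothetical forbidden conjunction of name to value assignments, not a ConfigSpace object): stripping the removed name keeps a narrower clause and makes the space stricter, while dropping the whole clause makes it looser.

```python
def strip_from_clause(clause: dict, name: str) -> dict:
    """Option 1: keep the clause but delete the removed hyperparameter from it."""
    return {k: v for k, v in clause.items() if k != name}


def prune_forbiddens(clauses, name, *, drop_whole_clause):
    """Apply one of the two options to every clause that mentions `name`."""
    result = []
    for clause in clauses:
        if name not in clause:
            result.append(dict(clause))  # clause is unaffected
        elif not drop_whole_clause:
            reduced = strip_from_clause(clause, name)
            if reduced:  # an empty clause would forbid everything, so discard it
                result.append(reduced)
        # Option 2: the clause is dropped entirely, so nothing is appended
    return result


# Forbidden: (a == 1 and b == 2), (c == 3), (b == 5)
clauses = [{"a": 1, "b": 2}, {"c": 3}, {"b": 5}]

stricter = prune_forbiddens(clauses, "b", drop_whole_clause=False)
looser = prune_forbiddens(clauses, "b", drop_whole_clause=True)
```

With stripping, `a == 1` ends up forbidden on its own, even though the original space allowed it whenever `b != 2`. Dropping the clause never forbids anything that was previously allowed, which is probably the safer default.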

eddiebergman commented 2 years ago

True, good spot and thank you!

eddiebergman commented 2 years ago

I just added the following two assertions for now until I can figure out how to do it properly:

    assert name not in space._conditionals, "Can't handle conditionals"

    assert not any(
        name == f.hyperparameter.name for f in space.get_forbiddens()
    ), "Can't handle forbiddens"
eddiebergman commented 2 years ago

Turns out the above hack is stochastic? The _check_default_configuration call seems to fail randomly about a quarter of the time. I just get the error:

mfpbench/yahpo/benchmark.py:94: in __init__
    remove_hyperparameter("OpenML_task_id", space)
mfpbench/util.py:79: in remove_hyperparameter
    space._check_default_configuration()
ConfigSpace/configuration_space.pyx:1068: in ConfigSpace.configuration_space.ConfigurationSpace._check_default_configuration
    ???
ConfigSpace/configuration_space.pyx:1465: in ConfigSpace.configuration_space.Configuration.__init__
    ???
ConfigSpace/configuration_space.pyx:1495: in ConfigSpace.configuration_space.Configuration.is_valid_configuration
    ???
ConfigSpace/c_util.pyx:38: in ConfigSpace.c_util.check_configuration
    ???
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

>   ???
E   IndexError: index 7 is out of bounds for axis 0 with size 7

I think I will go with manually copying over everything except the one param I want removed.
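
The thread does not establish the cause of this failure. As a generic illustration only, the sketch below shows how an error of this shape can arise when a name-to-index mapping is left stale after a removal. The names and structures are hypothetical and this is not a diagnosis of ConfigSpace internals.

```python
def lookup(vector, index_of, name):
    """Read the slot that the mapping assigns to `name`."""
    return vector[index_of[name]]


names = [f"hp{i}" for i in range(8)]            # eight hyperparameters
index_of = {n: i for i, n in enumerate(names)}  # name -> position in the vector

# Remove the first entry but forget to renumber the remaining ones
del index_of["hp0"]
vector = [0.0] * len(index_of)                  # the vector now has seven slots

try:
    lookup(vector, index_of, "hp7")             # stale index 7 into a size-7 vector
    stale_failed = False
except IndexError:
    stale_failed = True

# Renumbering makes the indices contiguous again
index_of = {n: i for i, n in enumerate(index_of)}
fixed_value = lookup(vector, index_of, "hp7")
```

Building a fresh space from the remaining hyperparameters sidesteps this class of problem, since every internal index is then constructed from scratch.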

eddiebergman commented 2 years ago

The copy approach for the record:

from copy import copy

import numpy as np
from ConfigSpace import ConfigurationSpace


def remove_hyperparameter(name: str, space: ConfigurationSpace) -> ConfigurationSpace:
    """A new configuration space with the hyperparameter removed

    Essentially copies the hyperparameters over and fails if there are
    conditionals or forbiddens
    """
    if name not in space._hyperparameters:
        raise ValueError(f"{name} not in {space}")

    # Copying conditionals only works on objects and not named entities.
    # Seeing as we copy objects and don't use the originals, transferring these
    # to the new objects is a bit tedious; possible but not required at this time
    # ... same goes for forbiddens
    assert name not in space._conditionals, "Can't handle conditionals"
    # Fail if any forbidden clause refers to this hyperparameter
    assert not any(
        name == f.hyperparameter.name for f in space.get_forbiddens()
    ), "Can't handle forbiddens"

    hps = [copy(hp) for hp in space.get_hyperparameters() if hp.name != name]

    if isinstance(space.random, np.random.RandomState):
        new_seed = space.random.randint(2 ** 32 - 1)
    else:
        new_seed = copy(space.random)

    new_space = ConfigurationSpace(
        # TODO: not sure if this will have implications, assuming not
        seed=new_seed,
        name=copy(space.name),
        meta=copy(space.meta),
    )
    new_space.add_hyperparameters(hps)
    return new_space
simonprovost commented 1 year ago

+1 for this feature handling removing conditions and stuff like that!