automl / ConfigSpace

Domain specific language for configuration spaces in Python/Cython. Useful for hyperparameter optimization and algorithm configuration.
https://automl.github.io/ConfigSpace/

Importing a custom incomplete configuration space? #355

Closed ixeixe closed 2 months ago

ixeixe commented 2 months ago

Hello ConfigSpace developers! I have a question for you, and I would appreciate it if you could help me with that!

from ConfigSpace import ConfigurationSpace

myspace = ConfigurationSpace(
    space={
        "a": [1, 2, 3],  # 3 integers
        "b": [4, 5, 6],  # 3 integers
    }
)

In this example, ConfigSpace treats the configuration space as the Cartesian product of the values of the 2 hyperparameters, so there are 3×3=9 configurations. Suppose that in our project the combination a=1 and b=5 is invalid. That leaves 8 valid configurations: a=1 and b=4; a=1 and b=6; a=2 and b=4; a=2 and b=5; a=2 and b=6; a=3 and b=4; a=3 and b=5; a=3 and b=6. Is there any method or function in ConfigSpace that lets me import these 8 custom configurations directly to form a ConfigurationSpace?

I also have a more complex test case with 20 hyperparameters, each of which is a list of either 4 or 2 integers. The Cartesian product gives more than 200 million configurations, and the constraints and conditions that make configurations invalid are very complex. By manual pruning I have reduced the space from more than 200 million configurations to 2000. I now want to use ConfigSpace to store these 2000 configurations, but I don't know how to import them.
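For the small example above, the 8 valid combinations can be enumerated in plain Python, independently of ConfigSpace. This is only a minimal sketch: the predicate `is_valid` is a hypothetical stand-in for the project's real constraints, and none of this is ConfigSpace API.

```python
from itertools import product

# The same two hyperparameters as in the question.
space = {"a": [1, 2, 3], "b": [4, 5, 6]}


def is_valid(config: dict) -> bool:
    """Hypothetical constraint: the combination a=1, b=5 is not allowed."""
    return not (config["a"] == 1 and config["b"] == 5)


names = list(space)

# Full Cartesian product: one dict per combination of values.
all_configs = [dict(zip(names, values)) for values in product(*space.values())]

# Keep only the combinations that satisfy the constraint.
valid_configs = [config for config in all_configs if is_valid(config)]

print(len(all_configs), len(valid_configs))
for config in valid_configs:
    print(config)
```

Full enumeration like this is only practical when the product is small; for a space of hundreds of millions of combinations the valid set would have to come from the pruning rules themselves.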

eddiebergman commented 2 months ago

Hi @ixeixe,

Unfortunately, there's no simple way to do this, but I can offer a few suggestions:

ixeixe commented 2 months ago

Thank you very much! The ForbiddenClauses feature you suggested is really good! But I ran into a new problem: after adding 36 ForbiddenClauses, sampling with .sample_configuration() sometimes succeeds and sometimes fails, and it can take more than 500 samples to find 1 valid configuration, which is bothering me a bit. May I ask for your advice on this as well?

eddiebergman commented 2 months ago

I'm not sure what exactly you mean by failing, i.e. does it loop endlessly or does it give up? The recent pending PR #346, a major overhaul, makes sampling significantly faster, but at the end of the day it's still done by rejection sampling.

We do no clever inspection of the forbiddens for sampling, as it would make the sampling procedure itself biased.
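To illustrate what rejection sampling means here, the sketch below draws each value uniformly and discards any candidate that hits a forbidden combination. It shows the general technique only and is not ConfigSpace's actual implementation; `sample_with_rejection`, `is_forbidden` and `max_tries` are hypothetical names chosen for this example.

```python
import random


def sample_with_rejection(space, is_forbidden, rng, max_tries=1000):
    """Draw uniform candidates until one is not forbidden, or give up."""
    names = list(space)
    for _ in range(max_tries):
        # Each hyperparameter is sampled independently and uniformly.
        candidate = {name: rng.choice(space[name]) for name in names}
        if not is_forbidden(candidate):
            return candidate
    # With a very small valid fraction, this branch is reached often.
    raise RuntimeError(f"no valid configuration found in {max_tries} tries")


space = {"a": [1, 2, 3], "b": [4, 5, 6]}


def is_forbidden(config):
    # Hypothetical forbidden combination from the question: a=1 and b=5.
    return config["a"] == 1 and config["b"] == 5


rng = random.Random(0)
print(sample_with_rejection(space, is_forbidden, rng))
```

Because rejected candidates are simply thrown away, every valid configuration stays equally likely, which is the unbiasedness mentioned above. The cost is that the expected number of tries grows as the valid fraction of the space shrinks.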

May I ask what your intended use case is? ConfigSpace may not be the right tool for the job, or at the very least it may be overkill for just defining a few categorical configurations. Do you have a finite set of possible configurations, and could you define them programmatically?

ixeixe commented 2 months ago

"Failing" means that no suitable sample is found and the sampler gives up. My test case has 200 million combinations, of which only 2000 remain after pruning, so each draw from the 200 million has only a 0.001% chance of hitting one of those 2000 combinations. As a result, the sampler often gives up because it doesn't find a valid sample. In the end, I hard-coded the configuration space as the 2000 configurations, so the problem has been solved. This is the first time I've asked a question on GitHub, and I'm so grateful and lucky to have received your prompt response and help!
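The arithmetic behind the 0.001% figure, and the alternative of sampling directly from an explicit list, can be sketched as follows. The counts are the rounded figures from this thread, and `valid_configs` is a small hypothetical stand-in for the 2000 hard-coded configurations, which are not shown here.

```python
import random

# Rounded figures from the thread.
n_total = 200_000_000  # size of the full Cartesian product
n_valid = 2_000        # configurations left after pruning

# Probability that one uniform draw from the full product is valid.
acceptance = n_valid / n_total

# Mean number of draws until the first valid one (geometric distribution).
expected_draws = n_total / n_valid

print(f"acceptance probability: {acceptance:.3%}")
print(f"expected draws per valid sample: {expected_draws:,.0f}")

# Hypothetical stand-in for the explicit list of valid configurations.
valid_configs = [
    {"a": 1, "b": 4}, {"a": 1, "b": 6},
    {"a": 2, "b": 4}, {"a": 2, "b": 5}, {"a": 2, "b": 6},
    {"a": 3, "b": 4}, {"a": 3, "b": 5}, {"a": 3, "b": 6},
]

# Sampling from the explicit list never rejects: every draw is valid.
rng = random.Random(0)
config = rng.choice(valid_configs)
print(config)
```

With an explicit list, each sample costs a single draw, whereas rejection sampling over the full product would need on the order of 100,000 draws per valid configuration under these figures.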