Closed by simonprovost 1 year ago
I am not familiar with how to use SMAC specifically. Based on the example in the readme, you probably want to convert the GAMA search space to a ConfigSpace search space, and have some way to convert between a Configuration and an Individual (assuming there is no way to make SMAC understand GAMA's "language" for search space and individuals). Then you may hopefully be able to use most tools out of the box. Extending the Search hyperparameters to also receive the search space is something I would support.
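To make the two conversions above concrete, here is a minimal, library-free sketch. It assumes a simplified search space shaped as estimator name -> hyperparameter name -> list of allowed values; that shape, the `:` naming convention, and the function names `flatten_search_space` and `configuration_to_pipeline` are all hypothetical and are not GAMA, ConfigSpace, or SMAC APIs. Plain dictionaries stand in for both the Configuration and the Individual.

```python
from typing import Any, Dict, List

# Hypothetical, simplified stand-in for a GAMA-style search space:
# estimator name -> {hyperparameter name -> list of allowed values}.
SearchSpace = Dict[str, Dict[str, List[Any]]]

SEPARATOR = ":"  # assumed naming convention, not required by GAMA or SMAC


def flatten_search_space(space: SearchSpace) -> Dict[str, List[Any]]:
    """Flatten the nested space into uniquely named categorical domains."""
    # One top-level choice selects the estimator itself.
    flat: Dict[str, List[Any]] = {"estimator": sorted(space)}
    for estimator, params in space.items():
        for name, values in params.items():
            # Prefix each hyperparameter with its estimator to avoid name clashes.
            flat[f"{estimator}{SEPARATOR}{name}"] = list(values)
    return flat


def configuration_to_pipeline(config: Dict[str, Any]) -> Dict[str, Any]:
    """Turn a flat configuration back into an estimator and its own hyperparameters."""
    estimator = config["estimator"]
    prefix = f"{estimator}{SEPARATOR}"
    # Keep only the hyperparameters that belong to the chosen estimator.
    params = {
        key[len(prefix):]: value
        for key, value in config.items()
        if key.startswith(prefix)
    }
    return {"estimator": estimator, "hyperparameters": params}


space: SearchSpace = {
    "GaussianNB": {},
    "DecisionTreeClassifier": {"max_depth": [2, 4, 8], "criterion": ["gini", "entropy"]},
}
flat = flatten_search_space(space)

# A flat configuration of the kind an optimiser might propose.
config = {
    "estimator": "DecisionTreeClassifier",
    "DecisionTreeClassifier:max_depth": 4,
    "DecisionTreeClassifier:criterion": "gini",
}
pipeline = configuration_to_pipeline(config)
```

In a real integration, the flat domains would presumably become ConfigSpace hyperparameters, with conditions so that an estimator's hyperparameters are only active when that estimator is selected, and the resulting dictionary would be turned into an actual Individual.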
If this doesn't answer your question, please be more concrete. On my end, don't worry too much about the timeline. Unfortunately, I don't have much time in June.
As pointed out by @PGijsbers, the discussion continues in thread #202, because the current thread was primarily intended to examine how to approach the situation in a general sense and was insufficiently specific. There is no need to continue this discussion here, so I will close this issue now. For newcomers: #202 is the feature proposal.
Hi @PGijsbers,
I hope all is well with you. First and foremost, I would like to apologise for the recent spamming. I am just trying to do things correctly so that I can hopefully submit pull requests to enhance GAMA during the summer.
Now that my design can, thanks to GAMA and a few adjustments, run end to end with RandomSearch or ASHA as the search algorithm, and is compatible with both the default BestFitPostProcessing and (after a few adjustments on our end) Ensembling, I am pleased to proceed with the design of the final block: a search algorithm implementation based on Bayesian Optimisation.
Considering that SMAC3 utilises a random forest surrogate model, which fits our current design, I have identified it as the most suitable library for this purpose. Moreover, the flexibility SMAC3 offers through its many configuration options aligns well with GAMA's goal of being as versatile as possible, as its name suggests.
However, I find myself unsure of how to approach this implementation. In GAMA's context, we deal with a collection of individuals, each representing a pipeline. The logical step would be to optimise the pipeline, in line with what the other search algorithms do. However, I am specifically uncertain about how to integrate Bayesian Optimisation into GAMA's design while maintaining GAMA's overarching vision. I hope to approach your team with a potential pull request over the summer to augment GAMA's search methods, so I want to ensure that my approach fits the established structure. Note that I intend to follow the documentation's guidance on implementing a custom search algorithm.
I would greatly appreciate any advice or suggestions you could provide regarding this issue. I intend to complete the proof of concept for this component by the end of June at the latest (i.e., it may not be as flexible as possible, but it will demonstrate its viability; enhancements to make the API more flexible would be planned for the summer).
Note that, overall, this question-based issue mainly serves to propose a design that could potentially be incorporated into future versions of GAMA.
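One way to picture the integration question is as an ask-and-tell loop: an optimiser proposes a configuration, the configuration is evaluated as a pipeline, and the score is reported back. The sketch below only shows that loop shape. `RandomAskTell` is a hypothetical stand-in that samples uniformly at random; it is not Bayesian Optimisation and is not SMAC3's or GAMA's API. `toy_evaluate` is a placeholder for fitting and scoring a real pipeline.

```python
import random
from typing import Any, Callable, Dict, List, Optional, Tuple

Config = Dict[str, Any]


class RandomAskTell:
    """Hypothetical stand-in for a model-based optimiser; it samples uniformly."""

    def __init__(self, domains: Dict[str, List[Any]], seed: int = 0) -> None:
        self._domains = domains
        self._rng = random.Random(seed)
        self.history: List[Tuple[Config, float]] = []

    def ask(self) -> Config:
        # A Bayesian optimiser would consult its surrogate model here.
        return {name: self._rng.choice(values) for name, values in self._domains.items()}

    def tell(self, config: Config, score: float) -> None:
        # A Bayesian optimiser would refit its surrogate on this observation.
        self.history.append((config, score))


def run_search(
    optimiser: RandomAskTell,
    evaluate: Callable[[Config], float],
    n_trials: int,
) -> Tuple[Optional[Config], float]:
    """Run the ask/evaluate/tell loop and return the best configuration seen."""
    best_config: Optional[Config] = None
    best_score = float("-inf")
    for _ in range(n_trials):
        config = optimiser.ask()
        score = evaluate(config)  # higher is better in this sketch
        optimiser.tell(config, score)
        if score > best_score:
            best_config, best_score = config, score
    return best_config, best_score


def toy_evaluate(config: Config) -> float:
    # Placeholder objective: pretends that max_depth == 4 is ideal.
    return -abs(config["max_depth"] - 4)


optimiser = RandomAskTell({"max_depth": [2, 4, 8, 16]}, seed=0)
best_config, best_score = run_search(optimiser, toy_evaluate, n_trials=20)
```

Under this framing, a search method would mainly need to translate each proposed configuration into an individual before evaluation and feed the resulting score back to the optimiser.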
In the meantime, I am extremely appreciative of your assistance to date; your insights have been invaluable.
Looking forward to your response,
SMAC3: https://github.com/automl/SMAC3
Best wishes,