MSDLLCpapers / obsidian

Algorithmic process optimization and AI experiment design
https://msdllcpapers.github.io/obsidian/
GNU General Public License v3.0

AttributeError: 'Param_Discrete_Numeric' object has no attribute 'search_categories' #69

Open dhristozov opened 1 month ago

dhristozov commented 1 month ago

Hi,

Thanks for the nice package.

I am encountering issues when trying to use Param_Discrete_Numeric.

If I understand the code correctly, the idea is to treat this parameter as a continuous variable during optimisation and then map each suggested value to the closest of the numeric levels (via the unit_demap method).
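
A minimal sketch of the nearest-level mapping described above, in plain Python. The helper name `snap_to_nearest` is hypothetical and is not obsidian's `unit_demap`; it only illustrates relaxing the parameter to a continuous value and rounding back to an allowed level.

```python
def snap_to_nearest(value, levels):
    """Return the allowed level closest to a continuous suggestion.

    Ties resolve to the level that appears first in `levels`.
    """
    if not levels:
        raise ValueError("levels must not be empty")
    return min(levels, key=lambda level: abs(level - value))


# Temperature levels from the example in this issue: 25, 30, ..., 85
temperature_levels = list(range(25, 86, 5))

# Hypothetical continuous suggestions, as an optimizer might return them
suggestions = [26.4, 58.1, 84.9, 90.0]
snapped = [snap_to_nearest(s, temperature_levels) for s in suggestions]
print(snapped)
```

With this kind of mapping the optimizer would search a single continuous dimension, and the discreteness would only be enforced when a suggestion is converted back to a real setting.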

However, because Param_Discrete_Numeric inherits from Param_Discrete, this seems to be broken: I get the following exception when I try to use Param_Discrete_Numeric.

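# Imports are not shown in the original report; these module paths are an
# assumption and would normally live in an earlier notebook cell.
import pandas as pd
from obsidian import Campaign, ParamSpace, Target
from obsidian.parameters import Param_Categorical, Param_Discrete_Numeric

params = [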
    Param_Categorical("Category", ["Cat-1", "Cat-2", "Cat-3"]),
    Param_Discrete_Numeric("Temperature", list(range(25, 86, 5))),
]
X_space = ParamSpace(params)
target = [
    Target('Desired', aim='max'),
    Target('Undesired', aim='min')
]
campaign = Campaign(X_space, target, seed=42)
X0 = campaign.designer.initialize(4, 'LHS')
Z0 = pd.concat([X0, pd.Series([35.,56.,67.,23.], name="Desired"), pd.Series([60.,48.,27.,70.], name="Undesired")], axis=1)
campaign.add_data(Z0)
campaign.fit()
X_suggest, eval_suggest = campaign.optimizer.suggest(
    acquisition = ['NEHVI', ], m_batch=4
)
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
Cell In[110], line 15
     13 campaign.add_data(Z0)
     14 campaign.fit()
---> 15 X_suggest, eval_suggest = campaign.optimizer.suggest(
     16     acquisition = ['NEHVI', ], m_batch=4
     17 )

File ~/dev/obsidian/obsidian/optimizer/bayesian.py:711, in BayesianOptimizer.suggest(self, m_batch, target, acquisition, optim_sequential, optim_samples, optim_restarts, objective, out_constraints, eq_constraints, ineq_constraints, nleq_constraints, task_index, fixed_var, X_pending, eval_pending)
    708     raise TypeError('Each item in acquisition list must be either a string or a dictionary')
    710 # Compute static variable inputs
--> 711 fixed_features_list = self._fixed_features(fixed_var)
    713 # Set up the sampler, for MC-based optimization of acquisition functions
    714 if not isinstance(model, ModelListGP):

File ~/dev/obsidian/obsidian/optimizer/base.py:114, in Optimizer._fixed_features(self, fixed_var)
    112 for x in self.X_space.X_discrete:
    113     if x.name not in fixed_var.keys():  # Fixed_var should take precedent and lock out other combinations
--> 114         df_i = pd.DataFrame({x.name: x.search_categories})
    115         df_list.append(df_i)
    117 # Merge by cross

AttributeError: 'Param_Discrete_Numeric' object has no attribute 'search_categories'

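
The failing line builds one table per discrete parameter from its `search_categories`, and the following comment in the traceback indicates that these tables are then cross-merged. The sketch below mimics that enumeration with `itertools.product` rather than pandas. The function name `enumerate_discrete_combinations` and the dictionary layout are assumptions made for illustration, not obsidian code, and the sketch assumes that both parameters from the example expose their full list of levels.

```python
from itertools import product


def enumerate_discrete_combinations(search_categories):
    """Return every combination of discrete levels as a list of dicts.

    `search_categories` maps a parameter name to its list of allowed levels.
    """
    names = list(search_categories)
    return [
        dict(zip(names, combo))
        for combo in product(*(search_categories[name] for name in names))
    ]


# Levels taken from the example in this issue
search_categories = {
    "Category": ["Cat-1", "Cat-2", "Cat-3"],
    "Temperature": list(range(25, 86, 5)),
}

combinations = enumerate_discrete_combinations(search_categories)
print(len(combinations))  # 3 categories x 13 temperature levels
print(combinations[0])
```

If each combination is passed on as one entry of `fixed_features_list`, as the traceback suggests, the number of entries grows multiplicatively with every discrete parameter that is added.
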
xuyuting commented 1 month ago

@dhristozov Hi, thank you for identifying this issue! Could you try the updated code (main branch) again and let us know if it works now?

dhristozov commented 1 month ago

@xuyuting, thanks for the quick fix. I can confirm the exception is gone now. However, looking at the code, I am not sure I understand the purpose of the Param_Discrete_Numeric class. With the code above, I now get a warning:

The combinations of discrete features is large at 39. Optimization will proceed very slowly due to the combinatorial explosion. Recommend reducing the number of discrete parameters used.

This means that my Param_Discrete_Numeric is treated as a categorical variable by the optimiser (39 = 3 categories × 13 temperature levels). If that is the case, could you please explain how it differs from Param_Categorical? E.g., how is Param_Categorical("category", ["1", "2"]) different from Param_Discrete_Numeric("num_category", [1, 2]) when it comes to surrogate model building and acquisition optimisation? Thanks!
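
To make the question concrete, the sketch below contrasts two generic ways a surrogate model could see such parameters: one-hot encoding, which carries no order or distance between levels, and min-max scaling of numeric levels, which preserves both. The helper names are hypothetical, and three levels are used instead of the two in the question so that ordering is visible. Whether obsidian encodes Param_Categorical and Param_Discrete_Numeric this way is exactly what is being asked and is not confirmed here.

```python
def one_hot(value, levels):
    """Encode `value` as a 0/1 vector with a single 1 at its position in `levels`."""
    if value not in levels:
        raise ValueError(f"{value!r} is not one of {levels!r}")
    return [1.0 if level == value else 0.0 for level in levels]


def min_max_scale(value, levels):
    """Map a numeric `value` onto [0, 1] using the smallest and largest level."""
    low, high = min(levels), max(levels)
    if high == low:
        return 0.0
    return (value - low) / (high - low)


categorical_levels = ["1", "2", "3"]
numeric_levels = [1, 2, 3]

# One-hot: every pair of distinct levels is equally far apart
encoded_first = one_hot("1", categorical_levels)
encoded_last = one_hot("3", categorical_levels)
print(encoded_first, encoded_last)

# Scaled numeric: level 2 sits halfway between levels 1 and 3
scaled = [min_max_scale(v, numeric_levels) for v in numeric_levels]
print(scaled)
```

Under the numeric encoding a model can interpolate between neighbouring levels, which is only an advantage if the acquisition step also treats the dimension as ordered rather than enumerating it.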