Closed: jhogg11 closed this issue 1 year ago
@jhogg11 There is no natural way to handle discrete optimization in mealpy. All of the metaheuristic algorithms in this library are designed for continuous problems; implementing them natively for discrete problems would destroy the algorithms' original design. To deal with discrete problems, the trick is to use amend_position() to correct the solution for your problem.
First, you should always have: solution = np.clip(solution, lb, ub). That brings your solution back inside the boundaries. Then do whatever you want to make it satisfy your problem. For example, in a feature selection problem you can use rounding, an s-shaped or u-shaped transfer function, or whatever you want. For a hyper-parameter tuning problem with mixed variables, you can do it too. Check out this example: https://github.com/thieu1995/mealpy/blob/master/examples/applications/sklearn/svm_classification.py
Don't correct the solution in fit_func(); that is not its job. This function's only mission is to return the fitness value. If you fix your solution in that function, it can still return the fitness value, but in the next iteration the stored solution is still real-valued, not the discrete value you want. You have to correct the solution in the amend_position() function, so that in later iterations the old solution used to update the current solution is already a discrete value.
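As a concrete sketch (not code from the library; the `(solution, lb, ub)` argument order and the choice of which dimensions are integers are illustrative assumptions, so check your mealpy version's `amend_position` signature), a clip-then-round correction could look like:

```python
import numpy as np

def amend_position(solution, lb, ub):
    # First, bring the solution back inside the boundaries.
    solution = np.clip(solution, lb, ub)
    # Then coerce the discrete dimensions; here dimensions 0 and 1
    # are assumed to be integer-valued (an illustrative choice).
    solution[0] = round(solution[0])
    solution[1] = round(solution[1])
    return solution
```

Because this runs in `amend_position`, the rounded values are what the algorithm stores and reuses in the next iteration.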
Some repos you should take a look at:
- https://github.com/thieu1995/mealpy-timeseries
- https://github.com/thieu1995/mealpy-text-classification
- https://github.com/thieu1995/mealpy-classification
- https://github.com/thieu1995/mealpy-feature-selection
- https://github.com/thieu1995/MHA-TSP
This makes sense, but it looks like both the SVM classification and feature selection examples you linked address categorical features and feature selection, respectively, in the fitness function. I see how integer features could be addressed in the Problem methods, but I would think categorical features have to be addressed in the fitness function.
So does it really matter which is used? I would think that changing the values in the fitness function might lead to different search behavior, but it's probably hard to say whether that is disadvantageous without considering a specific problem.
I had previously created this class (updated with np.flatnonzero, which I didn't know about; thanks for that):

```python
import numpy as np

class FeatureSelector:
    def __init__(self, selected_features):
        self.selected_indices = np.flatnonzero(selected_features)

    def fit(self, X, y):
        return self

    def transform(self, X):
        return np.asarray(X)[:, self.selected_indices]
```
which could be used with an sklearn Pipeline, similar to this:

```python
from sklearn.ensemble import RandomForestRegressor
from sklearn.pipeline import Pipeline

selected_features = [1, 1, 0, 1, 0, 0, 0, 1, ...]  # binary mask (truncated)
model = Pipeline([
    ('selector', FeatureSelector(selected_features)),
    ('model', RandomForestRegressor()),
])
```
So given that I'm addressing the filtering out of features through other means, I think this would work as a general solution for a mix of integers and floats. Let me know if you agree.
```python
import numpy as np
from mealpy import Problem  # import path may vary by mealpy version

class MixedTypeProblem(Problem):
    def __init__(self, lb=None, ub=None, types=None, minmax="min", **kwargs):
        super().__init__(lb=lb, ub=ub, types=types, minmax=minmax, **kwargs)
        self.types = types

    def generate_position(self, lb=None, ub=None):
        position = []
        for i, t in enumerate(self.types):
            if t is int:
                p = np.random.randint(lb[i], ub[i] + 1)
            else:
                # defaults to float
                p = np.random.uniform(lb[i], ub[i])
            position.append(p)
        return np.asarray(position)
```
Mixed types:

```python
problem_dict = {
    "fit_func": lambda x: x.sum(),  # dummy function
    "lb": [0, 0, -1, 1],
    "ub": [1, 1, 1, 20],
    "types": [int, float, float, int],
}
MixedTypeProblem(**problem_dict)
```
Feature selection only:

```python
n_features = 10
problem_dict = {
    "fit_func": lambda x: x.sum(),  # dummy function
    "lb": [0] * n_features,
    "ub": [1] * n_features,
    "types": [int] * n_features,
}
MixedTypeProblem(**problem_dict)
```
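The per-dimension sampling logic in generate_position can be checked without mealpy; here is a standard-library-only sketch of the same idea (the random module standing in for np.random):

```python
import random

def generate_position(lb, ub, types):
    """Sample each dimension according to its declared type."""
    position = []
    for low, high, t in zip(lb, ub, types):
        if t is int:
            # randint is inclusive on both ends, matching
            # np.random.randint(lb[i], ub[i] + 1) above
            position.append(random.randint(low, high))
        else:
            position.append(random.uniform(low, high))
    return position

pos = generate_position([0, 0, -1, 1], [1, 1, 1, 20], [int, float, float, int])
# pos[0] and pos[3] are ints; pos[1] and pos[2] are floats
```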
A couple of unrelated questions: I've used skopt some but haven't really made great use of it yet (there seems to be a learning curve with these algorithms). Just wondering if you have any thoughts on how something like FPA or evolution-based algorithms compares to Bayesian optimization.

This looks like a really incredible library. I'm sure it will get a lot of recognition.
Hi @jhogg11 ,
1) As I said in the previous reply (for discrete variables):

(1) If you update the solution's position in the amend_position() function, the updated value is saved and used in the next iterations.
(2) If you update the solution's position in the fit_func() function, the updated value is not saved; it is used only for calculating fitness.
For example:

```
X1(iter=4) = 0.732
X1(iter=5) = X1(iter=4) + random
```

In case (1): X1(iter=5) = 1 + random
In case (2): X1(iter=5) = 0.732 + random
So you see, X1 in later iterations is affected by the previous X1. Of course, this leads to different search behavior; how much that matters really depends on the specific problem.
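This divergence can be simulated in a few lines; the additive update below is a stand-in for a real algorithm's update equation, not mealpy code:

```python
import random

random.seed(0)

def update(x):
    # Stand-in for any metaheuristic update equation.
    return x + random.uniform(-0.5, 0.5)

x_amend = 0.732    # case (1): rounded in amend_position(), so the
x_fitfunc = 0.732  # rounded value is what gets stored and updated
for _ in range(5):
    x_amend = round(update(x_amend))
    # case (2): the stored value stays continuous; rounding happens
    # only transiently inside fit_func() at evaluation time
    x_fitfunc = update(x_fitfunc)
    fitness_input = round(x_fitfunc)  # used for fitness, then discarded

# the two trajectories diverge: x_amend walks on integers,
# x_fitfunc drifts through real values
```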
2) For the MixedTypeProblem(Problem) class you wrote above: it seems like this class solves the problem, but it does not. It only handles the initialization phase, not the update phase of each algorithm. For example, look at the update equations in Whale Optimization: https://github.com/thieu1995/mealpy/blob/master/mealpy/swarm_based/WOA.py#L78

The update still involves float-valued coefficients such as a, r, A, and C, so the resulting positions will also be floats, not integers.
You would have to re-design the update equations to solve a discrete problem directly; see, for example, https://www.cs.montana.edu/sheppard/pubs/gecco-2016a.pdf. That is why I said previously that metaheuristic algorithms are designed for continuous domains.
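For a flavor of how binary variants are usually built, the "s-shape" mapping mentioned earlier passes the continuous value through a sigmoid and samples a bit from the resulting probability (a generic sketch of the transfer-function idea, not mealpy code):

```python
import math
import random

random.seed(1)

def s_shape_binarize(x):
    """Map a continuous position value to {0, 1} via a sigmoid,
    as in binary PSO-style transfer functions."""
    prob = 1.0 / (1.0 + math.exp(-x))  # sigmoid in (0, 1)
    return 1 if random.random() < prob else 0

bits = [s_shape_binarize(v) for v in [-6.0, -0.5, 0.0, 0.5, 6.0]]
# strongly negative values map to 0 with high probability,
# strongly positive values to 1
```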
3) I always try Whale Optimization first before thinking about other algorithms, because it is easy and has only 2 parameters. However, if you want better results for your problem, it would help to try some stronger algorithms. You can try the algorithms categorized as good (in the Performance column) in this table: https://mealpy.readthedocs.io/en/v2.4.1/pages/support.html#classification-table
4) My thoughts on Bayesian Optimization (BO): it belongs to the family of approximation algorithms and is quite similar to a Meta-Heuristic Algorithm (MHA), but based on Bayesian theory (both are global optimization algorithms). An MHA's fitness function can be considered the surrogate function in BO, and the updating phase plus amend_position() in MHAs can be considered the acquisition function in BO. But you know, people are more into complex and mathematical algorithms; that makes it easier for them to produce papers. (Ref: https://www.youtube.com/watch?v=ECNU4WIuhSE)
Thanks, Whale Optimization does seem effective. I tried a few (FPA, GWO, CGO, etc.) based on what I could google about which are highly effective, but I have a very slow objective function. If you have any other suggestions, I'll try those as well.

Also, is it possible to continue optimizing with an existing optimizer? I tried calling solve() multiple times, and it seems to start over when called more than once.
@jhogg11,

You can try SHADE and L-SHADE in the DE module.

Currently, we don't support that kind of training process. Every time you call the solve() function, it resets the algorithm.
I see the example here for discrete optimization (although it did not work for me as written): https://mealpy.readthedocs.io/en/latest/pages/general/advance_guide.html#discrete-optimization

If I'm understanding it correctly, amend_position in the example only works with integers. What if both integers and real values are needed? For example, feature selection and hyperparameter tuning could be done simultaneously: feature selection would require binary values, while hyperparameters could require mixed values. I guess integers could be coerced from real values in the fitness function, but I'm wondering if there's a way to do it naturally with the library.