anyoptimization / pymoo

NSGA2, NSGA3, R-NSGA3, MOEAD, Genetic Algorithms (GA), Differential Evolution (DE), CMAES, PSO
https://pymoo.org
Apache License 2.0

Problem with the result when running vectorized #372

Closed davide-q closed 1 year ago

davide-q commented 1 year ago

The following code, run vectorized (elementwise=False), produces incorrect values in the Result object. I know the example is unorthodox; I am only trying to make it as simple and as demonstrative as possible, so please bear with me.

import numpy as np
from pymoo.core.problem import Problem
import subprocess

tot_var = 2
class MyProblem(Problem):

    def __init__(self, **kwargs):
        super().__init__(n_var=tot_var,
                         n_obj=3,
                         n_ieq_constr=0,
                         xl=np.array([0]*tot_var),
                         xu=np.array([1]*tot_var),
                         elementwise=False,
                         **kwargs,
                        )
        self.a = 1

    def _evaluate(self, x, out, *args, **kwargs):
        return self.vanilla_looper(x, out, *args, **kwargs)

    def vanilla_looper(self, x, out, *args, **kwargs):
        out["F"] = []
        if len(np.array(x).shape) < 2:
            out["F"].append(self.vanilla(x))
        else:
            for xi in x:
                out["F"].append(self.vanilla(xi))
        print(np.array(x).shape)
        print(x)
        print("------------")
        print(np.array(out["F"]).shape)
        print(out)

    def vanilla(self, x):
        f1 = 0
        f2 = 0
        prev = 0
        prev_prev = 0
        for xi in x:
            f1 += -xi * xi
            f2 += -(xi - prev) * (xi - prev)
            prev_prev = prev
            prev = xi

        c = self.a
        self.a += 1
        return [f1, f2, c]

from pymoo.algorithms.moo.nsga2 import NSGA2
from pymoo.operators.crossover.sbx import SBX
from pymoo.operators.mutation.pm import PM
from pymoo.operators.sampling.rnd import FloatRandomSampling

algorithm = NSGA2(
    pop_size=10,
    n_offsprings=5,
    sampling=FloatRandomSampling(),
    crossover=SBX(prob=0.9, eta=15),
    mutation=PM(eta=20),
    eliminate_duplicates=True
)

from pymoo.termination import get_termination
termination = get_termination("n_gen", 2)

problem = MyProblem()
from pymoo.optimize import minimize
res = minimize(problem,
               algorithm,
               termination,
               seed=15,
               save_history=True,
               verbose=True,
              )

X = res.X
F = res.F
print("F", F)

As you can see below, res.F does not have integers in its third column as it should; its values are all over the place and do not match any individual evaluated in _evaluate(). With elementwise=True, each row of res.F correctly has an integer as its third element and matches an individual printed earlier during the run.

$ python pymoo_example.py
(10, 2)
[[0.8488177  0.17889592]
 [0.05436321 0.36153845]
 [0.27540093 0.53000022]
 [0.30591892 0.30447436]
 [0.11174128 0.24989901]
 [0.9176299  0.26414685]
 [0.71777369 0.86571503]
 [0.80707948 0.21055058]
 [0.16724303 0.04670639]
 [0.03942231 0.20023081]]
------------
(10, 3)
{'F': [[-0.752495235149672, -1.1692866642616555, 1], [-0.13366540706961388, -0.09731198208992434, 2], [-0.3567459098664754, -0.14066647314813482, 3], [-0.18629101838736462, -0.09358846974875643, 4], [-0.07493562993960418, -0.03157367328898142, 5], [-0.9118181894871994, -1.269084718961254, 6], [-1.2646615868522457, -0.5370857084977069, 7], [-0.6957088385194358, -1.0072240187034014, 8], [-0.030151718594700706, -0.04249931309883876, 9], [-0.04164649584764699, -0.027413491699102382, 10]], 'G': None, 'H': None}
==========================================================
n_gen  |  n_eval  | n_nds  |      eps      |   indicator
==========================================================
     1 |       10 |      3 |             - |             -
(5, 2)
[[0.04595573 0.40256219]
 [0.28639973 0.24133409]
 [0.35003049 0.37363382]
 [0.68997055 0.82468418]
 [0.84878927 0.51602767]]
------------
(5, 3)
{'F': [[-0.1641682458249522, -0.12928009311806177, 11], [-0.14026695286876958, -0.08405571985151775, 12], [-0.26212357096942057, -0.12307845860620242, 13], [-1.1561633592017773, -0.49420712255107857, 14], [-0.9867277915171095, -0.8311735170627691, 15]], 'G': None, 'H': None}
     2 |       15 |      4 |  0.1198976524 |             f
F [[-1.26466159 -0.69570884 -0.03015172]
 [-0.0416465  -1.16928666 -0.09731198]
 [-1.26908472 -0.53708571 -1.00722402]
 [-1.15616336 -0.98672779 -0.12928009]]

Am I doing something completely wrong here, or is this a bug?

blankjul commented 1 year ago

I am not sure I follow. Why should res.F only have integers in your example? What result do you expect?

davide-q commented 1 year ago

My actual problem (not shown here) is very complicated and works fine without vectorization. With vectorization, res.F returns what looks to me like random values rather than the values I expect. Since my actual problem is too complicated to debug directly, I made this simple, albeit artificial, example. Like my actual problem, it works as expected in the non-vectorized case and fails in the vectorized case.

In this example, I have three objectives, which I return as f1, f2 and c. The last one is always an integer, which is obvious from the code and confirmed by printing out at the end of each _evaluate call (at the end of vanilla_looper), as quoted above. This is the key point. Is it clear? If not, I can go line by line through the code and the corresponding output.

Since the third objective is always returned as an integer, the third objective of every individual in the population must be an integer as well. In particular, this should hold for the final population, which is the one returned by the Result object, if my understanding is correct. This is indeed what happens if I run the code above with elementwise changed to True. Leaving it False, as above, I get the results I mentioned: the third objective is not an integer, which demonstrates the "random stuff" returned by res.F when running without elementwise.

Is the problem clear now?

Thanks for taking a look

davide-q commented 1 year ago

Also, flipping the sign of f2 in the vanilla function provides additional insight: as can be seen during the run, all values of the second objective are then positive, with typical values of out["F"] returned by _evaluate of the form

[[-0.1641682458249522, 0.12928009311806177, 11],
[-0.14026695286876958, 0.08405571985151775, 12],
[-0.26212357096942057, 0.12307845860620242, 13],
[-1.1561633592017773, 0.49420712255107857, 14],
[-0.9867277915171095, 0.8311735170627691, 15]]

On the other hand, most of the res.F values continue to be negative! To me, it looks like either the reshaping has a bug or I have greatly misunderstood how it works. That same reshaping is what often causes the dreaded error: 'Problem Error: F can not be set, expected shape (40, 2) but provided (10, 2)', ValueError('cannot reshape array of size 20 into shape (40,2)')
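
For context, the quoted ValueError is what NumPy raises whenever the element counts differ. When the counts happen to match, reshape succeeds silently and merely regroups the values in row-major order. A small NumPy-only sketch of both cases (no pymoo calls; the array contents are made up for illustration):

```python
import numpy as np

# 10 individuals x 2 objectives = 20 values; shape (40, 2) needs 80, so this fails.
F_small = np.zeros((10, 2))
try:
    F_small.reshape(40, 2)
    reshape_failed = False
except ValueError as err:
    reshape_failed = True
    print("ValueError:", err)

# Same element count but wrong orientation: no error, the rows are just regrouped.
F_wrong = np.arange(6).reshape(3, 2).T   # shape (2, 3) where (3, 2) is expected
F_regrouped = F_wrong.reshape(3, 2)
print(F_regrouped)
```

The second case is the dangerous one: nothing complains, yet the rows no longer correspond to individuals.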

As an additional test, I even tried returning the transpose of the out["F"] values mentioned above, which looks like this:

{'F': array([[-1.5041332 , -2.94398683, -3.62966778, -2.41202024, -1.37642159,
        -3.17929202, -3.05533535, -4.05918281, -2.70313611, -2.75393152],
       [ 1.45784028,  1.89835058,  2.89901847,  0.37115145,  1.60686834,
         2.12178904,  0.99780403,  2.5783593 ,  1.08876212,  1.32781543],
       [ 1.        ,  2.        ,  3.        ,  4.        ,  5.        ,
         6.        ,  7.        ,  8.        ,  9.        , 10.        ]]), 'G': None, 'H': None}

Yet the values of res.F are exactly identical!

Again, setting elementwise to True solves this problem; however, given the size of my actual research, that is a useless workaround because it is orders of magnitude too slow.
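
One reading consistent with the numbers above, offered as an assumption rather than a confirmed description of pymoo's internals: a plain Python list in out["F"] is treated as one entry per objective, stacked as columns, and the result is then reshaped to (n_individuals, n_obj). The NumPy-only sketch below applies that to the generation-1 rows printed earlier (rounded to 8 decimals) and reproduces rows that appear in the reported res.F. It also shows why passing the transposed array gives the same values.

```python
import numpy as np

# Row-wise list as built by the original vanilla_looper (generation 1, rounded).
rows = [
    [-0.75249524, -1.16928666, 1],
    [-0.13366541, -0.09731198, 2],
    [-0.35674591, -0.14066647, 3],
    [-0.18629102, -0.09358847, 4],
    [-0.07493563, -0.03157367, 5],
    [-0.91181819, -1.26908472, 6],
    [-1.26466159, -0.53708571, 7],
    [-0.69570884, -1.00722402, 8],
    [-0.03015172, -0.04249931, 9],
    [-0.04164650, -0.02741349, 10],
]

# ASSUMPTION: the list is stacked as columns (shape (3, 10)) and then
# reshaped to (10, 3), which regroups the values across individuals.
scrambled = np.column_stack(rows).reshape(len(rows), 3)

# Passing the transposed array, shape (3, 10), ends up in the same place.
from_transpose = np.array(rows).T.reshape(len(rows), 3)

# Stacking the rows keeps one individual per row.
correct = np.vstack(rows)

print(scrambled[2])   # three f1 values from individuals 7, 8 and 9
print(correct[:, 2])  # the integer counter survives in the third column
```

Under this assumption, scrambled[2], scrambled[3] and scrambled[5] match the first three rows of the res.F printed in the original report, which would explain why no row has an integer in the third position.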

blankjul commented 1 year ago

Does the code below solve your problem?

out["F"] needs to be a NumPy array of shape (len(x), n_obj). A simple row stack after your loop-wise operations should work.


import numpy as np
from pymoo.core.problem import Problem

class MyProblem(Problem):

    def __init__(self, **kwargs):
        super().__init__(n_var=2,
                         n_obj=3,
                         n_ieq_constr=0,
                         xl=0.0,
                         xu=1.0,
                         elementwise=False,
                         **kwargs,
                        )
        self.a = 1

    def _evaluate(self, x, out, *args, **kwargs):
        return self.vanilla_looper(x, out, *args, **kwargs)

    def vanilla_looper(self, x, out, *args, **kwargs):
        F = []
        if len(np.array(x).shape) < 2:
            F.append(self.vanilla(x))
        else:
            for xi in x:
                F.append(self.vanilla(xi))

        out["F"] = np.row_stack(F)

        print("X")
        print(np.array(x).shape)
        print(x)
        print("------------")
        print("F")
        print(out["F"].shape)
        print(out["F"])

    def vanilla(self, x):
        f1 = 0
        f2 = 0
        prev = 0
        for xi in x:
            f1 += -xi * xi
            f2 += -(xi - prev) * (xi - prev)
            prev = xi

        c = self.a
        self.a += 1
        return [f1, f2, c]

from pymoo.algorithms.moo.nsga2 import NSGA2
from pymoo.operators.crossover.sbx import SBX
from pymoo.operators.mutation.pm import PM
from pymoo.operators.sampling.rnd import FloatRandomSampling

algorithm = NSGA2(
    pop_size=10,
    n_offsprings=5,
    sampling=FloatRandomSampling(),
    crossover=SBX(prob=0.9, eta=15),
    mutation=PM(eta=20),
    eliminate_duplicates=True
)

from pymoo.termination import get_termination
termination = get_termination("n_gen", 2)

problem = MyProblem()
from pymoo.optimize import minimize
res = minimize(problem,
               algorithm,
               termination,
               seed=15,
               save_history=True,
               verbose=True,
              )

X = res.X
F = res.F
print("F", F)
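
Since the motivation for elementwise=False was speed, the per-individual Python loop can also be replaced with array operations. The sketch below is NumPy-only and not part of the code above: vanilla_batch, vanilla_loop and the first_id parameter are hypothetical names introduced for illustration, and first_id stands in for the self.a counter.

```python
import numpy as np

def vanilla_batch(x, first_id=1):
    """Hypothetical vectorized counterpart of `vanilla` for an (n, n_var) array."""
    x = np.atleast_2d(np.asarray(x, dtype=float))
    f1 = -(x ** 2).sum(axis=1)                   # -sum_i x_i^2
    steps = np.diff(x, axis=1, prepend=0.0)      # x_i - x_{i-1}, with x_{-1} = 0
    f2 = -(steps ** 2).sum(axis=1)
    c = np.arange(first_id, first_id + len(x), dtype=float)  # running counter
    return np.column_stack([f1, f2, c])          # shape (n, 3): one row per individual

def vanilla_loop(x, first_id=1):
    """Reference implementation mirroring the looped `vanilla` above."""
    rows = []
    for k, xi in enumerate(x):
        f1 = f2 = prev = 0.0
        for v in xi:
            f1 += -v * v
            f2 += -(v - prev) * (v - prev)
            prev = v
        rows.append([f1, f2, first_id + k])
    return np.vstack(rows)

X = np.array([[0.8488177, 0.17889592],
              [0.05436321, 0.36153845],
              [0.27540093, 0.53000022]])
F = vanilla_batch(X)
print(F.shape)
print(F)
```

Inside _evaluate this would presumably be used as out["F"] = vanilla_batch(x, self.a) followed by self.a += len(x). Note that np.row_stack is, as far as I know, a deprecated alias of np.vstack in recent NumPy releases, which is why the sketch uses np.vstack.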

davide-q commented 1 year ago

Hi @blankjul, thank you so much for your patience and assistance. This now works exactly as I expected.

As far as I am concerned, you may close this ticket, unless you want to keep it open as a reminder to add a sentence or two about this to https://pymoo.org/problems/definition.html?highlight=elementwise#Problem-(vectorized), so that other newbies like me avoid making the same mistake.

In any case, thanks again for your help with this problem and for maintaining such a nice tool as pymoo for the community!!!

blankjul commented 1 year ago

You are welcome!

Could you do me a favor and add a few sentences to the documentation (the ones that would have helped you understand this in the first place)?

You can find it here: https://github.com/anyoptimization/pymoo/blob/0da32410422604678cc415e4fca32da36208c2cc/docs/source/problems/definition.ipynb

I am more than happy to look at your PR. Thanks!