Huge amount of identical solutions in Pareto Set when implemented NSGA-II

Project-Platypus / Platypus

A Free and Open Source Python Library for Multiobjective Optimization

GNU General Public License v3.0

553 stars 152 forks source link

Huge amount of identical solutions in Pareto Set when implemented NSGA-II #186

Closed JixiangChen-Jimmy closed 1 year ago

JixiangChen-Jimmy commented 2 years ago

Hi, I am using NSGA-II algorithm of Platpys and I notice that it is very common to see identical solutions in the acquired approximate Pareto Set. e.g., before I remove the identical solutions in PS, I got 4000+ non-dominated solutions, while after the removal there are only 1000 solutions remain. Sometimes I got 4000 totally identical solutions from the algorithm.result. Each dimension of the identical solutions lies in the box-constrained boundary. And I found the cause of this is feature is the function clip in algorithm.py, this funciton is widely used in function evolve of manyVariator. clip would set x as the boundary if it violates the boundary constraint. def clip(value, min_value, max_value): return max(min_value, min(value, max_value)) And this easily causes the cumulative the identical solutions in algorithm.result. I think this feature should be avoid, is there a good way to replace the function clip?

JixiangChen-Jimmy commented 1 year ago

After careful check of the code, I think the effect of getting huge amounts of idetical solutions is not caused by funciton clip. Actually, the solutions are not identical but has very slight difference (e.g., difference < 1e-17). In NSGA-II, during calculating crowding distance of each non-dominated front, current population and the offspring are checked to delete the identical solutions via function unique in core.py. But the unique check is not applied to archive of the algorithm, this could lead to cumulative of identical solutions. So I tried to unique function to the unique function to self.archive, and added the accuracy constraint (1e-5) when juding whether solutions are identical. I got much better result than 4000 identical solutions with less than 1e-17 difference. But there are still identical solutions that repeated 2-3 times.. I am still confused about this phenomenon.

mich0611 commented 1 year ago

I also have the same issue (NSGAII and MOEA/D), for some benchmark problems I defined (CTP1-CTP7)

hh-wu commented 1 year ago

Why not define a new algorithm and change some operations in the NSGA2? I think you can create a new nsga2 class inherited from the original one and rewrite the iterate() function. For example:

    def iterate(self):
        offspring = []
        # new method avoids duplicated variables
        while len(offspring) < self.population_size:
            parents = self.selector.select(self.variator.arity,
                                           self.population)
            children = self.variator.evolve(parents)
            for child in children:
                if not child_in_offspring(child, offspring) and \
                        not child_in_offspring(child, self.evaluated_archive):
                    child.id_number = incr()
                    offspring.append(child)
                    if len(offspring) >= self.population_size:
                        break

Then you can compare either the variables or the objectives in child_in_offspring().

github-actions[bot] commented 1 year ago

This issue is stale and will be closed soon. If you feel this issue is still relevant, please comment to keep it active. Please also consider working on a fix and submitting a PR.