fisheggg / LVNS-RAVE


Question on the Novelty Search implementation #1

Open naotokui opened 2 months ago

naotokui commented 2 months ago

Thanks for sharing exciting research and the repo!

I have one question on the implementation of NS. Here, you take topk with largest=True. I believe that "largest" should be False, because you should calculate the average distance among the nearest neighbors of the particular gene. https://github.com/fisheggg/LVNS-RAVE/blob/95aa9d502b0fddb1377a97e6208070fdd2cfb05d/eprior/eprior/core.py#L262

Then do topk(mean_distances, self.container_size, largest=True) to find the genes in the most sparse neighborhoods, right?

I'm new to NS, so forgive me if I missed something. Thanks!
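To make the two steps in the question concrete, here is a minimal, framework-free sketch. It is not the repository's code: the function names `knn_sparseness` and `select_most_novel` and the toy population are hypothetical, and plain Python sorting stands in for `torch.topk`. Step 1 keeps the k smallest distances per gene (the `largest=False` case), step 2 keeps the genes with the largest mean distance (the `largest=True` case).

```python
import math

def knn_sparseness(points, k):
    """Mean distance from each point to its k nearest neighbours (self excluded)."""
    scores = []
    for i, p in enumerate(points):
        # Distances to every other point, sorted ascending
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        # Step 1: keep the k SMALLEST distances (nearest neighbours)
        scores.append(sum(dists[:k]) / k)
    return scores

def select_most_novel(points, k, container_size):
    """Indices of the container_size points with the highest sparseness."""
    scores = knn_sparseness(points, k)
    # Step 2: keep the LARGEST mean distances (sparsest neighbourhoods)
    ranked = sorted(range(len(points)), key=lambda i: scores[i], reverse=True)
    return ranked[:container_size]

# Toy population (assumed for illustration): a tight cluster plus one outlier
population = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (5.0, 5.0)]
print(select_most_novel(population, k=2, container_size=1))  # prints [4]
```

In this toy case the isolated point at index 4 is selected, while every point in the cluster gets a sparseness of about 0.1.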

marcoaccardi commented 1 month ago

In Novelty Search, you want to promote genes that are in sparse regions, meaning those that are far from their neighbors. This is why largest=True is correct, as it ensures you're selecting the genes with the largest average distance to their nearest neighbors. If you set largest=False, you would be selecting genes in dense regions, which goes against the principle of novelty.

You're also correct in the second point regarding the topk(mean_distances, self.container_size, largest=True). This step is intended to select the genes in the most sparse neighborhoods (i.e., those with the highest mean distances), which aligns with the goal of NS to explore new and underrepresented areas of the solution space. Therefore, largest=True is necessary here as well to identify the most novel (sparse) genes.

naotokui commented 1 month ago

Thanks for your reply! I'm still confused... in this paper, the originators of NS state "A simple measure of sparseness at a point is the average distance to the k-nearest neighbors of that point, where k is a fixed parameter that is determined experimentally."

Joel Lehman, Kenneth O. Stanley. 2005. “Evolving a Diversity of Creatures through Novelty Search and Local Competition.” Evolutionary Computation 13 (2): 241–77.
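The quoted definition can be written as a formula. The symbols are chosen here for illustration: $\rho(x)$ is the sparseness at point $x$, $\mu_i$ is the $i$-th nearest neighbor of $x$ under the distance metric $\operatorname{dist}$, and $k$ is the fixed neighborhood size.

$$
\rho(x) = \frac{1}{k} \sum_{i=1}^{k} \operatorname{dist}(x, \mu_i)
$$

Under this reading, the inner selection picks the $k$ smallest distances, and only the final ranking over $\rho$ picks the largest values.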


Isn't it too "easy" to have sparsity if you compare the farthest points? Any comments?
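The concern can be checked on a toy example. The sketch below is an illustration under assumed data, not a test of the repository: the helper `mean_neighbor_distance` and the 1-D population (two clusters with one isolated point between them) are hypothetical. It compares the score computed from the k nearest neighbors with the score computed from the k farthest points.

```python
import math

def mean_neighbor_distance(points, k, nearest=True):
    """Mean distance to the k nearest (or k farthest) other points."""
    scores = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        chosen = dists[:k] if nearest else dists[-k:]
        scores.append(sum(chosen) / k)
    return scores

# Assumed toy data: two dense clusters and one isolated point (index 6) in between
population = [(0.0,), (0.1,), (0.2,), (10.0,), (10.1,), (10.2,), (5.0,)]

near = mean_neighbor_distance(population, k=2, nearest=True)
far = mean_neighbor_distance(population, k=2, nearest=False)

# With nearest neighbours, the isolated point has the highest score
print(max(range(len(near)), key=near.__getitem__))  # prints 6
# With farthest points, the isolated point has the lowest score
print(min(range(len(far)), key=far.__getitem__))    # prints 6
```

In this example the farthest-point score rewards points inside the dense clusters, because each cluster is far from the other one, and ranks the genuinely isolated point last.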