Issue status: Open. Opened by naotokui 2 months ago.
In Novelty Search, you want to promote genes that are in sparse regions, meaning those that are far from their neighbors. This is why largest=True is correct: it ensures you're selecting the genes with the largest average distance to their nearest neighbors. If you set largest=False, you would be selecting genes in dense regions, which goes against the principle of novelty.
You're also correct on the second point, regarding topk(mean_distances, self.container_size, largest=True). This step is intended to select the genes in the sparsest neighborhoods (i.e., those with the highest mean distances), which aligns with the goal of NS to explore new and underrepresented areas of the solution space. Therefore, largest=True is necessary here as well to identify the most novel (sparse) genes.
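For readers following along, the selection step described above can be sketched in plain Python. This is only an illustration, not the repository's code: the function name `select_most_novel` and the scores are hypothetical, and `heapq.nlargest` stands in for a `topk(..., largest=True)` call over the per-gene mean distances.

```python
import heapq

def select_most_novel(mean_distances, container_size):
    """Return the indices of the `container_size` largest scores, most novel first."""
    indices = range(len(mean_distances))
    return heapq.nlargest(container_size, indices, key=lambda i: mean_distances[i])

# Hypothetical sparseness scores for five genes (higher = sparser neighborhood).
mean_distances = [0.15, 4.85, 0.10, 2.30, 0.20]
selected = select_most_novel(mean_distances, container_size=2)
print(selected)  # indices of the two genes with the highest mean distance
```

Whatever the score means, this step only ranks genes by it, so the result is only as good as the sparseness score that is passed in.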
Thanks for your reply! I'm still confused... in this paper, the originators of NS state "A simple measure of sparseness at a point is the average distance to the k-nearest neighbors of that point, where k is a fixed parameter that is determined experimentally."
Joel Lehman, Kenneth O. Stanley. 2005. “Evolving a Diversity of Creatures through Novelty Search and Local Competition.” Evolutionary Computation 13 (2): 241–77.
Isn't it too "easy" to have sparsity if you compare the farthest points? Any comments?
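To make the question concrete, here is a toy sketch that compares the two choices on hand-made 1-D data. It is an assumption-laden illustration, not the repository's code: the helper `mean_neighbor_distance` and the points are hypothetical, and `heapq.nsmallest` / `heapq.nlargest` play the role of a top-k with largest=False / largest=True over one point's distances.

```python
import heapq

def mean_neighbor_distance(points, index, k, nearest=True):
    """Mean distance from points[index] to k other points (nearest or farthest)."""
    # Distances to every other point; the point itself is excluded.
    distances = [abs(points[index] - p) for j, p in enumerate(points) if j != index]
    pick = heapq.nsmallest if nearest else heapq.nlargest
    chosen = pick(k, distances)
    return sum(chosen) / k

# Hypothetical behavior space: two dense clusters and one isolated point at 5.0.
points = [0.0, 0.1, 0.2, 5.0, 10.0, 10.1, 10.2]
k = 2
isolated, clustered = 3, 0  # indices of the isolated point and of one cluster member

nearest_isolated = mean_neighbor_distance(points, isolated, k, nearest=True)
nearest_clustered = mean_neighbor_distance(points, clustered, k, nearest=True)
farthest_isolated = mean_neighbor_distance(points, isolated, k, nearest=False)
farthest_clustered = mean_neighbor_distance(points, clustered, k, nearest=False)

print("k nearest :", nearest_isolated, nearest_clustered)
print("k farthest:", farthest_isolated, farthest_clustered)
```

On this data the k-nearest average scores the isolated point far above the cluster member (roughly 4.85 versus 0.15). The k-farthest average reverses the order (roughly 5.15 versus 10.15), because a point inside a dense cluster is still far from the opposite cluster.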
Thanks for sharing exciting research and the repo!
I have one question on the implementation of NS. Here, you take topk with largest=True. I believe that "largest" should be False, because you should calculate the average distance among the nearest neighbors of the particular gene. https://github.com/fisheggg/LVNS-RAVE/blob/95aa9d502b0fddb1377a97e6208070fdd2cfb05d/eprior/eprior/core.py#L262
Then do topk(mean_distances, self.container_size, largest=True) to find genes in the most sparse neighborhoods, right? I'm new to NS, so forgive me if I missed something. Thanks!