Add ProximityArchive for novelty search

gresavage commented 2 weeks ago

Description

Implement the ProximityArchive (also known as unstructured archive or novelty archive) from Novelty Search (Lehman, 2011) http://eplex.cs.ucf.edu/papers/lehman_ecj11.pdf

Resolves #468

Naming

ProximityArchive is a specific name that reflects how the archive operations are based on how close solutions are to each other in measure space. We also considered NearestNeighborArchive, but that name is a bit long. We find that NoveltyArchive and UnstructuredArchive are rather imprecise terms.

TODO

[x] Add tests -> also added python-box as a dev dependency in setup.py to make tests easier
[x] Implement class
[x] Allow passing in objective=None -> objective will then default to 0
[x] Add tests for scheduler with None objective (also fixed handling of None objectives for add_mode="single")
[x] Tidy up TODOs in class (i.e., clean up code)

Status

[x] I have read the guidelines in CONTRIBUTING.md
[x] I have formatted my code using yapf
[x] I have tested my code by running pytest
[x] I have linted my code with pylint
[x] I have added a one-line description of my change to the changelog in HISTORY.md
[x] This PR is ready to go

btjanaka commented 1 week ago

Hi @gresavage, just wanted to give a quick update on things. I'm going through and modifying the UnstructuredArchive -- I've changed it a decent amount to align more with the rest of the library and how we implement things. I removed GridUnstrucutredArchive and ArrayStore.roll as per my earlier comments, but do consider taking a look at my comment here https://github.com/icaros-usc/pyribs/pull/472#discussion_r1649428413 as I think it reveals an unexpected behavior. I'll keep working on this bit by bit, and it should be ready soon. Thanks again!

gresavage commented 1 week ago

Hi @btjanaka I'm eager to see what you've done.

I haven't pushed any of my changes w.r.t NS + local competition yet as they are still in flux and I'd like to know your thoughts on handling non-dominated sorting... I also want to fold my changes into what you've got so the two strategies are homogeneous if you decide to include NS + local competition.

I was also thinking more about the objective threshold and learning rate. Are we sure we want to do away with this behavior entirely if we implement NS + local competition? I removed the keyword arguments in my last commit but I think it could be useful to have CMA-MAE behavior with the local competition variant.

btjanaka commented 1 week ago

Hi @btjanaka I'm eager to see what you've done.

I haven't pushed any of my changes w.r.t NS + local competition yet as they are still in flux and I'd like to know your thoughts on handling non-dominated sorting... I also want to fold my changes into what you've got so the two strategies are homogeneous if you decide to include NS + local competition.

I was also thinking more about the objective threshold and learning rate. Are we sure we want to do away with this behavior entirely if we implement NS + local competition? I removed the keyword arguments in my last commit but I think it could be useful to have CMA-MAE behavior with the local competition variant.

@gresavage I'll reply to both your comments here.

To implement NSLC in pyribs, I think we would create an emitter that contains the MOEA, most likely using one of the libraries you mentioned (indeed, it would likely be out of scope to create NDS algorithms in pyribs unless there is some custom feature we want). The archive would then output the novelty and local competition via the add_info returned from the add() method, and the emitter would use the NDS to rank based on the novelty and local competition. Alternatively, if the emitter needs to also include the population in novelty and local competition computations, then it can compute those metrics inside of the emitter rather than in the archive.

Regarding the archive, I took another look at the NSLC paper, and I think one point that may be confusing us (at least me) is whether solutions are replaced based on their local competition. My understanding is that NSLC just uses a regular archive identical to the one from novelty search. The local competition seems to be a metric that is computed using the solutions in the archive, but it does not seem the metric plays any role in updating the archive. If you saw a section or piece of code saying otherwise, could you send it along? I've also asked some friends in the QD community for their thoughts on this; I'm genuinely not sure how this part works.

If I am right, then the current archive would be roughly what we need for both NS and NSLC, and the job of implementing NSLC would likely involve making a new emitter as I described above.

gresavage commented 1 week ago

Hi @btjanaka I'm eager to see what you've done. I haven't pushed any of my changes w.r.t NS + local competition yet as they are still in flux and I'd like to know your thoughts on handling non-dominated sorting... I also want to fold my changes into what you've got so the two strategies are homogeneous if you decide to include NS + local competition. I was also thinking more about the objective threshold and learning rate. Are we sure we want to do away with this behavior entirely if we implement NS + local competition? I removed the keyword arguments in my last commit but I think it could be useful to have CMA-MAE behavior with the local competition variant.

@gresavage I'll reply to both your comments here.

To implement NSLC in pyribs, I think we would create an emitter that contains the MOEA, most likely using one of the libraries you mentioned (indeed, it would likely be out of scope to create NDS algorithms in pyribs unless there is some custom feature we want). The archive would then output the novelty and local competition via the add_info returned from the add() method, and the emitter would use the NDS to rank based on the novelty and local competition. Alternatively, if the emitter needs to also include the population in novelty and local competition computations, then it can compute those metrics inside of the emitter rather than in the archive.

Regarding the archive, I took another look at the NSLC paper, and I think one point that may be confusing us (at least me) is whether solutions are replaced based on their local competition. My understanding is that NSLC just uses a regular archive identical to the one from novelty search. The local competition seems to be a metric that is computed using the solutions in the archive, but it does not seem the metric plays any role in updating the archive. If you saw a section or piece of code saying otherwise, could you send it along? I've also asked some friends in the QD community for their thoughts on this; I'm genuinely not sure how this part works.

If I am right, then the current archive would be roughly what we need for both NS and NSLC, and the job of implementing NSLC would likely involve making a new emitter as I described above.

RE: Emitters

I think an emitter could work for NSLC so long as the ranking doesn't play a roll in what gets added to the archive. The only real hurdle I see with that route is that NSLC doesn't bother with how candidates are generated, so an NSLC emitter could essentially wrap any of the other emitters and just leave the ask method unchanged. In my eyes this means making an NSLC variant for each of the other emitters which is only a hurdle inasmuch as it is tedious :laughing: . The tell method would be where the magic happens.

RE: Replacement & Use of NDS Rank

I looked over the papers as well and I think you're right that solutions are not replaced - which greatly simplifies the issue.

So after thinking about it even more, I think the way the archive works in NSLC is that essentially every member of the current population is added to the archive if it survives the genetic selection. And a solution only survives selection based on NSGA-II optimizing morphological novelty and local competition objective plus novelty playing the role of crowding distance. So I think the archive in NSLC doesn't actually do any determination of what gets added - if it is sent to the add method it gets added.

I also think whether the "selection" based on non-domination rank and novelty for NSLC is done outside of PyRibs by a user algorithm or in the add method is kind of a "6 one way, half dozen the other" situation. So I am happy to assist with whatever path you think is best and within scope.

btjanaka commented 6 days ago

@gresavage I see what you are saying about using the NSLC ranking. I think a good path would be to implement something in ribs.emitters.rankers. Essentially, the ranker would integrate an NDS, and the ranker could be used in emitters like EvolutionStrategyEmitter. Meanwhile, the archive could return local_competition in add_info. The emitter would take in the novelty and local_competition from the add_info and pass these pieces of info to the ranker.

For now, I've updated the code to just be a regular Novelty Search archive. I added a TODO to add local competition on issue #474. Would you mind taking a look at the code? Thanks!

gresavage commented 6 days ago

@gresavage I see what you are saying about using the NSLC ranking. I think a good path would be to implement something in ribs.emitters.rankers. Essentially, the ranker would integrate an NDS, and the ranker could be used in emitters like EvolutionStrategyEmitter. Meanwhile, the archive could return local_competition in add_info. The emitter would take in the novelty and local_competition from the add_info and pass these pieces of info to the ranker.

For now, I've updated the code to just be a regular Novelty Search archive. I added a TODO to add local competition on issue #474. Would you mind taking a look at the code? Thanks!

@btjanaka AHHHH, rankers, of course! Clever :)

I will take a look.

gresavage commented 5 days ago

@btjanaka Do you think we need to implement visualization for the archive?

btjanaka commented 5 days ago

@btjanaka Do you think we need to implement visualization for the archive?

Yes, I added a TODO on #474

gresavage commented 5 days ago

@btjanaka Do you think we need to implement visualization for the archive?

Yes, I added a TODO on #474

Apologies! I literally just opened that up :grimacing:

icaros-usc / pyribs