RomainGoussault opened this issue 4 years ago
@HeytemBou I think you are introducing this in your DRFA PR, right?
The attribute was added by @HeytemBou in PR #354 (as a Scenario() attribute), but the standard multi-partner learning approaches still need to be adapted to leverage it; a rough sketch of what that could look like follows.
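A minimal Python sketch, purely for illustration: a per-epoch training loop that only activates a fraction of the partners. Every name here (`mpl`, `partner_list`, `fit_epoch`, `reinit_aggregator`, the `fraction` parameter) is a hypothetical placeholder, not the actual mplc API:

```python
import random

def fit_with_partner_fraction(mpl, fraction, epochs):
    """Train for `epochs` epochs, each on a random fraction of partners.

    Hypothetical sketch: `mpl`, `mpl.partner_list`, `mpl.fit_epoch()` and
    `mpl.reinit_aggregator()` are placeholders for the real API.
    """
    all_partners = list(mpl.partner_list)
    k = max(1, round(fraction * len(all_partners)))  # partners active per epoch
    for _ in range(epochs):
        # Draw this epoch's active subset of partners
        mpl.partner_list = random.sample(all_partners, k)
        # Aggregation weights depend on the active partners,
        # so the aggregator must be rebuilt after each re-assignment
        mpl.reinit_aggregator()
        mpl.fit_epoch()
    mpl.partner_list = all_partners  # restore the full list afterwards
```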
@arthurPignet indicates that there is a similar mechanism in PVRL; it should be checked whether parts of it can be reused and/or upgraded when we implement this fraction-of-partners approach. @HeytemBou, can you complement this issue with some thoughts from your work on DRFA, please?
Indeed, PVRL is a contributivity method in which an agent trained by reinforcement learning (policy gradient) chooses a subset of partners at every epoch. Currently a single mpl object is created and its .fit_epoch method is called epoch by epoch; the mpl.fit method is effectively overridden within the PVRL contributivity function. At every epoch, mpl.partner_list is reassigned (and the aggregator is re-initialized). It could be really interesting to rewrite PVRL with your new tool, @HeytemBou; it would then be easy to test various RL algorithms. A hedged sketch of this loop is given below.
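To make the mechanism concrete, here is a minimal sketch of that loop, with a per-partner Bernoulli inclusion policy updated by a simple REINFORCE rule standing in for the policy-gradient agent. As above, every mpl name (`partner_list`, `fit_epoch`, `reinit_aggregator`, `eval_score`) is a hypothetical placeholder rather than the actual PVRL implementation:

```python
import numpy as np

def pvrl_style_fit(mpl, epochs, lr=0.1, seed=0):
    """Hedged sketch of the loop described above: a policy-gradient agent
    picks a subset of partners at every epoch. `fit_epoch`, `eval_score`
    and `reinit_aggregator` are placeholder method names.
    """
    rng = np.random.default_rng(seed)
    all_partners = list(mpl.partner_list)
    n = len(all_partners)
    logits = np.zeros(n)  # one inclusion logit per partner
    baseline = 0.0        # running reward baseline to reduce variance
    for _ in range(epochs):
        probs = 1.0 / (1.0 + np.exp(-logits))  # Bernoulli inclusion probs
        mask = rng.random(n) < probs
        if not mask.any():
            mask[rng.integers(n)] = True  # always keep at least one partner
        # Swap in the selected subset and rebuild the aggregator,
        # mirroring what the current PVRL code does to the mpl object
        mpl.partner_list = [p for p, m in zip(all_partners, mask) if m]
        mpl.reinit_aggregator()
        mpl.fit_epoch()
        reward = mpl.eval_score()  # e.g. validation score of the global model
        # REINFORCE update: d log pi / d logit = mask - probs for a Bernoulli
        logits += lr * (reward - baseline) * (mask - probs)
        baseline = 0.9 * baseline + 0.1 * reward
    mpl.partner_list = all_partners  # restore the full partner list
```

Swapping the REINFORCE update for another RL algorithm would only touch the last few lines of the loop, which is what would make this structure convenient for testing various agents.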
This is useful when the number of partners is large.