jafermarq commented 1 year ago

FedPG-BR

Title: Fault-Tolerant Federated Reinforcement Learning with Theoretical Guarantee
Venue: NeurIPS 2021
Link to paper: https://openreview.net/forum?id=ospGnpuf6L

Do you want to work on this baseline?

🌻 Check everything about the Summer of Reproducibility on flower.dev/summer

All available baselines are listed in the Summer of Reproducibility Dashboard and also in the GitHub Issues with the summer-of-reproducibility label. The content is the same.

📝 It is advised to complete these steps before you start working on your code. But if you can't wait to implement your baseline with Flower (we totally understand it 😄), please ensure you follow the steps on how to contribute a new baseline.

What follows are steps 1 and 2 of the Summer of Reproducibility instructions.

1. Join the Summer of Reproducibility program

[x] Join the Flower Slack and say "hi! 👋" in the channel #summer-of-reproducibility.
[x] Pick a baseline from our curated list <---------------------------------------- [you are doing this now]
2. Define the scope of your contribution
[x] What are you going to reproduce? Add a comment to your issue and tell us about your plan regarding this baseline: which experiments from the paper are you reproducing, and for which datasets? The more details you provide, the better!
[x] Check if you are eligible for a reward.

As we have to comply with US/EU regulations, we have checked that individual contributors based in these countries or territories are eligible: Australia, Austria, Belgium, Bulgaria, Canada, Croatia, Cyprus, Czech Republic, Denmark, Estonia, Finland, France, Germany, Gibraltar, Greece, Hong Kong SAR China, Hungary, India, Ireland, Italy, Japan, Latvia, Liechtenstein, Lithuania, Luxembourg, Malta, Mexico, Netherlands, New Zealand, Norway, Poland, Portugal, Romania, Singapore, Slovakia, Slovenia, South Korea, Spain, Sweden, Switzerland, Thailand, United Arab Emirates, United Kingdom, United States.

If your location is not on the list, please send us an email (summer@flower.dev) telling us a bit about yourself (Where are you currently based? Are you a university student? Do you work at a public institution?). Please tell us which baselines you are interested in implementing (i.e. tell us your GitHub issue if you have created one). We will get back to you.
[ ] We will discuss your contribution plan with you. If it sounds like a substantial enough contribution according to the Summer of Reproducibility rules (check our website flower.dev/summer), you'll get the OK from us to start working on your baseline!

What happens next?

[ ] This item will be moved to the In Progress stage by a member of the Flower Team.
[ ] Follow the instructions for creating a new baseline which will guide you through the process step-by-step.

Is something wrong or not clear?

Ask a question directly in your issue.
Reach out to us via the Flower Slack and ask your question in the #summer-of-reproducibility channel
Check all the details (including FAQ) in the Summer of Reproducibility website: flower.dev/summer

flint-xf-fan commented 11 months ago

Hi @jafermarq

This is the contribution plan I'd like to propose:

Integrating OpenAI gymnasium into Flower for conducting Federated Reinforcement Learning experiments
- How to conduct RL experiments in Flower seems unexplored, which I think would be interesting and challenging
- If any documentation is available that provides hints on integrating with RL, I would greatly appreciate it if you could link me to it
Implementing FedPG-BR (without BR (Byzantine Resilience) Filter, on the server side)
- Implementation of REINFORCE on the client (agent)-side
- Implementation of FedPG-BR on the server side
- The objective is to reproduce the experimental results of boosting the sample efficiency of individual agents (Figure 2 in the paper)
(optionally?) Implementing Byzantine agents and the BR Filter at the server side

Let me know your thoughts and yes, I am eligible for the reward.

Cheers, Flint

jafermarq commented 11 months ago

Hi @flint-xf-fan, thanks for getting back to us with this detailed plan. Reproducing the results in Figure 2 sounds good. I'll look into how integrating OpenAI's gymnasium can be done with how Flower's server <--> client interactions work. Please give me a couple of days and I'll get back to you regarding this. I'll ✅ some of the points above in the description of the issue and add you as assignee.

jafermarq commented 11 months ago

Hi @flint-xf-fan, after looking into your paper more closely: wouldn't it be sufficient if all clients and the server have their own gym instantiation? Then the clients sample a batch of trajectories independently, obtain gradients and communicate them to the server. Most examples in Flower show how to send the entire model back to the server, but adjusting it to communicate the gradients instead isn't difficult (it just requires adjusting the get_parameters() method in the Flower clients). Then the server updates the policy. My understanding is that the server also needs a gym instance to do line 11 in Algo1.

If having multiple gym instances is a problem (maybe due to compute/memory resources needed to support that), we could think of alternatives.
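
The exchange described above (each client samples with the current policy, sends back a gradient, and the server updates the policy) can be sketched roughly as follows. This is only a framework-free illustration under loud assumptions: `BanditClient`, `compute_gradient`, `server_round` and `run` are hypothetical names, a toy two-armed bandit stands in for a gym environment, and the server step is a plain average of client gradients rather than the FedPG-BR update from the paper. It uses neither Flower nor gymnasium.

```python
import math
import random
from typing import List


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


class BanditClient:
    """Hypothetical client that owns its own environment (a toy two-armed bandit)."""

    def __init__(self, seed: int) -> None:
        self.rng = random.Random(seed)

    def compute_gradient(self, theta: float, batch_size: int) -> float:
        """Sample a batch with the current policy and return the averaged gradient."""
        p = sigmoid(theta)  # probability of pulling arm 1
        total = 0.0
        for _ in range(batch_size):
            action = 1 if self.rng.random() < p else 0
            reward = float(action)  # assumption: arm 1 pays 1, arm 0 pays 0
            # For a Bernoulli(sigmoid(theta)) policy, d/dtheta log pi(action) = action - p.
            total += reward * (action - p)
        return total / batch_size


def server_round(theta: float, clients: List[BanditClient],
                 batch_size: int = 16, lr: float = 0.5) -> float:
    """Collect one gradient per client, average them, and take an ascent step."""
    grads = [c.compute_gradient(theta, batch_size) for c in clients]
    return theta + lr * sum(grads) / len(grads)


def run(num_rounds: int = 20, num_clients: int = 3) -> float:
    clients = [BanditClient(seed=i) for i in range(num_clients)]
    theta = 0.0  # policy parameter kept on the server
    for _ in range(num_rounds):
        theta = server_round(theta, clients)
    return theta
```

Only the gradient (a single float here) travels from client to server, and only the policy parameter travels back, which is the pattern the comment above proposes in place of sending the entire model.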

flint-xf-fan commented 10 months ago

Hi @jafermarq, that sounds like the right approach. I have gone through some tutorials on Flower, and I would like to start the implementation, as you suggested:

Each agent as a Flower client initializes one gym env
- This approach also opens the door to heterogeneous environments
Each agent runs REINFORCE independently
The server initializes another environment and communicates with agents via gradients
Log stats using Flower’s logger
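
For the "each agent runs REINFORCE independently" step, a minimal sketch of the per-trajectory gradient estimate might look like the following. It assumes the common reward-to-go form of REINFORCE, where each step's score vector (the gradient of log pi(a_t | s_t) with respect to the policy parameters) is weighted by the discounted return from that step onward. The names `discounted_returns` and `reinforce_gradient` are hypothetical, and the score vectors are passed in as plain lists instead of being computed by a real policy network.

```python
from typing import List, Sequence


def discounted_returns(rewards: Sequence[float], gamma: float) -> List[float]:
    """Return G_t = r_t + gamma * G_{t+1} for every step of one trajectory."""
    returns = [0.0] * len(rewards)
    running = 0.0
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


def reinforce_gradient(rewards: Sequence[float],
                       grad_log_probs: Sequence[Sequence[float]],
                       gamma: float = 0.99) -> List[float]:
    """Sum over steps of G_t * grad log pi(a_t | s_t) for one trajectory."""
    returns = discounted_returns(rewards, gamma)
    grad = [0.0] * len(grad_log_probs[0])
    for g_t, score in zip(returns, grad_log_probs):
        for i, s in enumerate(score):
            grad[i] += g_t * s
    return grad
```

An agent would average this estimate over its batch of trajectories before sending the result to the server.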

yining043 commented 9 months ago

Hi @jafermarq, I am another author of FedPG-BR and am working with @flint-xf-fan on this. We are planning to submit the PR by the end of this week. Can you add me as a contributor to this issue?

jafermarq commented 9 months ago

It's great to have you on board too, @yining043! You are now a contributor.

adap / flower

FedPG-BR #2046
