jafermarq commented 1 year ago

FedMix

Title: FedMix: Approximation of Mixup under Mean Augmented Federated Learning
Venue: ICLR 2021
Link to paper: https://openreview.net/forum?id=Ogga20D2HO-

Do you want to work on this baseline?

🌻 Check everything about the Summer of Reproducibility on flower.dev/summer

All available baselines are listed in the Summer of Reproducibility Dashboard and also in the GitHub Issues with the summer-of-reproducibility label. The content is the same.

📝 It is advised to complete these steps before you start working on your code. But if you can't wait to implement your baseline with Flower (we totally understand it 😄), please ensure you follow the steps on how to contribute a new baseline.

What follows are the steps 1 & 2 in the Summer of Reproducibility instructions.

1. Join the Summer of Reproducibility program

[x] Join the Flower Slack and say "hi! 👋" in the channel #summer-of-reproducibility.
[x] Pick a baseline from our curated list <---------------------------------------- [you are doing this now]
2. Define the scope of your contribution
[x] What are you going to reproduce? Add a comment to your issue and tell us about your plan regarding this baseline: which experiments from the paper are you reproducing, and for which datasets? The more details you provide, the better!
[x] Check if you are eligible for a reward.

As we have to comply with US/EU regulations, we have checked that individual contributors based in these countries or territories are eligible: Australia, Austria, Belgium, Bulgaria, Canada, Croatia, Cyprus, Czech Republic, Denmark, Estonia, Finland, France, Germany, Gibraltar, Greece, Hong Kong SAR China, Hungary, India, Ireland, Italy, Japan, Latvia, Liechtenstein, Lithuania, Luxembourg, Malta, Mexico, Netherlands, New Zealand, Norway, Poland, Portugal, Romania, Singapore, Slovakia, Slovenia, South Korea, Spain, Sweden, Switzerland, Thailand, United Arab Emirates, United Kingdom, United States.

If where you are based is not on the list, please send us an email (summer@flower.dev) letting us know a bit about yourself (where are you currently based? are you a university student? do you work at a public institution?). Please tell us the baselines you are interested in implementing (i.e. tell us your GitHub issue if you have created one). We will get back to you.
[x] We will discuss your contribution plan with you. If it sounds like a substantial enough contribution according to the Summer of Reproducibility rules (check our website flower.dev/summer), you'll get the OK from us to start working on your baseline!

What happens next?

[x] This item will be moved to the In Progress stage by a member of the Flower Team.
[ ] Follow the instructions for creating a new baseline which will guide you through the process step-by-step.

Is something wrong or not clear?

Ask a question directly in your issue.
Reach out to us via the Flower Slack and ask your question in the #summer-of-reproducibility channel
Check all the details (including FAQ) in the Summer of Reproducibility website: flower.dev/summer

DevPranjal commented 11 months ago

Hi @jafermarq! Thanks for this sprint. Since this baseline is still up for grabs, I would love to work on reproducing it.

Contribution Plan

Implementing the MAFL framework and improving it further using the FedMix approximation
Reproduce experiments involving FEMNIST and CIFAR10 data (Table 1 and Figure 2)
Compare against FedAvg and FedProx and other mixup scenarios mentioned

My Background

I am a final year undergrad at IIT Roorkee, India and have been following ML Security research
Participated and placed 2nd in the Microsoft Membership Inference Competition (Vision Track) @ IEEE SaTML
Participated and placed 2nd (India Regionals) in CSAW '22

Excited and looking forward to working on this!

Tahani1991 commented 11 months ago

Hello, I am Tahani, and I am interested in reproducing this baseline (FedMix). Specifically, I plan to implement FedMix and compare it against FedAvg and FedProx. I aim to use two datasets (FMNIST and CIFAR10). First, I will reproduce Table 3 (test accuracy on CIFAR10 under varying numbers of clients (N), while the number of samples per client is kept constant), focusing on Global Mixup, FedAvg, and FedMix. Then, I will move to Table 5 (test accuracy on CIFAR10 under varying mixup ratio λ), where λ ∈ [0, 1] is a hyperparameter chosen from the beta distribution for each training step. Finally, for Figure 3, I will run FedMix with various Mk values and reproduce the samples of averaged images from EMNIST/CIFAR10.
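
For reference, the role of λ in the Table 5 experiment can be illustrated with a minimal sketch of plain mixup. It assumes the common convention of weighting the second sample by λ and drawing λ from a symmetric Beta(alpha, alpha) distribution; the helper names `sample_lambda` and `mixup` and the parameter `alpha` are hypothetical placeholders, not taken from the paper's code or from Flower, and this is not the FedMix approximation itself.

```python
import random


def sample_lambda(alpha=1.0, rng=random):
    """Draw a mixup ratio in [0, 1] from Beta(alpha, alpha) (assumed convention)."""
    return rng.betavariate(alpha, alpha)


def mixup(x_i, y_i, x_j, y_j, lam):
    """Blend two (input, one-hot label) pairs, putting weight lam on the second pair."""
    x = [(1 - lam) * a + lam * b for a, b in zip(x_i, x_j)]
    y = [(1 - lam) * a + lam * b for a, b in zip(y_i, y_j)]
    return x, y


if __name__ == "__main__":
    lam = sample_lambda(alpha=1.0)
    # Toy two-feature inputs with one-hot labels for two classes.
    x_mix, y_mix = mixup([0.0, 2.0], [1.0, 0.0], [4.0, 6.0], [0.0, 1.0], lam)
    print(lam, x_mix, y_mix)
```

Sweeping a fixed `lam` over a few values instead of sampling it would roughly correspond to the "varying mixup ratio" setting described above.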

About me: I am a third-year PhD student at the University of Glasgow, and my research focuses on distributed machine learning and federated learning.

I have the following accepted papers:

Aladwani, T., Anagnostopoulos, C. , Kolomvatsos, K., Alghamdi, I. and Deligianni, F. (2023) Query-driven Edge Node Selection in Distributed Learning Environments. In: Data-driven Smart Cities (DASC 2023)/ 39th IEEE International Conference on Data Engineering (ICDE 2023), Anaheim, CA, United States, 3-7 April 2023.
Anagnostopoulos, C. , Aladwani, T., Alghamdi, I. and Kolomvatsos, K. (2022) Data-driven analytics task management reasoning mechanism in edge computing. Smart Cities, 5(2), pp. 562-582. (doi: 10.3390/smartcities5020030)
Aladwani, T., Alghamdi, I., Kolomvatsos, K. and Anagnostopoulos, C. (2022) Data-Driven Analytics Task Management at the Edge: A Fuzzy Reasoning Approach. In: The 9th International Conference on Future Internet of Things and Cloud (FiCloud 2022), Rome, Italy, 22-24 August 2022.

Academic Activities

Talk about my work in IDSAI 2023, University of Manchester, 13-6-2023.
Poster presentation, 19-5-2023, Alan Turing Network for AI in Geotechnics, University of Glasgow
Attendance of the Flower Summit 2023, University of Cambridge, 30-31-5-2023.
Attendance of AI Summit 2023, Riyadh, 13-15-9-2022.
Reviewer of papers for Euro-Par 2023 conference.

Kind regards, Tahani

jafermarq commented 11 months ago

Hi @DevPranjal, it's good to see you are interested in implementing FedMix. Before assigning this baseline to you, I just want to briefly discuss a topic I'm raising with all recent contributors: are you certain you have access to the necessary compute resources (e.g. a couple of GPUs -- I'm inclined to say free cloud options might be unsuitable for this baseline) to run the experiments you propose? If you are part of a university group, you probably have some dedicated GPUs you can use? Also please note the comment in Appendix G:

While in the memory aspect, FedMix requires about twice more GPU memory allocation compared to FedAvg, this phenomenon is also observed on LocalMix and NaiveMix.

Please let me know if you are confident you have resources to reproduce FedMix. It would be great having you on board as a contributor to the Flower Summer of Reproducibility!!

Sorry @Tahani1991, but since Pranjal added a message before you, I'll give preference to this contribution plan first.

DevPranjal commented 11 months ago

Thanks for the reply @jafermarq. I do have enough compute (Nvidia DGX systems) provided by our college to reproduce the experiments mentioned in the paper. In the past, I have also used these to reproduce other, more compute-heavy papers. Hence, I don't think resources should be an issue.

jafermarq commented 11 months ago

Hi @DevPranjal, wow DGX 🤩 !! Ok, then you are all set. I have now ✅ all points in Step 1 & 2 above, added you as the assignee to this issue and moved this baseline to In Progress status. You can find a guide on how to start with the code by following the link in the What happens next? section above in the issue description. Essentially you'll see that we have put together a templated directory with a fixed structure that we hope all contributors will follow. Some contributors have already opened draft PRs with their implementation. Given that the Summer of Reproducibility runs only until the end of September, my advice to you is to start as soon as you can and raise any questions or doubts either directly with me via Slack, in the #questions channel, or in the #summer-of-reproducibility channel, where you can also reach other contributors.

Looking forward to seeing your FedMix implementation in action using Flower !!

jafermarq commented 10 months ago

Hi @DevPranjal,

This is just a gentle reminder that the Flower Summer of Reproducibility is ending at the end of the month. With just a little more than 3 weeks to go, we are excited to see quite a few baselines well ahead in the process with their respective PRs close to ready. I see you created a PR some weeks ago, great !! Ping me when you'd like me to take a look.

Also, make sure you keep an eye 👀 on the #summer-of-reproducibility channel in the Flower Slack. I’ll announce very soon a new (the third!) round of 1:1 ask-me-anything sessions to help Summer of Reproducibility contributors like yourself to meet the deadline. Please consider booking a time slot if you want to chat with me about your baseline, potential issues you have making your code run, how to open a PR, doubts about what to include in your readme, how to use Hydra configs more effectively, etc. All questions are welcome!!

adap / flower

FedMix #2051