jafermarq commented 1 year ago

DASHA

Title: DASHA: Distributed Nonconvex Optimization with Communication Compression and Optimal Oracle Complexity
Venue: ICLR 2023
Link to paper: https://openreview.net/forum?id=VA1YpcNr7ul

Do you want to work on this baseline?

🌻 Check everything about the Summer of Reproducibility on flower.dev/summer

All available baselines are listed in the Summer of Reproducibility Dashboard and also in the GitHub Issues with the summer-of-reproducibility label. The content is the same.

📝 It is advised to complete these steps before you start working on your code. But if you can't wait to implement your baseline with Flower (we totally understand it 😄), please ensure you follow the steps on how to contribute a new baseline.

What follows are steps 1 & 2 in the Summer of Reproducibility instructions.

1. Join the Summer of Reproducibility program

[x] Join the Flower Slack and say "hi! 👋" in the channel #summer-of-reproducibility.
[x] Pick a baseline from our curated list <---------------------------------------- [you are doing this now]
2. Define the scope of your contribution
[x] What are you going to reproduce? Add a comment to your issue and tell us about your plan regarding this baseline: which experiments from the paper are you reproducing, and for which datasets? The more details you provide us with, the better!
[x] Check if you are eligible for a reward.

As we have to comply with US/EU regulations, we have checked that individual contributors based in these countries or territories are eligible: Australia, Austria, Belgium, Bulgaria, Canada, Croatia, Cyprus, Czech Republic, Denmark, Estonia, Finland, France, Germany, Gibraltar, Greece, Hong Kong SAR China, Hungary, India, Ireland, Italy, Japan, Latvia, Liechtenstein, Lithuania, Luxembourg, Malta, Mexico, Netherlands, New Zealand, Norway, Poland, Portugal, Romania, Singapore, Slovakia, Slovenia, South Korea, Spain, Sweden, Switzerland, Thailand, United Arab Emirates, United Kingdom, United States.

If the place where you are based is not on the list, please send us an email (summer@flower.dev) telling us a bit about yourself (where are you currently based? are you a university student? do you work at a public institution?). Please tell us which baselines you are interested in implementing (i.e. tell us your GitHub issue if you have created one). We will get back to you.
[x] We will discuss your contribution plan with you. If it sounds like a substantial enough contribution according to the Summer of Reproducibility rules (check our website flower.dev/summer), you'll get the OK from us to start working on your baseline!

What happens next?

[x] This item will be moved to the In Progress stage by a member of the Flower Team.
[x] Follow the instructions for creating a new baseline which will guide you through the process step-by-step.

Is something wrong or not clear?

Ask a question directly in your issue.
Reach out to us via the Flower Slack and ask your question in the #summer-of-reproducibility channel.
Check all the details (including the FAQ) on the Summer of Reproducibility website: flower.dev/summer

k3nfalt commented 1 year ago

Hi! My name is Alexander Tyurin. I'm planning to work on this project.

What are you going to reproduce?

I'm planning to reproduce the experiment from Section A.1 (see Figure 1) of the paper. In Figure 1, the authors plot how the norm of the gradients changes with the number of sent bits. My goal is to reproduce this behavior. I can also implement the MARINA algorithm (a baseline used in the paper) from Figure 1 and make sure that the plots capture the relation between DASHA and MARINA. In this experiment the authors consider a classification problem with the mushrooms dataset from LibSVM.
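The bookkeeping behind a "gradient norm vs. sent bits" curve could be sketched roughly as follows. This is a minimal illustration and not the code from the paper or from the eventual baseline: the RandK-style sparsifier stands in for whatever compressor is actually used, the cost of 32 bits per transmitted coordinate is an assumption, and the names `rand_k`, `squared_grad_norm` and `CommunicationLog` are hypothetical.

```python
import numpy as np


def rand_k(vec, k, rng):
    """Illustrative unbiased sparsifier: keep k random coordinates, scale by d/k."""
    d = vec.size
    idx = rng.choice(d, size=k, replace=False)
    out = np.zeros_like(vec, dtype=float)
    out[idx] = vec[idx] * (d / k)
    return out


def squared_grad_norm(client_grads):
    """Squared norm of the average of the clients' exact gradients."""
    return float(np.linalg.norm(np.mean(client_grads, axis=0)) ** 2)


class CommunicationLog:
    """Collects (cumulative bits sent per client, squared gradient norm) pairs."""

    def __init__(self, bits_per_coordinate=32):  # assumption: 32-bit floats
        self.bits_per_coordinate = bits_per_coordinate
        self.bits_sent = 0
        self.points = []

    def record(self, sent_coordinates, client_grads):
        # Log the current point first, then charge this round's communication.
        self.points.append((self.bits_sent, squared_grad_norm(client_grads)))
        self.bits_sent += sent_coordinates * self.bits_per_coordinate
```

Calling `log.record(k, grads)` once per round and plotting `log.points` with a logarithmic y-axis would give a curve of the same kind as the one described for Figure 1, under the assumptions stated above.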

In Sections A.2, A.3, and A.4, the authors consider stochastic settings where the nodes/workers/clients calculate stochastic gradients instead of exact gradients. Reproducing these could be the next step.

jafermarq commented 1 year ago

Hey @k3nfalt, that sounds like a good start! Before jumping into A.2 and A.3, how about making the CIFAR-10 setting in A.4 the next step? What would it take to then extend it to a setting that's more commonly found in other works in terms of number of clients and data partitioning methodology? I understand that DASHA operates with full client participation, so other works doing cross-silo FL for image classification may be a reasonable place to look for these settings.

k3nfalt commented 1 year ago

Hi @jafermarq, I'm finishing reproducing the plots from the DASHA paper. In these experiments, I am reproducing A.1 and A.4 from the paper. I will create a pull request with my code.

k3nfalt commented 1 year ago

The pull request is here: https://github.com/adap/flower/pull/2230

k3nfalt commented 1 year ago

Hi @jafermarq, I've just reproduced the A.1 and A.4 experiments from the paper. The PR is here: https://github.com/adap/flower/pull/2230. What are the next steps?

jafermarq commented 1 year ago

Congratulations @k3nfalt! Your DASHA baseline has been merged. We'll follow up with you soon to complete your Summer of Reproducibility journey 😄! Many thanks for participating!

k3nfalt commented 1 year ago

@jafermarq, Thank you!

adap / flower

DASHA #2057
