Open jafermarq opened 1 year ago
Hi @jafermarq, I'm Aml from Egypt (an eligible country :)), a fresh computer engineering graduate. I've already played around with federated learning and Flower while working on my graduation project, a fraud detection system using federated learning. Here's my LinkedIn.
I'm excited to be working on this paper. I'll be reproducing the results from Tables 3 & 4 (p. 6). The experiments will involve comparing the performance of FedSM & FedSM-extra against FedAvg, FedProx and Scaffold as the baseline methods. I think that Flower doesn't have Scaffold or FedProx in its API (please correct me if I'm wrong), so should I work on adding them? I chose this particular paper because its datasets are quite small, which suits my computational power. It also has an implementation here, so missing details shouldn't be a problem. I really hope the existing code isn't a problem for you; as I understood from the provided links, you're looking to reproduce this paper's results using the Flower API.
Hi @Aml-Hassan-Abd-El-hamid, it's great to see you want to work on FedSM. Your plan of reproducing the results in Tables 3 and 4 makes sense. As for the baselines, I'd advise first using FedAvg, since it appears to work best compared to FedProx (there is an implementation of its original paper here) and Scaffold. I would say adding FedProx shouldn't be too hard but, given the time until the end of the Summer of Reproducibility (end of September), how about leaving it as an extension?
One small comment: could you confirm you have access to the compute resources needed to run the experiments? I see image segmentation in this work is done with a U-Net with 256x256 images. I suspect a decent GPU(s) will be needed.
Hi @jafermarq
Thank you very much for your response.
I agree with you on the FedProx and Scaffold part.
Regarding the computational power, I plan on using Kaggle's GPUs, Google Colab and PaperSpace; all three offer free GPUs. Given the small size of the data used, I think this could work even with the deep networks involved. Please correct me on this if you think it's a bad idea.
Hi @Aml-Hassan-Abd-El-hamid, indeed the datasets are small, so one of the free GPUs you mention could be sufficient. I'm only familiar with Colab (but for simple, single-script projects). For the Summer of Reproducibility you'll need to work with several files as well as define your own Python environment (you can see an example of a baseline here: https://github.com/adap/flower/tree/main/baselines/fedprox). You are probably more familiar than me with Kaggle, Colab and PaperSpace, so do you think you could develop a project like this using those platforms?
One idea would be to first attempt running the implementation you found and see if the GPU resources in the Kaggle/Colab/PaperSpace can run it.
Hey @jafermarq, I'll work on the baseline from issue #2220, so this one is now available for anyone who is interested in it :)
Hi @jafermarq I can take up this baseline. Given the short amount of time left for SoR, I will focus on reproducing the results of FedSM and FedSM-extra in Tables 3 and 4. (I'm a first-year PhD student in Singapore with sufficient resource access for this baseline)
@duynht, could you please detail what would be your contribution plan (i.e. what experiments you would like to reproduce)?
Hi @jafermarq, I'm reproducing FedSM & FedSM-extra experiments on retinal disc segmentation, retinal cup segmentation, and prostate segmentation with 2D U-Nets and VGG-11 on 256x256 images.
Hi @duynht, this sounds good, but could you please point me to which figures/tables you are reproducing exactly? Is it Figure 2, or also part of Tables 3 and 4? Also, you need to decide and let me know which baseline(s) in the paper you are implementing so your FedSM and FedSM-extra results are put into context. You could choose either FedAvg, FedProx or Scaffold (which are baselines in Fig. 2 and Tables 3 & 4). Having the centralised baseline would also be good, but it's up to you.
The results are part of Tables 3 & 4, @jafermarq. For now, I think we should start with comparing FedSM and FedSM-extra with the centralized baseline.
Hi @duynht, please be a bit more specific about what results you are aiming to reproduce. Which rows and columns in Tables 3 & 4? You can see as examples the discussions for these other baselines: FedPer, FedAvgM, FedMLB. All contributions require (as per the rules of this initiative; see the first point in our FAQ: flower.dev/summer/#faq) the implementation of another federated learning baseline presented in the paper (e.g. FedAvg).
It is important that your contribution plan is clear in order for us to evaluate it at the end of the Summer of Reproducibility program.
Hi @jafermarq. Sorry for the confusing communication. I thought your previous comment mentioned that the Centralized baseline was acceptable to put FedSM and FedSM-extra in context. Hence, I intended to reproduce the Centralized, FedSM, and FedSM-extra rows in Tables 3 and 4.
Reading your comment again, however, I realized that the Centralized baseline was not among the options but rather a nice-to-have. In that case, I will reproduce the FedAvg, FedSM, and FedSM-extra rows in Tables 3 and 4.
Thank you.
Great! Thanks for the details @duynht. That's all that's really needed so you can officially get FedSM assigned to you.
I have ✅ all points in Steps 1 & 2 above, made you the assignee of this baseline, and moved it to "In Progress" status. You'll find all the info on how to start with the code by following the link in the What happens next? section in the Issue description (above). Please remember that the Flower Summer of Reproducibility ends at the end of September, so all baselines need to be ready by then. If you have any doubts or questions about the Flower API, feel free to reach out to me or any of the other contributors via our Slack workspace. Keep an eye on the #summer-of-reproducibility channel as we'll be making some announcements soon.
Looking forward to seeing your FedSM and FedSM-extra implementations in action! 🚀
Hi @duynht,
This is just a gentle reminder that the Flower Summer of Reproducibility is ending at the end of the month. With just a little more than 3 weeks to go, we are excited to see quite a few baselines well ahead in the process with their respective PRs close to ready. If your PR is already on the list, great! Please make sure the PR is linked to this issue (you just need to copy the URL of this issue somewhere in the main message of your PR). Ping me when you'd like me to take a look.
Also, make sure you keep an eye 👀 on the #summer-of-reproducibility channel in the Flower Slack. I'll announce very soon a new (the third!) round of 1:1 ask-me-anything sessions to help Summer of Reproducibility contributors like yourself meet the deadline. Please consider booking a time slot if you want to chat with me about your baseline, potential issues you have making your code run, how to open a PR, doubts about what to include in your readme, how to use Hydra configs more effectively, etc. All questions are welcome!
FedSM
Do you want to work on this baseline?
What follows are the steps 1 & 2 in the Summer of Reproducibility instructions.
1. Join the Summer of Reproducibility program
#summer-of-reproducibility
2. Define the scope of your contribution
[x] Check if you are eligible for a reward.
If where you are based is not on the list, please send us an email (summer@flower.dev) letting us know a bit about yourself (Where are you currently based? Are you a university student? Do you work at a public institution?). Please tell us the baselines you are interested in implementing (i.e. tell us your GitHub issue if you have created one). We will get back to you.
What happens next?
[x] This item will be moved to the In Progress stage by a member of the Flower Team.
[ ] Follow the instructions for creating a new baseline which will guide you through the process step-by-step.
Is something wrong or not clear?