MMorafah / FLIS

FLIS: Clustered Federated Learning via Inference Similarity for Non-IID Data Distribution
MIT License
34 stars · 8 forks

Some Questions #3

Closed · yyEudora closed this issue 1 year ago

yyEudora commented 1 year ago

Dr. Mahdi Morafah: Hello! While reproducing the code from your paper, I ran into some questions:

  1. I see that the simple-cnn and lenet5 architectures in your code are exactly the same. Does this mean that when the model parameter in flis_dc.sh is set to "simple-cnn", the model actually used is lenet5?
  2. Will using a simple CNN model for federated training lead to overfitting?
  3. When I ran the published code and compared results after 24 and 50 rounds of training, the average training accuracy improved from 81.45 to 90.86, but the average test accuracy was only 48.42 and 54.07, a much smaller gain than on the training set. Does this mean the lenet5 network is overfitting? How should the parameters be set to reach the training and test accuracies reported in your paper?
  4. Is local test data reused across multiple clients? For example, with 10,000 test samples (1,000 per label), suppose the first client has labels 1, 3, and 5 and the second client has labels 2, 3, and 4, so each client holds 3,000 test samples. Are the 1,000 test samples of label 3 shared by both clients?
MMorafah commented 1 year ago

Hello,

Thanks for checking out our repository. Here are the answers to your questions:

  1. Yes, "simple-cnn" is the same as lenet5. I believe we set the model argument to "simple-cnn".
  2. It can overfit if the hyper-parameters are set poorly.
  3. Please note the difference between global accuracy, which is measured on the whole test dataset, and personalized (local) accuracy, which is measured only on the sub-part of the test dataset that each local client holds. You may be comparing two different metrics, and the gap could come from that (a short sketch of the two metrics is below). To use the hyper-parameters mentioned in the paper, you need to put them in the .sh argument file.
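To make the distinction concrete, here is a minimal sketch (not the repository's exact evaluation code; the model and loader names in the comments are placeholders) of the two metrics:

```python
# Global vs. personalized (local) accuracy for one client's model.
import torch

@torch.no_grad()
def accuracy(model, loader, device="cpu"):
    """Percentage of correctly classified samples over a data loader."""
    model.eval()
    correct, total = 0, 0
    for x, y in loader:
        x, y = x.to(device), y.to(device)
        correct += (model(x).argmax(dim=1) == y).sum().item()
        total += y.size(0)
    return 100.0 * correct / total

# Global accuracy: the client's model evaluated on the *whole* test set.
#   global_acc = accuracy(client_model, full_testset_loader)
# Personalized (local) accuracy: the same model evaluated only on the test
# samples whose labels that client actually holds.
#   local_acc = accuracy(client_model, client_local_testset_loader)
```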

Regards,
Mahdi

yyEudora commented 1 year ago

Dr. Mahdi Morafah: Thank you for your enthusiastic reply! I still have some doubts about overfitting: which hyperparameter settings can prevent simple-cnn from overfitting? After 100 rounds of training, the client models reach an average accuracy of 90% on their local training sets, but only 60% on the local test sets (although this matches the numbers listed in your article for non-iid dir(0.1)). Is this a form of overfitting? And regarding the last small question from my previous message, I am still hoping for your answer.

I look forward to your answer!


MMorafah commented 1 year ago

Hello,

To find the best hyper-parameters, one should run experiments with different learning rates, possibly learning-rate schedulers, and add some local regularizers to avoid overfitting. I think the results you are seeing are roughly what you can expect, because each client's data is small and overfitting can happen. To close the gap, the clients would need more data, which is not possible in the FL setting.
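For reference, here is a minimal sketch (assumed names and values, not the repository's exact training loop) of what such a regularized local update could look like: weight decay as a simple local regularizer plus a step learning-rate scheduler.

```python
# Sketch of one client's local update with two anti-overfitting knobs.
import torch
import torch.nn as nn

def local_update(model, train_loader, lr=0.01, weight_decay=5e-4,
                 local_epochs=5, device="cpu"):
    model.train().to(device)
    criterion = nn.CrossEntropyLoss()
    # weight_decay adds an L2 penalty on the weights (a simple local regularizer).
    optimizer = torch.optim.SGD(model.parameters(), lr=lr, momentum=0.9,
                                weight_decay=weight_decay)
    # Decay the learning rate each local epoch to dampen late-stage overfitting.
    scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=1, gamma=0.9)
    for _ in range(local_epochs):
        for x, y in train_loader:
            x, y = x.to(device), y.to(device)
            optimizer.zero_grad()
            criterion(model(x), y).backward()
            optimizer.step()
        scheduler.step()
    return model.state_dict()
```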

Regards,
Mahdi

superxtt commented 1 year ago

Hello, nsample-pc is set to 250 for both cifar10 and fmnist under the noniid #label2 setting, right? Under noniid #label3, what is nsample_pc for each of them? And how should nsample-pc be set for the svhn dataset in these two situations?

MMorafah commented 1 year ago

Hello,

I believe nsample-pc is not actually used in the code or in the partitioning; simply passing non-iid #label3 will generate the partitions by itself.
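For intuition, here is a simplified illustration (not the repository's exact partitioning code) of what a non-iid #label k split does: each client is assigned k labels and receives samples only from those labels, which is why a separate per-class sample count is not needed.

```python
# Simplified label-skew ("#label k") partitioning sketch.
import numpy as np

def label_skew_partition(labels, n_clients, k, seed=0):
    """Return {client_id: array of sample indices}; each client sees only k labels."""
    rng = np.random.default_rng(seed)
    n_classes = int(labels.max()) + 1
    # Pick k labels per client.
    client_labels = [rng.choice(n_classes, size=k, replace=False)
                     for _ in range(n_clients)]
    # For each class, split its sample indices evenly among the clients holding it.
    partition = {i: [] for i in range(n_clients)}
    for c in range(n_classes):
        owners = [i for i in range(n_clients) if c in client_labels[i]]
        if not owners:
            continue
        idx = rng.permutation(np.where(labels == c)[0])
        for owner, chunk in zip(owners, np.array_split(idx, len(owners))):
            partition[owner].extend(chunk.tolist())
    return {i: np.array(v) for i, v in partition.items()}
```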

Regards,
Mahdi

rajeshchaudhary121 commented 1 year ago

MODEL: lenet5, Dataset: cifar10 not supported yet

I am getting this error upon execution. Can you please help?

rajeshchaudhary121 commented 1 year ago

How do I execute this code on Google Colab?

Please provide the procedure, as I am unable to execute the code.

superxtt commented 1 year ago

Dr. Mahdi Morafah: Hello! In the paper, clients are clustered based on cluster_alpha using the FLIS (DC) algorithm on the four datasets (cifar10, fmnist, cifar100, svhn). In Table 1 (Test Accuracy Comparison Across Different Datasets for Non-IID Label Skew (20%) and (30%)), is cluster_alpha set to 0.5 for all four datasets, or is it set differently for different datasets?
I ask because we observed experimentally that almost all similarity matrix values are below 0.5 on the CIFAR100 dataset. Thank you!

MMorafah commented 1 year ago

Hello,

I don't remember exactly what I set cluster_alpha to, but I think 0.5 or 0.4 worked well for those datasets.
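For intuition, here is a minimal sketch (assumed structure, not the repository's exact clustering code) of how cluster_alpha acts as a threshold on the pairwise similarity matrix. It also explains the CIFAR100 observation above: with a threshold of 0.5 over a matrix whose values are almost all below 0.5, most clients would end up in singleton clusters.

```python
# Greedy single-linkage-style grouping: clients i and j join the same cluster
# whenever they are connected by similarities >= cluster_alpha.
import numpy as np

def threshold_clusters(sim_matrix, cluster_alpha):
    n = sim_matrix.shape[0]
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if sim_matrix[i, j] >= cluster_alpha:
                parent[find(i)] = find(j)

    clusters = {}
    for i in range(n):
        clusters.setdefault(find(i), []).append(i)
    return list(clusters.values())

# Example: with cluster_alpha = 0.5, clients 0 and 1 cluster together, client 2 is alone.
sim = np.array([[1.0, 0.7, 0.2],
                [0.7, 1.0, 0.3],
                [0.2, 0.3, 1.0]])
print(threshold_clusters(sim, cluster_alpha=0.5))  # [[0, 1], [2]]
```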

Regards,
Mahdi

superxtt commented 1 year ago

Dr. Mahdi Morafah: Hello! As we understand it, the server captures each client's inference results on its own small dataset, and the clients are then clustered according to the similarity of those inference results. Does the size of the server's dataset affect the inference results?

MMorafah commented 1 year ago

Hello,

Your understanding is correct, and this is a good question. We did not explore the effect of the public dataset's size on computing the similarity; that would be a good area to explore and run experiments on. However, I think the larger the public dataset, the better (more robust) the similarity scores.
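For illustration, here is a rough sketch (assumed names, and one simple choice of similarity over softmax outputs; the paper's exact metric may differ) of the server computing pairwise inference similarity on its public dataset. A larger public dataset gives a lower-variance estimate of each pairwise score, which is one reason bigger is likely more robust.

```python
# Server-side pairwise similarity of client inference outputs on a public dataset.
import torch
import torch.nn.functional as F

@torch.no_grad()
def inference_similarity_matrix(client_models, public_loader, device="cpu"):
    # Collect each client's softmax outputs on the shared public dataset.
    outputs = []
    for model in client_models:
        model.eval().to(device)
        probs = [F.softmax(model(x.to(device)), dim=1) for x, _ in public_loader]
        outputs.append(torch.cat(probs))  # shape: (n_public_samples, n_classes)
    # Cosine similarity between the flattened output profiles of every client pair.
    n = len(outputs)
    sim = torch.eye(n)
    for i in range(n):
        for j in range(i + 1, n):
            s = F.cosine_similarity(outputs[i].flatten(), outputs[j].flatten(), dim=0)
            sim[i, j] = sim[j, i] = s
    return sim
```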

Regards,
Mahdi