lhq12 closed this issue 7 months ago.
Thanks for your attention to our work. Firstly, the performance drops slightly if we share only one view. However, I have only tested a few settings, so it would be interesting to run more experiments (sharing only shared_data1 vs. sharing shared_data1 and shared_data2 with the same noise strength). Secondly, we add stronger noise to the second view (see VAE_std2), so a malicious actor would find it easier to recover information from share_data1 than from share_data2. That said, it might be interesting to try recovering the raw data in a diffusion-like manner given a sufficiently long sequence of noised performance-sensitive features.
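For concreteness, here is a minimal sketch (not the repository's exact code) of how two noisy views of the performance-sensitive features could be produced with additive Gaussian noise; the function name and default std values are illustrative, taken from the settings discussed later in this thread:

```python
import torch

def make_shared_views(x_sensitive: torch.Tensor, std1: float = 0.15, std2: float = 0.25):
    """Perturb the performance-sensitive features with two noise strengths.

    Illustrative only: share_data2 receives stronger noise (std2 > std1),
    so it leaks less about the raw features than share_data1.
    """
    share_data1 = x_sensitive + std1 * torch.randn_like(x_sensitive)
    share_data2 = x_sensitive + std2 * torch.randn_like(x_sensitive)
    return share_data1, share_data2
```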
Thanks for the quick reply.
I just found a mistake in the code (FL_VAE.py, lines 169-170), which may affect the performance.
Yep, that's a mistake; that code corresponds to the setting where shared_data1 and shared_data2 use the same noise strength.
I am closing the issue now. If you have any further questions, feel free to reach out :)
Could you share the hyper-parameter settings which lead to the results reported in the paper? I tried once with config.yaml and obtained the following results: "server_0_test_acc_epoch:85.43 & server_0_total_acc_epoch:85.32000404230781", which is much lower than the one you reported.
On CIFAR-10 with $\alpha=0.1, K=10, E=1$, you can set the hyper-parameters as follows: VAE_adaptive=False, VAE_std1=0.15, VAE_std2=0.25.
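For reference, one way these overrides could be applied programmatically; the flat config.yaml layout and the output filename are assumptions, not taken from the repository:

```python
import yaml

# Load the repo's config.yaml and override the three values mentioned above.
# The flat key layout is an assumption; adjust to the actual config schema.
with open("config.yaml") as f:
    cfg = yaml.safe_load(f)

cfg.update({"VAE_adaptive": False, "VAE_std1": 0.15, "VAE_std2": 0.25})

with open("config_cifar10_a0.1_k10.yaml", "w") as f:
    yaml.safe_dump(cfg, f)
```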
Thanks!
Is it the same setting for different datasets, $\alpha$, K, and E?
VAE_std1 and VAE_std2 are almost the same. Actually, it requires a lot of effort to select suitable VAE_re, VAE_ce, VAE_kl, and VAE_x_ce under different datasets, $\alpha$, and numbers of clients.
Hi!
I trained ResNet-18 on CIFAR-10 with $\alpha=0.1, K=10, E=1$ using the hyper-parameters you gave me, but I still fail to approach the reported results. I got the following: FedFed + FedAvg: 88.92; FedProx: 87.82; SCAFFOLD: fails to converge.
Could you help me check whether I missed anything important?
Could you share the data distribution of all clients and your hardware configurations?
I have rechecked default.py and modified the random seed to ensure the same data distribution across clients as in the paper. You can retry the experiments. Sometimes different hardware also results in different performance gains. If you have any more problems, feel free to reach me directly by email.
train_cls_local_counts_dict = {0: {2: 704, 3: 1, 4: 1219, 5: 236, 6: 1061, 8: 70, 9: 29}, 1: {0: 13, 1: 774, 3: 2725, 6: 91, 7: 295, 8: 1}, 2: {0: 127, 1: 22, 3: 3, 7: 200, 8: 14, 9: 1752}, 3: {0: 6, 3: 6, 4: 276, 5: 8, 7: 1, 8: 20, 9: 3219}, 4: {0: 826, 1: 291, 2: 102, 3: 10, 4: 3336, 5: 1142}, 5: {0: 2, 1: 2099, 2: 3266}, 6: {1: 16, 2: 543, 3: 154, 4: 123, 5: 74, 7: 4503}, 7: {0: 2722, 2: 159, 3: 2096, 6: 2560}, 8: {0: 1303, 1: 1797, 2: 64, 3: 4, 4: 45, 5: 3539}, 9: {0: 1, 1: 1, 2: 162, 3: 1, 4: 1, 5: 1, 6: 1288, 7: 1, 8: 4895}}
OS: Linux-4.4.0-142-generic-x86_64-with-glibc2.17, GPU: GeForce RTX 3090, python=3.8.16=h7a1cb2a_2, pytorch=1.8.0=py3.8_cuda11.1_cudnn8.0.5_0
I notice that you log the test_accuracy after testing each batch in normal_trainer.test_on_server_for_round, which may be inappropriate. So I only log test_acc_avg.avg once all samples are processed, as is generally done. I don't know if the reported results were obtained in the former way.
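For reference, a minimal sketch of the averaging scheme I mean, assuming a plain PyTorch test loop (names are illustrative, not the repository's API):

```python
import torch

@torch.no_grad()
def test_on_server(model, test_loader, device):
    """Accumulate correct predictions over the whole test set and report a
    single accuracy value at the end, instead of logging after every batch."""
    model.eval()
    correct, total = 0, 0
    for x, y in test_loader:
        x, y = x.to(device), y.to(device)
        preds = model(x).argmax(dim=1)
        correct += (preds == y).sum().item()
        total += y.size(0)
    return 100.0 * correct / total  # log this once per round
```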
You can try random seed 0 with the following data distribution: {0: {0: 976, 1: 1632, 2: 18, 3: 1120, 5: 132, 7: 636, 8: 3101}, 1: {1: 1, 5: 59, 7: 3, 8: 851, 9: 4228}, 2: {1: 814, 2: 212, 3: 3735, 5: 2737}, 3: {0: 46, 1: 291, 3: 4, 8: 401, 9: 123}, 4: {0: 125, 2: 1, 3: 137, 4: 1476, 6: 303, 8: 642, 9: 47}, 5: {0: 1923, 1: 381, 2: 3826}, 6: {1: 334, 2: 3, 3: 3, 4: 2477, 5: 2055, 7: 1543}, 7: {0: 6, 1: 1546, 2: 881, 7: 591, 8: 4, 9: 6}, 8: {0: 435, 2: 58, 4: 1046, 5: 16, 6: 4536}, 9: {0: 1489, 1: 1, 2: 1, 3: 1, 4: 1, 5: 1, 6: 161, 7: 2227, 8: 1, 9: 596}. And we report total ACC in our paper.
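For reference, a sketch of how such a class-wise Dirichlet split with a fixed seed might be generated (illustrative only; the repository's partition code may draw and normalize proportions differently, so the exact counts can differ):

```python
import numpy as np

def dirichlet_partition(labels: np.ndarray, num_clients: int = 10,
                        alpha: float = 0.1, seed: int = 0):
    """Split sample indices among clients with a class-wise Dirichlet prior.

    Returns {client_id: {class_id: count}}, comparable in shape to
    train_cls_local_counts_dict above. Smaller alpha -> more skewed clients.
    """
    rng = np.random.default_rng(seed)
    num_classes = int(labels.max()) + 1
    client_indices = [[] for _ in range(num_clients)]
    for c in range(num_classes):
        idx_c = np.where(labels == c)[0]
        rng.shuffle(idx_c)
        # proportion of class c assigned to each client
        props = rng.dirichlet(alpha * np.ones(num_clients))
        splits = (np.cumsum(props) * len(idx_c)).astype(int)[:-1]
        for cid, part in enumerate(np.split(idx_c, splits)):
            client_indices[cid].extend(part.tolist())
    return {cid: {int(c): int((labels[idx] == c).sum())
                  for c in np.unique(labels[idx])}
            for cid, idx in enumerate(client_indices)}
```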
There may be something wrong with the SCAFFOLD algorithm. The training loss is 'None'.
You can try a smaller learning rate, such as 0.001 or 0.0001.
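As an illustration (not the repository's trainer code), one way to set a reduced learning rate and skip updates when the loss becomes non-finite, which is usually what a 'None'/NaN training loss indicates:

```python
import math
import torch

def safe_train_step(model, optimizer, loss_fn, x, y, lr: float = 0.001):
    """One training step that skips the update if the loss is not finite."""
    for group in optimizer.param_groups:
        group["lr"] = lr          # e.g. try 0.001, then 0.0001
    optimizer.zero_grad()
    loss = loss_fn(model(x), y)
    if not math.isfinite(loss.item()):
        return None               # skip this update and lower lr further
    loss.backward()
    optimizer.step()
    return loss.item()
```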
I will check SCAFFOLD.
Thanks.
Thanks for the awesome work. I see in the code that you actually share two views of the same data (i.e., share_data1 and share_data2). My questions are: 1) To my understanding, sharing multiple views of the same data can potentially improve model performance by providing complementary information; will the performance drop if only one view is shared? 2) Combining the two views can give a more comprehensive picture of the underlying data distribution; will this lead to more privacy leakage?