Koukyosyumei / RPDPKDFL

Code for Reconstruct Private Data via Public Knowledge in Distillation-based Federated Learning



Koukyosyumei commented 2 years ago

Thanks for your valuable comments.

We thank the AC and all reviewers for their valuable time and constructive feedback. We briefly summarize the suggestions and our answers as follows; detailed responses can be found in the individual threads. We have also revised the paper and highlighted the main changes in cyan.

1. Common concerns

Importance of FedKD (Yydy's Q2 and SprQ's Q24 & Q25)

Many prior works claim that Federated Learning with Knowledge Distillation (FedKD) is more practical than traditional FL, and application-oriented research on it has been expanding. Thus, it is crucial that this work brings the vulnerability of FedKD to light.

Unclear problem settings (Yydy's Q5 and jNiz's Q14 & Q16)

The revised version improves the description of the problem setting and the example figures.

Possible defense (Yydy's Q3 and jNiz's Q18)

As jNiz suggests, we have added possible defense methods as future work in the Conclusion of the revised version.

2. Individual comments

  1. Details about attack procedure (Q1, 11): PTBI reconstructs the private data by forwarding the optimal solution of Eq.7 to the inversion model, which takes the predictions of the server and local models and returns the original input data.
  2. Connection between inversion attacks (Q4): We introduce a mechanism $M(\mathcal{A})$, which takes some information $\mathcal{A}$ and returns the private training dataset. $\mathcal{A}$ can be the gradient, parameters, or output logits of the trained model.
  3. Novelty of our formulation (Q8, 9): Eq.5 is a novel metric specifically designed for the model inversion attack against FedKD, whereas Eq.11 is mainly based on prior works.
  4. Difference with TBI (Q15): Our PTBI has Distinctiveness, which TBI lacks. Since the victim model in FedKD is trained on two domains, the attacker must determine whether they are reconstructing private or public data. Only PTBI solves this problem.
  5. Comparison with FedKD that does not use a public dataset (Q19, 26): Although some schemes do not use a public dataset for knowledge distillation, these methods communicate gradients, which are not accessible in our setting. Model inversion attacks using gradients in FL have already been studied extensively and are outside our scope.
  6. Impact of domain gap (Q23): We have conducted additional experiments and found that a lower $\alpha$ becomes more effective when the domain gap is smaller. We also noticed that an excessively large or small domain gap damages the attack success rate.
  7. Experiment settings (Q12, 13, 20, 21, 22): The scale of our experimental setting is based on related works, and we have clarified the loss function for each scheme in the revised version.
  8. Missing citations and references (Q6, 7, 10, 17): We have fixed all the missing citations and references in the revised version, as the reviewers suggested.

3. Typos and small updates

In addition, we have fixed some typos, missing citations, and misleading expressions.

  1. Eq. 7 contained a typo, and we have replaced $e^{J}$ with $e$; the proof in the Appendix and the attached code already use the correct notation. Thus, this mistake is only a typo and does not affect the experimental results.
  2. We have replaced $\mathcal{R}$ with $\mathbb{R}$.
  3. We have adjusted the range of the y-axis of Fig.6.
  4. We have moved the caption of Tab.2 from the bottom to the top.
  5. The revised version replaces 'local dataset' with 'private dataset'.
  6. We use $D_{pub}$ instead of $D_{public}$ to denote the public dataset for simplicity.
  7. We used SIM as a general notation for image similarity indices and chose SSIM as a specific example in section 3.2.3, but the revised version uses only SSIM for simplicity.
  8. We have cited [1], another parameter-based inversion attack.

References

[1] Briland Hitaj, Giuseppe Ateniese, and Fernando Perez-Cruz. Deep models under the GAN: information leakage from collaborative deep learning. In Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security, pages 603–618, 2017.
[2] Daliang Li and Junpu Wang. FedMD: Heterogenous federated learning via model distillation. arXiv preprint arXiv:1910.03581, 2019.
[3] Sijie Cheng, Jingwen Wu, Yanghua Xiao, and Yang Liu. FedGEMS: Federated learning of larger server models via selective knowledge fusion. arXiv preprint arXiv:2110.11027, 2021.
[4] Hongsheng Hu, Zoran Salcic, Lichao Sun, Gillian Dobbie, Philip S. Yu, and Xuyun Zhang. Membership inference attacks on machine learning: A survey. ACM Comput. Surv., Jan 2022. Just Accepted.
[5] Nader Bouacida and Prasant Mohapatra. Vulnerabilities in federated learning. IEEE Access, 9:63229–63249, 2021.
[6] Guodong Long, Tao Shen, Yue Tan, Leah Gerrard, Allison Clarke, and Jing Jiang. Federated learning for privacy-preserving open innovation future on digital health. In Humanity Driven AI, pages 113–133. Springer, 2022.
[7] Sohei Itahara, Takayuki Nishio, Yusuke Koda, Masahiro Morikura, and Koji Yamamoto. Distillation-based semi-supervised federated learning for communication-efficient collaborative training with non-IID private data. IEEE Transactions on Mobile Computing, pages 1–1, 2021.
[8] Lingjuan Lyu, Han Yu, Xingjun Ma, Lichao Sun, Jun Zhao, Qiang Yang, and Philip S. Yu. Privacy and robustness in federated learning: Attacks and defenses. arXiv preprint arXiv:2012.06337, 2020.
[9] Jonas Geiping, Hartmut Bauermeister, Hannah Dröge, and Michael Moeller. Inverting gradients - how easy is it to break privacy in federated learning? Advances in Neural Information Processing Systems, 33:16937–16947, 2020.
[10] Bo Zhao, Konda Reddy Mopuri, and Hakan Bilen. iDLG: Improved deep leakage from gradients. arXiv preprint arXiv:2001.02610, 2020.
[11] Ligeng Zhu, Zhijian Liu, and Song Han. Deep leakage from gradients. Advances in Neural Information Processing Systems, 32, 2019.
[12] Hongxu Yin, Arun Mallya, Arash Vahdat, Jose M. Alvarez, Jan Kautz, and Pavlo Molchanov. See through gradients: Image batch recovery via GradInversion. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 16337–16346, 2021.
[13] Zhibo Wang, Mengkai Song, Zhifei Zhang, Yang Song, Qian Wang, and Hairong Qi. Beyond inferring class representatives: User-level privacy leakage from federated learning. In IEEE INFOCOM 2019 - IEEE Conference on Computer Communications, pages 2512–2520. IEEE, 2019.
[14] Matt Fredrikson, Somesh Jha, and Thomas Ristenpart. Model inversion attacks that exploit confidence information and basic countermeasures. In Proceedings of the 22nd ACM SIGSAC Conference on Computer and Communications Security, pages 1322–1333, 2015.
[15] Ziqi Yang, Jiyi Zhang, Ee-Chien Chang, and Zhenkai Liang. Neural network inversion in adversarial setting via background knowledge alignment. In Proceedings of the 2019 ACM SIGSAC Conference on Computer and Communications Security, CCS '19, pages 225–240, New York, NY, USA, 2019. Association for Computing Machinery.
[16] Guodong Long, Yue Tan, Jing Jiang, and Chengqi Zhang. Federated learning for open banking. In Federated Learning, pages 240–254. Springer, 2020.
[17] Shuai Zhao, Roshani Bharati, Cristian Borcea, and Yi Chen. Privacy-aware federated learning for page recommendation. In 2020 IEEE International Conference on Big Data (Big Data), pages 1071–1080. IEEE, 2020.
[18] Dianbo Sui, Yubo Chen, Jun Zhao, Yantao Jia, Yuantao Xie, and Weijian Sun. FedED: Federated learning via ensemble distillation for medical relation extraction. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 2118–2128, 2020.
[19] Meirui Jiang, Hongzheng Yang, Chen Cheng, and Qi Dou. IOP-FL: Inside-outside personalization for federated medical image segmentation. arXiv preprint arXiv:2204.08467, 2022.
[20] Yiqing Zhou, Jian Wang, and Zeru Wang. Bearing faulty prediction method based on federated transfer learning and knowledge distillation. Machines, 10(5), 2022.
[21] Plamena Tsankova and Galina Momcheva. Sentiment detection with FedMD: Federated learning via model distillation. Information Systems and Grid Technologies, 2020.

Koukyosyumei commented 2 years ago

Reviewer Yydy

We thank you for your careful review. We have clarified the settings and explanations, highlighted the importance of FedKD, and added missing citations.

Q1: More details about the inversion model G (related to jNiz’s Q11)

Your understanding is almost correct. More specifically, given the predictions of the global and local models, the inversion model G returns the original data that was input to those models. Since we can estimate the output logits of the global and local models on the private data (Eq.7), we can reconstruct the private data by forwarding the optimal logits to the inversion model (Eq.9 and Fig.1). The revised version includes these explanations in the first paragraph of 3.2 and in 3.2.3.
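To make this pipeline concrete, the sketch below shows one way such an inversion model could be implemented. It is a minimal illustration in PyTorch; the fully connected decoder, image size, and MSE training objective are our own simplifying assumptions, not the exact architecture used in the paper.

```python
# Minimal, illustrative sketch of a TBI-style inversion model G that maps the
# concatenated server/local prediction vectors back to an image.
import torch
import torch.nn as nn
import torch.nn.functional as F


class InversionModel(nn.Module):
    def __init__(self, num_classes: int, img_channels: int = 1, img_size: int = 64):
        super().__init__()
        self.img_channels = img_channels
        self.img_size = img_size
        self.decoder = nn.Sequential(
            nn.Linear(2 * num_classes, 512),
            nn.ReLU(),
            nn.Linear(512, img_channels * img_size * img_size),
            nn.Sigmoid(),  # pixel values in [0, 1]
        )

    def forward(self, server_logits: torch.Tensor, local_logits: torch.Tensor) -> torch.Tensor:
        z = torch.cat([server_logits, local_logits], dim=1)
        x = self.decoder(z)
        return x.view(-1, self.img_channels, self.img_size, self.img_size)


def train_step(G, server_model, local_logits_fn, x_pub, optimizer):
    """One training step on public data: teach G to invert the two prediction vectors."""
    optimizer.zero_grad()
    with torch.no_grad():
        s_logits = server_model(x_pub)      # server-side predictions
        l_logits = local_logits_fn(x_pub)   # logits the clients share during FedKD
    x_rec = G(s_logits, l_logits)
    loss = F.mse_loss(x_rec, x_pub)
    loss.backward()
    optimizer.step()
    return loss.item()
```

At attack time, the estimated optimal logits from Eq.7 would be fed to G in place of the public-data predictions to obtain the reconstruction of the private data.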

Q2: Importance of federated learning with knowledge distillation (related to SprQ's Q24 and Q25)

Many related works state that Federated Learning with Knowledge Distillation (FedKD) is essential for its practical utility and robustness; hence, our contribution of breaching FedKD is significant. [2, 3, 4, 5, 6] point out that FedKD is necessary in realistic situations such as health care, finance, and AI as a service, where sharing the model architecture is impossible due to the different computational power of each client. In addition, [3, 5, 7] show that communicating only the output logits requires less communication. [4, 5, 8] also suggest that FedKD is more secure than traditional FL, as already discussed in our paper. We have added a brief discussion and citations on these aspects in 2.1.

Q3: Defense methods (related to jNiz’s Q18)

As jNiz suggests, the revised version discusses possible defense methods as future work in the Conclusion.

Q4: Connection between model inversion attack

We can view the model inversion attack as a mechanism $M(\mathcal{A})$, which takes some information $\mathcal{A}$ about the target model and returns the private training dataset. In this sense, $\mathcal{A}$ can be the model's output logits, gradients, or parameters. We have added this explanation in 2.2 of the revised version.
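As a purely illustrative aside (the type and function names below are hypothetical, not from the paper), this unified view can be written as a single callable interface whose input is whatever leaked information the attacker observes:

```python
# Illustrative only: model inversion as a mechanism M(A) mapping leaked
# information A (gradients, parameters, or output logits) to reconstructed data.
from typing import Callable, Sequence
import torch

LeakedInfo = Sequence[torch.Tensor]                        # the observed information A
InversionMechanism = Callable[[LeakedInfo], torch.Tensor]  # M(A) -> reconstructed data


def logit_based_mechanism(inversion_model) -> InversionMechanism:
    """Build M(A) for the logits-only setting, i.e., the FedKD threat model."""
    def mechanism(leaked: LeakedInfo) -> torch.Tensor:
        server_logits, local_logits = leaked
        return inversion_model(server_logits, local_logits)
    return mechanism
```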

Q5: Unclear problem setting (related to jNiz’s Q14)

We state most of the notation for FedKD in 3.1. For example, we already define $\{j_u\}$, the set of labels missing from the public dataset, in the second paragraph of 3.1 and in Algo. 1. In addition, since none of our formulations use the distribution of the public or private datasets, we do not need to specify them. We have also improved the setting assumptions in the revised version.

Q6: Missing citations in the experiment

Although we cite DS-FL, FedMD, and FedGEMS in 2.1, the revised version cites them again in the experiment section.

Q7: Missing citation of ’FedKD’

We use 'FedKD' as an abbreviation for Federated Learning with Knowledge Distillation and explicitly declare it in the revised version.

Koukyosyumei commented 2 years ago

Reviewer jNiz

We thank you for your helpful comments. We have addressed your concerns and clarified the unclear sentences and settings as you suggested.

Q8: Novelty of Eq.5

The formulation of Eq. 5 is the core idea of our method, and it is our novel contribution. We formulate Eq. 5 to represent the confidence gap on the private dataset between the local and server-side models, which is the characteristic situation of FedKD.

Q9: Novelty of Eq.11

We have slightly updated the explanation of Eq. 11 in the revised version to make it more apparent that Eq. 11 is mainly based on prior works. SSIM is a standard similarity metric for images, and TV is a popular measure of image noise. Thus, we consider the formulation of Eq.11 natural but not particularly novel.
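For illustration only, the sketch below combines SSIM and total variation as terms of an image-quality score; the exact weighting and normalization in Eq. 11 may differ from this toy version, and `tv_weight` is an assumed hyper-parameter.

```python
# Toy combination of SSIM and total variation (TV) as image-quality terms; the
# exact form of Eq. 11 may differ. Assumes grayscale images with values in [0, 1].
import numpy as np
from skimage.metrics import structural_similarity as ssim


def total_variation(img: np.ndarray) -> float:
    """Anisotropic TV: sum of absolute differences between neighboring pixels."""
    dh = np.abs(np.diff(img, axis=0)).sum()
    dw = np.abs(np.diff(img, axis=1)).sum()
    return float(dh + dw)


def reconstruction_score(reconstructed: np.ndarray, reference: np.ndarray,
                         tv_weight: float = 0.01) -> float:
    """Higher is better: close to the reference (SSIM) and not too noisy (TV)."""
    similarity = ssim(reconstructed, reference, data_range=1.0)
    return similarity - tv_weight * total_variation(reconstructed)
```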

Q10: Explicit reference of pseudo-codes in the appendix

The revised version explicitly refers to the pseudo-codes.

Q11: Goal of 3.2.2 (related to Yydy’s Q1)

We have modified the first paragraph of 3.2 and clarified that the analytical solution obtained in 3.2.2 is used to create the best estimation of the private data with the inversion model.

Q12: Training task

As stated in the first paragraph of section 4, our task is image classification, where there is some gap between the public and private datasets.

Q13: Type of loss function

The classification loss is cross entropy for all schemes, and the distillation loss is L1 loss for FedMD, KL divergence for FedGEMS, and cross entropy for DS-FL, following their papers or code. We have added the same explanation in the revised version.
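To make the above concrete, here is a short PyTorch sketch of these loss choices; the reductions, soft-label handling, and absence of a temperature are our own simplifications, not the exact hyper-parameters of each scheme.

```python
# Sketch of the loss functions described above (simplified: no temperature,
# default reductions). Not the exact implementations of FedMD/FedGEMS/DS-FL.
import torch
import torch.nn.functional as F


def classification_loss(logits, labels):
    # Cross entropy on hard labels, used by all schemes for local training.
    return F.cross_entropy(logits, labels)


def fedmd_distill(student_logits, consensus_logits):
    # FedMD-style distillation: L1 loss against the aggregated consensus logits.
    return F.l1_loss(student_logits, consensus_logits)


def fedgems_distill(student_logits, teacher_logits):
    # FedGEMS-style distillation: KL divergence between soft predictions.
    return F.kl_div(F.log_softmax(student_logits, dim=1),
                    F.softmax(teacher_logits, dim=1),
                    reduction="batchmean")


def dsfl_distill(student_logits, soft_targets):
    # DS-FL-style distillation: cross entropy against averaged soft labels.
    return -(soft_targets * F.log_softmax(student_logits, dim=1)).sum(dim=1).mean()
```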

Q14: The setting of the public dataset (related to Yydy’s Q5)

As stated in the second paragraph of 3.1, we assume that the public dataset spans two domains (young/adult or masked/unmasked), and each private dataset consists of one domain (adult or unmasked). Although the public dataset has all the data of the insensitive domain (young/masked), it is missing some data from the sensitive domain (adult/unmasked). The private dataset, on the other hand, contains the data missing from the public dataset. We have improved this problem setting in the revised version.

An example of our setting is as follows. Suppose there are five identities: the public dataset has all the young images of the five identities but only the adult images of three of them. The private dataset consists of the adult images of the remaining two identities. The goal of the malicious server is then to recover the adult images of these two identities via the output logits on the public dataset (see Fig.2).

Your idea, 'exploit in any way the young pictures of the labels for when the adult counterpart is not in the public dataset,' seems to mean taking advantage of the specific information in the counterpart public images with the target label, and this is exactly the purpose of the second and third terms of Eq.5. These terms represent the confidence of the server-side model, which is trained on the public dataset, when predicting the private data. In this sense, we can interpret these terms as the similarity between the private and public data from the viewpoint of the server-side model. In other words, $\alpha$ controls the distance between the reconstructed data with the target label and the corresponding public data. For instance, if we choose a smaller $\alpha$, the reconstructed image becomes closer to the public image.
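For clarity, the following toy snippet spells out the five-identity example above; the identity indices and function names are hypothetical and only illustrate how the public/private split is constructed.

```python
# Toy construction of the split from the example above (hypothetical names):
# the public dataset has all young images plus the adult images of 3 identities,
# and the private dataset holds the adult images of the remaining 2 identities.
identities = [0, 1, 2, 3, 4]
public_adult_ids = {0, 1, 2}   # identities whose adult images are public
sensitive_ids = {3, 4}         # identities whose adult images stay private


def build_split(samples):
    """samples: iterable of (identity, domain, image) with domain in {'young', 'adult'}."""
    public, private = [], []
    for identity, domain, image in samples:
        if domain == "young" or identity in public_adult_ids:
            public.append((identity, domain, image))
        elif identity in sensitive_ids:
            private.append((identity, domain, image))
    return public, private

# The malicious server only observes logits computed on `public`; the attack
# aims to reconstruct the adult images contained in `private`.
```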

Q15: Difference with TBI

As shown in Tab. 1, the main difference is Distinctiveness: TBI does not consider whether it is reconstructing private or public data, while the target model in FedKD is trained on both the private and public datasets. Thus, our contribution of developing PTBI on top of TBI is precisely tied to FedKD, since PTBI is designed for the specific situation in which the target model is trained on multiple domains. Our experimental results also confirm that, due to this lack of Distinctiveness, TBI often reconstructs the public data rather than the private data.

Q16: Justification of real-world FedKD setting

  1. (the server knows the list of the labels) The server can obtain the list of the target labels by classifying the domain of each public data (young/adult or masked/unmasked in our settings). This classification is possible in many cases by hand-labeling or statistical approaches like supervised/unsupervised learning. We added this description in the revised version.
  2. (no label overlap between centers) Given the sensitivity of the private data, it is reasonable to assume that each client exclusively holds the private data for some labels.

Q17: What is $t$ in Figure 4?

$t$ represents the epoch, and we already declared it in 3.1. We also explicitly explain it in 4.2 in the revised version.

Q18: Possible defenses (related to Yydy’s Q3)

We have added differential privacy-based and homomorphic encryption-based approaches as potential defenses for future work in the Conclusion of the revised version.

Koukyosyumei commented 2 years ago

Reviewer SprQ

We thank you for your exciting suggestions. We have clarified the background of our settings and conducted additional experiments as below.

Q19: The comparison with distillation methods that do not require public datasets (related to SprQ’s Q25)

These schemes communicate gradients, so they do not meet our assumption that gradients are not accessible to the central server (see 3.1). Moreover, our Prop.1 shows that the gradient is more informative than the output logits, so it is more reasonable for the attacker to use the gradient rather than the logits when both are available. Model inversion attacks with gradients have already been researched extensively, as discussed in 2.2.2, and are outside our scope.

Q20: The number of clients

We chose the number of clients based on other works. Many prior works on inversion attacks against FL use only one client [1, 9, 10, 11, 12], and [13] uses a client size of 10. In addition, FedMD [2], a representative FedKD method, adopts a client size of 10. Thus, our setting is not insufficient compared to related works, but we would like to conduct experiments with more clients if time permits before the camera-ready version.

Q21: The number of datasets

We decided the number of datasets based on the settings of related works. Excluding toy datasets such as MNIST, [1, 12, 14] adopt one dataset, [9, 10] use two datasets, and [11, 15] utilize three datasets. Given these prior works, we think our setting of two realistic datasets is sufficient to show the performance of our attack.

Q22: The number of communication rounds

Due to our limited computational power, we use relatively few communication rounds compared to the original methods [2, 3]. However, our goal is to reconstruct the private data, not to obtain highly accurate models. Fig.S-2 in the Appendix shows that the training of the inversion model almost converges within our communication rounds. We would like to add more communication rounds before the camera-ready version, but we expect the results will not change much.

Q23: Impact of the domain gap

This is an exciting idea, and we have conducted a quick experiment with a synthesized dataset. To control the domain gap, we adopt the AT&T dataset, which is used in many model inversion attacks [1, 14], and blur some images to serve as the non-sensitive domain, so that the domain gap can be quantified by the strength of blurring. The result shows that a smaller domain gap makes the best $\alpha$ lower, which is consistent with the intuition that a smaller gap leads to higher confidence in the server's predictions. You can find the details in 4.2 and Appendix D.4 of the revised version. Although the scale of this experiment is limited due to the time constraints, we would like to expand on this aspect before the camera-ready version. Another finding is that excessively large or small domain gaps make it hard to decouple the two domains.
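For reference, the blurring procedure can be sketched roughly as follows; the blur strengths, PIL-based implementation, and placeholder image list are assumptions, not the exact experimental setup.

```python
# Rough sketch of synthesizing a non-sensitive domain by blurring face images;
# a larger sigma produces a larger domain gap. Blur strengths are assumptions.
from PIL import Image, ImageFilter


def make_non_sensitive(img: Image.Image, sigma: float) -> Image.Image:
    """Blur an image to create its non-sensitive counterpart."""
    return img.filter(ImageFilter.GaussianBlur(radius=sigma))


source_images: list = []  # placeholder: load the face images here
for sigma in [0.5, 1.0, 2.0, 4.0]:
    blurred_domain = [make_non_sensitive(img, sigma) for img in source_images]
    # ... run the attack with this synthetic non-sensitive domain and record
    # which alpha performs best for this domain gap ...
```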

Q24: Feasibility of Federated Learning with knowledge distillation (related to Yydy’s Q2)

As you said, the need for a public dataset might limit the applicability of FedKD. However, there are still many accessible public datasets, and application-oriented research is expanding rapidly [16, 17, 18, 19, 20, 21]. As discussed in Yydy's Q2, many existing works also point out that FedKD is more suitable than traditional FL for real business problems in terms of model heterogeneity, communication efficiency, and robustness.

Q25: Poor performance of logits-based distillation (related to Yydy’s Q2)

[2, 3, 7] show that the accuracy of logits-based distillation is greater than that of traditional FL such as FedAvg. In addition, as stated in Yydy's Q2, FedKD requires less communication and offers better security and model flexibility than traditional FL. Thus, we cannot say that the performance of logits-based distillation methods is poorer than that of standard FL.

Q26: DeepInversion (related to SprQ’s Q19)

As discussed in 2.2, most prior works except TBI do not apply to FedKD due to the lack of access to gradients, parameters, or a white-box API. For example, DeepInversion needs the gradients and white-box access to the teacher model, which corresponds to the local model in our setting, but neither is accessible in FedKD. In addition, we have already presented the attack performance of gradient-based leakage as our baseline and shown that sharing gradients is more vulnerable.