xinleihe / GNNStealing


Hyperparameters #3

Open xujing1994 opened 9 months ago

xujing1994 commented 9 months ago

Hi, it's an interesting work. Thanks for sharing the code.

I am following the code to train the target model but I find the testing accuracy of each model and dataset is always lower than the performance in Table 4.

Could you please share the hyperparameters of training the target models? Thanks in advance!

hhdjl commented 9 months ago

Sorry to bother you, but I'd like to ask why the following two errors occur. I would appreciate it if you could answer.

No. 1:

```
File "C:\Users\User\anaconda3\envs\gnn_model_stealing\lib\site-packages\torch\utils\data\dataloader.py", line 1004, in _try_get_data
    raise RuntimeError('DataLoader worker (pid(s) {}) exited unexpectedly'.format(pids_str)) from e
RuntimeError: DataLoader worker (pid(s) 26324, 3628, 32496, 24296, 32456, 23288, 31972, 22952) exited unexpectedly
```

No. 2:

```
File "C:\Users\User\anaconda3\envs\gnn_model_stealing\lib\multiprocessing\spawn.py", line 135, in _check_not_importing_main
    raise RuntimeError('''
RuntimeError: An attempt has been made to start a new process before the current process has finished its bootstrapping phase.

    This probably means that you are not using fork to start your
    child processes and you have forgotten to use the proper idiom
    in the main module:

        if __name__ == '__main__':
            freeze_support()
            ...

    The "freeze_support()" line can be omitted if the program
    is not going to be frozen to produce an executable.
```
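Both errors point to the Windows multiprocessing behaviour described in the traceback itself: child processes are started with spawn rather than fork, so the training script has to be safe to re-import. A minimal sketch of the usual workaround (not the repository's code; the dataset and loop below are placeholders) is to wrap the entry point in a main guard and, if the workers still crash, set `num_workers=0`:

```python
# Hypothetical sketch, not the repository's code: the dataset, loader, and
# loop below are placeholders to illustrate the idiom from the traceback.
import torch
from torch.utils.data import DataLoader, TensorDataset


def main():
    dataset = TensorDataset(torch.randn(128, 16), torch.randint(0, 4, (128,)))
    # num_workers=0 keeps data loading in the main process; this usually
    # avoids both errors on Windows, at the cost of slower loading.
    loader = DataLoader(dataset, batch_size=32, shuffle=True, num_workers=0)
    for features, labels in loader:
        pass  # the actual training step would go here


if __name__ == "__main__":
    # The guard ensures spawned worker processes can re-import this module
    # without re-running the training entry point.
    main()
```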
xinleihe commented 9 months ago

> Hi, it's an interesting work. Thanks for sharing the code.
>
> I am following the code to train the target model but I find the testing accuracy of each model and dataset is always lower than the performance in Table 4.
>
> Could you please share the hyperparameters of training the target models? Thanks in advance!

Hi, the test accuracy is calculated using the attack testing dataset, i.e., when you run attack.py, it should summarise the target test accuracy, attack accuracy, and fidelity.

xujing1994 commented 9 months ago

> Hi, it's an interesting work. Thanks for sharing the code. I am following the code to train the target model but I find the testing accuracy of each model and dataset is always lower than the performance in Table 4. Could you please share the hyperparameters of training the target models? Thanks in advance!
>
> Hi, the test accuracy is calculated using the attack testing dataset, i.e., when you run attack.py, it should summarise the target test accuracy, attack accuracy, and fidelity.

Hi, thanks for your reply.

However, if I understand correctly, the target test accuracy is calculated with the trained target model, which is saved after running train_target_model.py. The point is that after I trained and saved the target model by running train_target_model.py, the target test accuracy was always lower than that in Table 4.

Thus, I am wondering if you could kindly share the hyperparameters of training the target models? Or if it's more convenient for you, could you share with me one pretrained target model? Thanks a lot.

xinleihe commented 9 months ago

> Hi, it's an interesting work. Thanks for sharing the code. I am following the code to train the target model but I find the testing accuracy of each model and dataset is always lower than the performance in Table 4. Could you please share the hyperparameters of training the target models? Thanks in advance!
>
> Hi, the test accuracy is calculated using the attack testing dataset, i.e., when you run attack.py, it should summarise the target test accuracy, attack accuracy, and fidelity.
>
> Hi, thanks for your reply.
>
> However, if I understand correctly, the target test accuracy is calculated with the trained target model, which is saved after running train_target_model.py. The point is that after I trained and saved the target model by running train_target_model.py, the target test accuracy was always lower than that in Table 4.
>
> Thus, I am wondering if you could kindly share the hyperparameters of training the target models? Or if it's more convenient for you, could you share with me one pretrained target model? Thanks a lot.

Yes, the target model is trained and saved using train_target_model.py (and the hyperparameters are the same as in this file).

After the target model is trained, we perform the attack and use the same set of data (the attack testing dataset) to test the performance of the target model and the surrogate model. The performance in Table 4 is calculated in this way.
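For intuition, here is a minimal sketch (not the repository's code; the tensors below are random placeholders) of how the three numbers reported on the shared attack testing set relate to each other:

```python
import torch


def summarise(target_logits, surrogate_logits, labels):
    """Sketch: target test accuracy, attack (surrogate) accuracy, and
    fidelity, all measured on the same attack testing nodes."""
    target_pred = target_logits.argmax(dim=1)
    surrogate_pred = surrogate_logits.argmax(dim=1)
    target_acc = (target_pred == labels).float().mean().item()
    attack_acc = (surrogate_pred == labels).float().mean().item()
    # Fidelity: how often the surrogate agrees with the target,
    # independent of the ground-truth labels.
    fidelity = (surrogate_pred == target_pred).float().mean().item()
    return target_acc, attack_acc, fidelity


# Toy example with random logits for 100 nodes and 5 classes.
t = torch.randn(100, 5)
s = torch.randn(100, 5)
y = torch.randint(0, 5, (100,))
print(summarise(t, s, y))
```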

xujing1994 commented 9 months ago

> Hi, it's an interesting work. Thanks for sharing the code. I am following the code to train the target model but I find the testing accuracy of each model and dataset is always lower than the performance in Table 4. Could you please share the hyperparameters of training the target models? Thanks in advance!
>
> Hi, the test accuracy is calculated using the attack testing dataset, i.e., when you run attack.py, it should summarise the target test accuracy, attack accuracy, and fidelity.
>
> Hi, thanks for your reply. However, if I understand correctly, the target test accuracy is calculated with the trained target model, which is saved after running train_target_model.py. The point is that after I trained and saved the target model by running train_target_model.py, the target test accuracy was always lower than that in Table 4. Thus, I am wondering if you could kindly share the hyperparameters of training the target models? Or if it's more convenient for you, could you share with me one pretrained target model? Thanks a lot.
>
> Yes, the target model is trained and saved using train_target_model.py (and the hyperparameters are the same as in this file).
>
> After the target model is trained, we perform the attack and use the same set of data (the attack testing dataset) to test the performance of the target model and the surrogate model. The performance in Table 4 is calculated in this way.

Thanks for your explanation.

I have used the hyperparameters in the train_target_model.py file, but the testing accuracy is still lower than expected. Is there anything else I could do to reproduce the results in Table 4? Thanks.

xinleihe commented 9 months ago

Hi, could you specify the datasets and model architectures?

We ran a few quick experiments and found that the performance was similar to Table 4.

E.g., for GAT trained on citeseer_full, when we perform the attack with

```
python3 attack.py --dataset citeseer_full --target-model-dim 256 --num-hidden 256 --target-model gat --surrogate-model sage --recovery-from prediction --query_ratio 1.0 --structure original
```

the target test accuracy, attack accuracy, and attack agreement are 0.916312, 0.859102, and 0.901182, respectively. (In Table 4, the target test accuracy is 0.910.)

Also, the performance differences may be caused by randomness, so different runs may produce slightly different results.
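If run-to-run variation is a concern, one common mitigation is to fix the random seeds before training. A minimal sketch, assuming PyTorch and NumPy are the main sources of randomness (the helper name is ours, not the repository's):

```python
import random

import numpy as np
import torch


def set_seed(seed: int = 0) -> None:
    """Fix the seeds that typically drive run-to-run variation."""
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)
    # If DGL handles the graph sampling, recent DGL releases also expose
    # dgl.seed(seed), which could be set here as well.


set_seed(0)
```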

xujing1994 commented 8 months ago

Hi, I tried to reproduce the experiments on all datasets and models, but the testing accuracy of the target models is generally lower than the values in Table 4.

While reproducing the results, I noticed two things about the training:

  1. In the train_target_model.py file, the partition of the training, validation, and testing datasets is [0.6, 0.2, 0.2], which differs from the description in the paper, i.e., [0.2, 0.3, 0.5].
  2. In the attack.py file, the graph is again split randomly into training, validation, and testing datasets. Thus, it seems there is an overlap between the training and testing datasets for the target model. What's more, there is also an overlap in the training data between the target and surrogate models (illustrated by the sketch at the end of this comment).

Could you please clarify which parameters I should use to reproduce the results, and comment on the overlapping splits?
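To illustrate point 2 above, here is a hypothetical sketch (not the repository's code) showing that two independent random splits of the same node set will almost always share nodes, so the target's training nodes can end up in the attack's testing split:

```python
import numpy as np

rng = np.random.default_rng(0)
num_nodes = 1000

# Split made when training the target model (e.g., 60% training nodes).
perm1 = rng.permutation(num_nodes)
target_train = set(perm1[: int(0.6 * num_nodes)])

# Independent split made later for the attack (e.g., last 20% as test nodes).
perm2 = rng.permutation(num_nodes)
attack_test = set(perm2[int(0.8 * num_nodes):])

print(f"target-train / attack-test overlap: {len(target_train & attack_test)} nodes")
```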

maddogwithrabies commented 8 months ago

@xujing1994

I take full responsibility for the mismatch between the hyperparameters used in the code and those specified in the paper.

  1. Kindly adjust the hyperparameters to [0.3, 0.2, 0.5], and the outcomes should closely resemble those presented in Table 4.
  2. Please note that the current code does not automatically select the best model (see the checkpointing sketch below).
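As a rough illustration of point 2, here is a minimal sketch of validation-based model selection; the linear model and random data are toy stand-ins, not the repository's GNN or datasets:

```python
import copy

import torch
import torch.nn as nn

# Toy stand-ins so the sketch runs; the real code would use the GNN,
# dataset splits, and evaluation logic from train_target_model.py.
model = nn.Linear(16, 4)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)
x, y = torch.randn(256, 16), torch.randint(0, 4, (256,))
x_val, y_val = torch.randn(64, 16), torch.randint(0, 4, (64,))

best_val_acc = 0.0
best_state = copy.deepcopy(model.state_dict())

for epoch in range(50):
    optimizer.zero_grad()
    loss = nn.functional.cross_entropy(model(x), y)
    loss.backward()
    optimizer.step()

    with torch.no_grad():
        val_acc = (model(x_val).argmax(dim=1) == y_val).float().mean().item()
    if val_acc > best_val_acc:
        # Keep a copy of the weights that performed best on validation.
        best_val_acc = val_acc
        best_state = copy.deepcopy(model.state_dict())

# Restore and save the best checkpoint instead of the last epoch's weights.
model.load_state_dict(best_state)
torch.save(best_state, "target_model_best.pt")
```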

For your convenience, I have provided two sample files (for illustration purposes only, use with caution) to assist you.

Additionally, it is worth noting that variations in the results (Table 4) may arise due to the random selection of training data. You can observe such discrepancies by executing the following bash command.

```bash
for i in {1..5}; do
  python train_target_model.py --dataset citeseer_full --target-model gat --num-hidden 256 --num-epochs=200 --eval-every=5
  for j in {1..5}; do
    python attack.py --dataset citeseer_full --target-model-dim 256 --num-hidden 256 --target-model gat --surrogate-model sage --recovery-from prediction --query_ratio 1.0 --structure original
    cat results_acc_fidelity/results_gat_256_sage_256/citeseer_full_original.txt
  done
done
```

gnn.zip

xinleihe commented 8 months ago

@xujing1994 Sorry for the late reply. Regarding the parameters, please see the comment above. Regarding the overlap, we assume the attacker can sample data from the same dataset, e.g., a social network; in that case, the sampled dataset may contain nodes that were used to train the target model. We will also clarify the parameters and the overlap in our paper.