Open hbrachemi opened 1 year ago
Hi,
Thank you for your interest in the project. To get the hyperparameter configurations, please consider the following options:
You can refer to the arXiv paper, specifically the section titled "Models and Training Phase." The hyperparameter configurations for training are detailed there.
Another approach is to inspect the slurm files in sponge_poisoning_energy_latency_attack/slurm/{dataset}/run{model}.slurm. These files contain information such as the maximum number of epochs, batch size, scheduler details, and hyperparameters for both the clean and sponge models (if applicable).
Alternatively, you can find the hyperparameter configurations for the optimizer and scheduler in the sponge_poisoning_energy_latency_attack/blob/master/forest/hyperparameters.py file. Look for the section labeled "SPONGE_EXPONENTIAL" to access the relevant configurations. Note that in the slurm files, each command includes the option "--optimization='sponge_exponential'".
To ensure a fair comparison and avoid potential discrepancies, we have trained both the clean and sponge models using the same hyperparameters in the training algorithm.
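As a rough illustration of how an `--optimization='sponge_exponential'` flag could map to a configuration entry, here is a minimal sketch. All names and values below are placeholders for illustration only, not the actual settings from forest/hyperparameters.py:

```python
# Sketch of a dict-based hyperparameter registry. All values are
# PLACEHOLDERS, not the real settings from forest/hyperparameters.py.
SPONGE_EXPONENTIAL = {
    "optimizer": "SGD",       # placeholder optimizer choice
    "lr": 0.1,                # placeholder learning rate
    "momentum": 0.9,
    "weight_decay": 5e-4,
    "scheduler": "cosine",    # placeholder LR schedule
    "epochs": 100,
    "batch_size": 128,
}

_REGISTRY = {"sponge_exponential": SPONGE_EXPONENTIAL}


def get_hyperparameters(optimization_name):
    """Hypothetical lookup for the config selected by --optimization=<name>."""
    try:
        return _REGISTRY[optimization_name.lower()]
    except KeyError:
        raise ValueError(f"Unknown optimization: {optimization_name!r}")
```

Because clean and sponge models share these same training hyperparameters, the only difference between the two runs is the sponge objective itself.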
Please let me know if there is anything else I can help you with.
Hey, thanks again for your answer. I have another question regarding the poisoning rate, though. The attack doesn't really alter the inputs or depend on them as classical poisoning attacks do; it is more an optimization formulation of the goal. Given that the attacker is the one who provides the new weights and has full control over the model's updates, what exactly is the point of fixing p?
Hi, thank you again for your interest. We have deliberately chosen to retain the parameter "p" to illustrate that poisoning all gradient updates is unnecessary. Surprisingly, only a few percent of them are sufficient to trigger the sponge attack. This discovery holds value in other application scenarios, e.g., federated learning.
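To make the role of p concrete, here is a hedged sketch (not the repository's actual code) of poisoning only a fraction p of the gradient-update steps; the remaining steps would train on the clean loss alone:

```python
import random


def select_sponge_steps(num_steps, p, seed=0):
    """Pick a fraction p of gradient-update steps to carry the sponge term.

    Only the returned steps would add the energy-maximizing objective;
    all other steps use the clean training loss. Illustrative sketch only.
    """
    rng = random.Random(seed)
    k = round(num_steps * p)
    return set(rng.sample(range(num_steps), k))


# e.g. with 1000 update steps and p = 0.05, only 50 steps are poisoned
poisoned = select_sponge_steps(1000, 0.05)
```

In a federated-learning scenario, a small p means a malicious client only needs to tamper with a few of its updates, which is why keeping p as an explicit knob is informative.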
Hi again, thank you. I have another question regarding the estimation of the energy in the baseline: do you estimate the energy by batch inference and then compute the average, or do you use a batch size of 1 when estimating it?
The energy is estimated over the whole batch, using the batch size of the configured data loader. If you want the energy for a single sample, you have to either feed the images manually or set the data loader's batch size to 1.
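As a sketch of the distinction, with a hypothetical `energy_fn` standing in for the actual measurement hook:

```python
def average_per_sample_energy(batches, energy_fn):
    """Average per-sample energy from per-batch measurements.

    `energy_fn(batch)` is a hypothetical hook returning the energy of one
    forward pass over the whole batch. With a batch size of 1, each call
    already measures a single sample, so the two approaches coincide.
    """
    total_energy = 0.0
    total_samples = 0
    for batch in batches:
        total_energy += energy_fn(batch)
        total_samples += len(batch)
    return total_energy / total_samples
```

Iterating sample-by-sample (batch size 1) gives the same average only if the per-batch energy scales linearly with batch size, which is why the choice of batch size matters when reporting per-sample figures.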
Hello,
I am trying to reproduce the experimental results reported in the paper. For this purpose, would it be possible to provide me with the hyperparameters (learning rate, optimizer, callbacks used if any, number of epochs, ...) used to train the clean model?
Thanks in advance for the assistance.