Azure / counterfit

a CLI that provides a generic automation layer for assessing the security of ML models
MIT License

Info on parameter ranges #7

Closed (dsvrsec closed 3 years ago)

dsvrsec commented 3 years ago

Can you please explain on what basis the parameter ranges for the attacks (the ones used by hyperopt) were chosen?

moohax commented 3 years ago

Thanks for your issue.

Currently, you can get the min/max of each parameter by looking at the attack class in counterfit/frameworks/*, under the random attribute. Here is the Boundary attack from ART.

from counterfit.core.attacks import Attack
from hyperopt import hp

from art.attacks.evasion import BoundaryAttack

class BoundaryAttackWrapper(Attack):
    attack_cls = BoundaryAttack
    attack_name = "boundary"
    attack_type = "evasion"
    tags = ["image", "numpy"]
    category = "blackbox"
    framework = "art"

    random = {
        "targeted": hp.choice("bound_targ", [False, True]),
        "delta": hp.uniform("bound_delta", 0.005, 0.05),
        "epsilon": hp.uniform("bound_uniform", 0.005, 0.05),
        "step_adapt": hp.uniform("bound_adapt", 0.5, 0.75),
        "max_iter": hp.quniform("bound_maxiter", 200, 2000, 1),
        "num_trial": hp.quniform("bound_trial", 10, 50, 1),
        "sample_size": hp.quniform("bound_ssize", 10, 50, 1),
        "init_size": hp.quniform("bound_isize", 10, 200, 1),
    }
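To make the ranges above concrete, here is a minimal standard-library sketch of how such a search space gets sampled. It follows the semantics described in the hyperopt documentation: hp.uniform draws uniformly between low and high, and hp.quniform returns a value like round(uniform(low, high) / q) * q. The SPACE table and the sample function are hypothetical names used for illustration only; this is not how Counterfit or hyperopt implement sampling internally. Since hp.quniform yields floats, the sketch casts those values to int on the assumption that parameters such as max_iter are expected to be integers.

```python
import random

# Ranges copied from the `random` dict above. The tuple encoding
# (kind, *args) is a hypothetical format chosen for this sketch.
SPACE = {
    "targeted": ("choice", [False, True]),
    "delta": ("uniform", 0.005, 0.05),
    "epsilon": ("uniform", 0.005, 0.05),
    "step_adapt": ("uniform", 0.5, 0.75),
    "max_iter": ("quniform", 200, 2000, 1),
    "num_trial": ("quniform", 10, 50, 1),
    "sample_size": ("quniform", 10, 50, 1),
    "init_size": ("quniform", 10, 200, 1),
}


def sample(space, seed=None):
    """Draw one parameter set from a space described as (kind, *args) tuples."""
    rng = random.Random(seed)
    params = {}
    for name, (kind, *args) in space.items():
        if kind == "choice":
            # Pick one of the listed options.
            params[name] = rng.choice(args[0])
        elif kind == "uniform":
            low, high = args
            params[name] = rng.uniform(low, high)
        elif kind == "quniform":
            low, high, q = args
            # Quantized uniform: round(uniform(low, high) / q) * q,
            # cast to int (assumption: the attack expects an integer).
            params[name] = int(round(rng.uniform(low, high) / q) * q)
        else:
            raise ValueError(f"unknown distribution kind: {kind}")
    return params


if __name__ == "__main__":
    print(sample(SPACE, seed=0))
```

Every sampled value stays inside the min/max listed in the dict, which is why those bounds act as the effective search range regardless of what the underlying framework would accept.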

However, the actual limits, and any errors you would see, come from the respective framework code. For some attacks the limits are documented in the docstrings, for others there is an explicit check in the code, and in other cases the code will just fail. Once we aggregate the limits, we'll add checks.
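As a rough illustration of what such checks could look like once limits are aggregated, here is a minimal sketch. Bound, BOUNDARY_BOUNDS, and check_params are hypothetical names, not part of Counterfit or ART. The numbers are simply the search ranges from the random dict above, used as placeholders; they are not limits enforced by ART.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Bound:
    """Inclusive range for one parameter (hypothetical helper)."""
    low: float
    high: float
    integer: bool = False


# Placeholder values: the hyperopt search ranges from this thread,
# NOT limits taken from the ART source code.
BOUNDARY_BOUNDS = {
    "delta": Bound(0.005, 0.05),
    "epsilon": Bound(0.005, 0.05),
    "step_adapt": Bound(0.5, 0.75),
    "max_iter": Bound(200, 2000, integer=True),
    "num_trial": Bound(10, 50, integer=True),
    "sample_size": Bound(10, 50, integer=True),
    "init_size": Bound(10, 200, integer=True),
}


def check_params(params, bounds):
    """Return a list of human-readable problems; an empty list means OK."""
    errors = []
    for name, value in params.items():
        bound = bounds.get(name)
        if bound is None:
            # No aggregated limit for this parameter; nothing to check.
            continue
        if bound.integer and int(value) != value:
            errors.append(f"{name}={value} must be an integer")
        elif not bound.low <= value <= bound.high:
            errors.append(
                f"{name}={value} is outside [{bound.low}, {bound.high}]"
            )
    return errors


if __name__ == "__main__":
    for problem in check_params({"delta": 0.5, "max_iter": 500}, BOUNDARY_BOUNDS):
        print(problem)
```

Returning a list of messages instead of raising on the first failure lets a CLI report every out-of-range parameter at once, before the framework code has a chance to fail with a less helpful error.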

dsvrsec commented 3 years ago

Thank you. I was able to find this file, but is there any reference for the ranges, or did you choose them based on experiments? I could only find the default values in the attack files in the ART framework, not the ranges given in the random dict. Can you please give a reference for the ranges of any attack, if possible?
