Azure / counterfit

a CLI that provides a generic automation layer for assessing the security of ML models
MIT License
support for non probability outputs #21

Closed priamai closed 2 years ago

priamai commented 3 years ago

Hi there, very nice project. Is there a plan to also implement attacks that work on label-only output (no probabilities) or in a limited API query setting? There was an article here a while ago which would be nice to have. Is this something that should be implemented in the ART component?

priamai commented 3 years ago

Okay, maybe I am not interpreting the documentation right. For example, the HopSkipJump attack works on predicted labels, not probabilities, but in the creditfraud example (https://github.com/Azure/counterfit/blob/main/demo/WEBINAR-DEMO-2.md) the target is designed with output probabilities. It would be nice to get some clarity there.

moohax commented 3 years ago

You effectively translate probabilities to labels depending on what a model gives you back in outputs_to_labels.

There is a good example in the wiki. TextAttack requires a numerical value in model_output_classes, while ART will work with either a text label or a numerical label.
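
As a rough illustration of the translation described above, the sketch below maps each row of probabilities to the class with the highest score. It is not Counterfit's actual outputs_to_labels implementation; the function name `probabilities_to_labels` and its arguments are hypothetical and exist only for this example.

```python
from typing import List, Sequence, Union

Label = Union[int, str]


def probabilities_to_labels(
    outputs: Sequence[Sequence[float]],
    classes: Sequence[Label],
) -> List[Label]:
    """Hypothetical helper: pick the highest-scoring class for each row.

    `outputs` holds one row of per-class scores per sample, and `classes`
    lists the labels in the same order as the score columns.
    """
    labels: List[Label] = []
    for row in outputs:
        if len(row) != len(classes):
            raise ValueError("each output row needs one score per class")
        # Index of the largest score in this row (first one wins on ties).
        best = max(range(len(row)), key=row.__getitem__)
        labels.append(classes[best])
    return labels


if __name__ == "__main__":
    scores = [[0.9, 0.1], [0.2, 0.8]]
    print(probabilities_to_labels(scores, [0, 1]))
    print(probabilities_to_labels(scores, ["benign", "fraud"]))
```

The same function handles numerical and text labels, since the class list is only used for lookup.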

priamai commented 3 years ago

Hi there, that I understand, but is there an example where the target outputs a label (not a probability)? The creditcard example provides output probabilities. HopSkipJump should work directly with binary labels, not probabilities.

moohax commented 3 years ago

Set your outputs to [0, 1] where 1 is the positive class.
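
One possible reading of this suggestion, sketched under assumptions: if the underlying model only returns a hard label, the target can wrap it so that each prediction becomes a one-hot score vector, with index 1 standing for the positive class. The helpers `label_to_scores` and `predict_scores`, and the toy threshold classifier, are hypothetical and not part of Counterfit or ART.

```python
from typing import Callable, List, Sequence


def label_to_scores(label: int, num_classes: int = 2) -> List[float]:
    """Hypothetical helper: turn a hard label into a one-hot score vector."""
    if not 0 <= label < num_classes:
        raise ValueError(f"label {label} is outside 0..{num_classes - 1}")
    scores = [0.0] * num_classes
    scores[label] = 1.0
    return scores


def predict_scores(
    samples: Sequence[Sequence[float]],
    predict_label: Callable[[Sequence[float]], int],
) -> List[List[float]]:
    """Wrap a label-only classifier so it returns one score row per sample."""
    return [label_to_scores(predict_label(sample)) for sample in samples]


if __name__ == "__main__":
    # Toy stand-in for a label-only model: positive when the first feature > 0.5.
    def toy_label_model(sample: Sequence[float]) -> int:
        return int(sample[0] > 0.5)

    print(predict_scores([[0.2], [0.9]], toy_label_model))
```

A decision-based attack such as HopSkipJump only needs the predicted class, so a one-hot vector like this carries the same information as the bare label.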

priamai commented 2 years ago

Will try that, thanks.