twosixlabs / armory

ARMORY Adversarial Robustness Evaluation Test Bed
MIT License

Yet another sleeper agent bug fix #1886

Closed swsuggs closed 1 year ago

swsuggs commented 1 year ago

Fixes a bug in the sleeper agent scenario that caused it to find the wrong indices for the poisoned data.

Also includes a couple of less crucial improvements: a fix for an invalid variable name in infrequently reached code, and improved parameter values for the dp-instahide baseline configs.
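
For readers unfamiliar with the index-finding step mentioned above, here is a purely illustrative sketch. It is not Armory's implementation and does not reproduce the actual bug or fix. It assumes a clean dataset and a poisoned copy of the same length and order, and the name `find_poisoned_indices` is hypothetical. The idea is simply that a sample counts as poisoned if it differs from its clean counterpart at the same position.

```python
def find_poisoned_indices(clean, poisoned):
    """Return positions where the poisoned sample differs from the clean one.

    Hypothetical helper for illustration only. Assumes both sequences
    hold the same samples in the same order.
    """
    if len(clean) != len(poisoned):
        raise ValueError("clean and poisoned datasets must have the same length")
    # Compare sample by sample; indices refer to the full dataset order.
    return [i for i, (c, p) in enumerate(zip(clean, poisoned)) if c != p]


# Toy data: samples 1 and 3 have been modified.
clean = [[0, 0], [1, 1], [2, 2], [3, 3]]
poisoned = [[0, 0], [9, 1], [2, 2], [3, 7]]
print(find_poisoned_indices(clean, poisoned))  # prints [1, 3]
```

A comparison like this only gives correct indices if both datasets share the same ordering, so any shuffling or filtering between the two would have to be undone or tracked first.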