AI-WAIFU closed 1 year ago
This new work may have some relevance: https://arxiv.org/abs/2204.12301
We design new visual illusions by finding "adversarial examples" for principled models of human perception -- specifically, for probabilistic models, which treat vision as Bayesian inference. To perform this search efficiently, we design a differentiable probabilistic programming language, whose API exposes MCMC inference as a first-class differentiable function. We demonstrate our method by automatically creating illusions for three features of human vision: color constancy, size constancy, and face perception.
Project: Better Concept Pointers
Elevator Pitch
It has been argued that the pointers problem, Goodhart's law, and adversarial examples correspond to one another.[1] If so, progress on adversarial examples can be used to develop techniques for dealing with the other two.
Goal Outputs
The goal is to develop a technique for robustly generating "real" adversarial examples: image perturbations that change both a classifier's output and the label a human would assign to the image.
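For contrast, here is a minimal sketch of an ordinary adversarial perturbation, the kind that changes only the classifier's output. It uses the fast gradient sign method on a toy linear classifier, which is a stand-in chosen for illustration and not the technique this project proposes. The weights, the 4-"pixel" input, and the function names are all hypothetical.

```python
import math

def score(w, b, x):
    # Linear classifier logit: w . x + b
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def predict(w, b, x):
    # Class 1 if the logit is positive, else class 0
    return 1 if score(w, b, x) > 0 else 0

def fgsm(w, b, x, y, eps):
    # Gradient of the logistic loss w.r.t. the input is (p - y) * w,
    # where p is the predicted probability of class 1.
    p = 1.0 / (1.0 + math.exp(-score(w, b, x)))
    grad = [(p - y) * wi for wi in w]
    sign = lambda g: (g > 0) - (g < 0)
    # Step each pixel by eps in the loss-increasing direction,
    # then clamp to the valid pixel range [0, 1].
    return [min(1.0, max(0.0, xi + eps * sign(g))) for xi, g in zip(x, grad)]

# Hypothetical toy model and input (assumptions for illustration only)
w = [1.0, -2.0, 0.5, 1.5]
b = -0.2
x = [0.6, 0.2, 0.5, 0.3]

x_adv = fgsm(w, b, x, y=1, eps=0.2)
print(predict(w, b, x), predict(w, b, x_adv))  # prints: 1 0
```

The perturbation flips the classifier's label while staying within 0.2 per pixel. Nothing in this sketch checks whether a human would also relabel the image, which is the gap the goal above targets.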
Milestones
How to Help: There are a few things that can be done in parallel.
Desired Support: Desired resources are:
[1] Waifu et al., when he gets around to doing a write-up.