Conceptual Analysis of Training Time Domain Authorization [Jan/David/Dom]

Old Notes:

Not really a code task: this involves starting an Overleaf project (let's use the ICLR template 🤞) and really thinking through and formalizing https://docs.google.com/document/d/1YW26CdAKv06uc2CN09vf-w9RpkQbYU5yBMkaQBWkH84/edit (it doesn't have to be very mathematical at first; it can be quite conceptual).

This is a lot like the analysis we developed in https://arxiv.org/abs/2402.16382, but instead of necessary and sufficient conditions for a defence in ML security language, we want the necessary and sufficient conditions for domain authorization, including:

What is domain authorization generally? What about inference-time and training-time domain authorization (how are they different, how are they related, and how are they defined)?
What are the different scenarios of domain authorization (allowing training only in one or more domains, or disallowing training only in one or more domains)?
What would convince someone that a model is domain authorized?
What is a domain? How is it estimated? How is it different from a task or behaviour?
How do we formulate this outside of ML security language (threat model, attack, defence)?
What is the benchmark metric?

Outcome:

The beginning of our paper! A clear set of conditions that we will use to guide our benchmark construction. Ideally, a reader would say: "OK, if a benchmark measured these things, I would be convinced this domain authorization method works well, and I would use it in industry to domain authorize my model."

Meta: Ideally someone other than Dom takes a first pass since Dom is too opinionated and has many gaps.
