henrikfo opened this issue 5 months ago
An idea: what if the shadow models are trained pair-wise? For every sample in the dataset, each pair tosses a coin to decide which of the two models trains on that sample. Each resulting model ends up with a random training set, no extra logic is needed for adding more shadow models, and even if an odd number of models is used, whether 3 or 2n+1, every sample still has corresponding IN and OUT models.
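A minimal sketch of the pair-wise coin toss (illustrative only; the function name and mask representation are my own assumptions, not existing LeakPro code):

```python
import numpy as np

def pairwise_coin_toss(num_samples: int, num_pairs: int, seed: int = 0) -> np.ndarray:
    """For each shadow-model pair, flip a coin per sample to decide which
    model in the pair trains on it. Returns a boolean mask of shape
    (2 * num_pairs, num_samples); mask[m, i] is True if model m trains on sample i."""
    rng = np.random.default_rng(seed)
    mask = np.zeros((2 * num_pairs, num_samples), dtype=bool)
    for p in range(num_pairs):
        coin = rng.integers(0, 2, size=num_samples).astype(bool)
        mask[2 * p] = coin        # first model of the pair trains where the coin is 1
        mask[2 * p + 1] = ~coin   # its partner trains on the complementary samples
    return mask

# Every sample is IN for exactly one model of each pair and OUT for the other,
# so with 4 pairs each sample has 4 IN and 4 OUT shadow models.
mask = pairwise_coin_toss(num_samples=1000, num_pairs=4)
assert (mask.sum(axis=0) == 4).all()
```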
fazelehh commented 2 hours ago: Here we still include points with only one shadow model, right? Should we do that even when the number of shadow models is large?
johanos1 (Member, Author) commented 1 hour ago: I think that is something we can discuss, but then the follow-up question is: given n models, how many IN/OUT models should we require? This code only ensures that the audit dataset is actually auditable.
fazelehh (Collaborator) commented 42 minutes ago: I can answer this question with two different objectives:
1. Maximizing the Number of Auditable Data Points: This is essentially what we are doing now. If a data point has at least one shadow model, we audit it.
2. Ensuring Reliable Audit Points: We can give the user recommendations for more reliable audits. For instance, if the ratio of data points per shadow model and the number of shadow models results in fewer than (say) 4 shadow models per data point, we can warn the user. In the first-principles paper, good results were observed with around eight IN and eight OUT shadow models per data point. Alternatively, we can compute the mean number of shadow models per audit point and filter out points below this mean or below the first quartile. Whatever we do here is likely model- and data-dependent. A rough sketch of such a check is given below.
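The following is only a sketch of the kind of reliability check described above; the helper name, in-mask representation, and thresholds are assumptions, not part of the current code:

```python
import numpy as np

def check_audit_reliability(in_mask: np.ndarray, min_models: int = 4) -> np.ndarray:
    """in_mask has shape (num_shadow_models, num_points); True means the
    shadow model was trained on that point. Warn about points with few
    IN or OUT models, and return a filter keeping only points whose
    model count reaches the first quartile."""
    in_counts = in_mask.sum(axis=0)
    out_counts = (~in_mask).sum(axis=0)
    weak = (in_counts < min_models) | (out_counts < min_models)
    if weak.any():
        print(f"Warning: {weak.sum()} audit points have fewer than "
              f"{min_models} IN or OUT shadow models; the audit may be unreliable.")
    # Alternative filter: keep only points at or above the first quartile of counts.
    counts = np.minimum(in_counts, out_counts)
    return counts >= np.quantile(counts, 0.25)
```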
One could perhaps (over)engineer something like this:
alpha = min(train_size / audit_size, 1 - train_size / audit_size)
n = int(1 / alpha)
We define a shuffle_or_roll object with an internal counter i of how many times it has been called.
It shuffles the indices every n-th time it is called; otherwise it just rolls the indices by alpha * audit_size steps.
It then returns the split into IN and OUT based on the indices.
This would reproduce Henrik's coin toss for alpha = 0.5, but it should also work for other fractions.
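A small Python sketch of such a shuffle_or_roll object (the class name and interface are my own; this is not the LeakPro implementation):

```python
import numpy as np

class ShuffleOrRoll:
    """Return IN/OUT index splits for successive shadow models. Every n-th
    call reshuffles the indices; the other calls roll them by
    alpha * audit_size steps, so the calls within one shuffle cycle
    cover the audit set with complementary IN sets."""

    def __init__(self, audit_size: int, train_size: int, seed: int = 0):
        frac = train_size / audit_size
        self.alpha = min(frac, 1.0 - frac)
        self.n = int(1 / self.alpha)              # calls per shuffle cycle
        self.step = int(self.alpha * audit_size)  # roll distance between calls
        self.train_size = train_size
        self.indices = np.arange(audit_size)
        self.rng = np.random.default_rng(seed)
        self.i = 0                                # internal call counter

    def split(self):
        if self.i % self.n == 0:
            self.rng.shuffle(self.indices)        # fresh permutation every n-th call
        else:
            self.indices = np.roll(self.indices, -self.step)
        self.i += 1
        return self.indices[: self.train_size], self.indices[self.train_size:]

# For train_size / audit_size = 0.5 (alpha = 0.5, n = 2) two consecutive calls
# give complementary IN sets, i.e. Henrik's coin toss; other fractions roll
# through the audit set in alpha-sized steps before reshuffling.
splitter = ShuffleOrRoll(audit_size=1000, train_size=500)
in_a, out_a = splitter.split()
in_b, out_b = splitter.split()
assert set(in_a) == set(out_b)
```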
Feature
Desired Behavior / Functionality
Right now we sample data for each shadow model when we create it. Another idea is to sample model indices instead: for each data point, choose which models will train on that particular data point.
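A hedged sketch of this per-data-point alternative (the function name and the fixed models_per_point parameter are assumptions for illustration):

```python
import numpy as np

def sample_models_per_point(num_models: int, num_points: int,
                            models_per_point: int, seed: int = 0) -> np.ndarray:
    """Instead of drawing a data subset for every shadow model, draw for
    every data point the set of models that will train on it. Returns a
    boolean mask of shape (num_models, num_points)."""
    rng = np.random.default_rng(seed)
    mask = np.zeros((num_models, num_points), dtype=bool)
    for i in range(num_points):
        chosen = rng.choice(num_models, size=models_per_point, replace=False)
        mask[chosen, i] = True
    return mask

mask = sample_models_per_point(num_models=8, num_points=1000, models_per_point=4)
# Every point now has exactly 4 IN and 4 OUT models by construction,
# while each model's training-set size is only fixed in expectation.
```

The trade-off under these assumptions: per-point IN/OUT counts become fixed by construction, but the individual shadow-model training sets no longer have a fixed size.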
What Needs to Be Done
Reason about the pros and cons of switching versus maintaining the way it is done currently.