Restricting AdversarialDebiasing's trainable variables to current scope

When AdversarialDebiasing is used several times and resetting the tf graph or session is not an option (e.g. when multiple instances form an ensemble), any call to fit() after the first model has been fitted crashes. The cause is that the weight and bias variables of the first model are still retrieved as trainable variables even though they do not belong to the current scope. classifier_opt.compute_gradients(pred_labels_loss, var_list=classifier_vars) then returns None as the gradient for each of them, because the current loss does not depend on them. More concretely, with scope adversarial_debiasing_1627063453.788637 used to train the first model and adversarial_debiasing_1627063456.061836 used to train the second, the gradients for the first run are

[(<tf.Tensor 'adversarial_debiasing_1627063453.788637/gradients_1/adversarial_debiasing_1627063453.788637/classifier_model/MatMul_grad/tuple/control_dependency_1:0' shape=(10, 200) dtype=float32>, <tf.Variable 'adversarial_debiasing_1627063453.788637/classifier_model/W1:0' shape=(10, 200) dtype=float32_ref>),
 (<tf.Tensor 'adversarial_debiasing_1627063453.788637/gradients_1/adversarial_debiasing_1627063453.788637/classifier_model/add_grad/tuple/control_dependency_1:0' shape=(200,) dtype=float32>, <tf.Variable 'adversarial_debiasing_1627063453.788637/classifier_model/b1:0' shape=(200,) dtype=float32_ref>),
 (<tf.Tensor 'adversarial_debiasing_1627063453.788637/gradients_1/adversarial_debiasing_1627063453.788637/classifier_model/MatMul_1_grad/tuple/control_dependency_1:0' shape=(200, 1) dtype=float32>, <tf.Variable 'adversarial_debiasing_1627063453.788637/classifier_model/W2:0' shape=(200, 1) dtype=float32_ref>),
 (<tf.Tensor 'adversarial_debiasing_1627063453.788637/gradients_1/adversarial_debiasing_1627063453.788637/classifier_model/add_1_grad/tuple/control_dependency_1:0' shape=(1,) dtype=float32>, <tf.Variable 'adversarial_debiasing_1627063453.788637/classifier_model/b2:0' shape=(1,) dtype=float32_ref>)]

and the ones for the second are

[(None, <tf.Variable 'adversarial_debiasing_1627063453.788637/classifier_model/W1:0' shape=(10, 200) dtype=float32_ref>),
 (None, <tf.Variable 'adversarial_debiasing_1627063453.788637/classifier_model/b1:0' shape=(200,) dtype=float32_ref>),
 (None, <tf.Variable 'adversarial_debiasing_1627063453.788637/classifier_model/W2:0' shape=(200, 1) dtype=float32_ref>),
 (None, <tf.Variable 'adversarial_debiasing_1627063453.788637/classifier_model/b2:0' shape=(1,) dtype=float32_ref>),
 (<tf.Tensor 'adversarial_debiasing_1627063456.061836/gradients_1/adversarial_debiasing_1627063456.061836/classifier_model/MatMul_grad/tuple/control_dependency_1:0' shape=(10, 200) dtype=float32>, <tf.Variable 'adversarial_debiasing_1627063456.061836/classifier_model/W1:0' shape=(10, 200) dtype=float32_ref>),
 (<tf.Tensor 'adversarial_debiasing_1627063456.061836/gradients_1/adversarial_debiasing_1627063456.061836/classifier_model/add_grad/tuple/control_dependency_1:0' shape=(200,) dtype=float32>, <tf.Variable 'adversarial_debiasing_1627063456.061836/classifier_model/b1:0' shape=(200,) dtype=float32_ref>),
 (<tf.Tensor 'adversarial_debiasing_1627063456.061836/gradients_1/adversarial_debiasing_1627063456.061836/classifier_model/MatMul_1_grad/tuple/control_dependency_1:0' shape=(200, 1) dtype=float32>, <tf.Variable 'adversarial_debiasing_1627063456.061836/classifier_model/W2:0' shape=(200, 1) dtype=float32_ref>),
 (<tf.Tensor 'adversarial_debiasing_1627063456.061836/gradients_1/adversarial_debiasing_1627063456.061836/classifier_model/add_1_grad/tuple/control_dependency_1:0' shape=(1,) dtype=float32>, <tf.Variable 'adversarial_debiasing_1627063456.061836/classifier_model/b2:0' shape=(1,) dtype=float32_ref>)]

Note the None tensors corresponding to variables in scope adversarial_debiasing_1627063453.788637 that appear in the second list.
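
A quick way to confirm this condition while debugging is to scan the (gradient, variable) pairs for missing gradients. The sketch below is only an illustration: it uses plain tuples with strings standing in for the tensors and variables, and the helper name vars_without_gradient is hypothetical, not part of AIF360 or TensorFlow.

```python
def vars_without_gradient(grads_and_vars):
    """Return the variables whose gradient entry is None."""
    return [var for grad, var in grads_and_vars if grad is None]


# Scope prefixes taken from the two runs described above.
first = "adversarial_debiasing_1627063453.788637/classifier_model"
second = "adversarial_debiasing_1627063456.061836/classifier_model"

# Strings stand in for tf.Tensor / tf.Variable objects (an assumption
# made so the sketch runs without TensorFlow).
grads_and_vars = [
    (None, first + "/W1:0"),
    (None, first + "/b1:0"),
    ("grad_W1", second + "/W1:0"),
    ("grad_b1", second + "/b1:0"),
]

stale = vars_without_gradient(grads_and_vars)
```

Here stale holds only the names from the first scope, matching the None entries in the second list.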

The fix in this PR is to make sure classifier_vars and adversary_vars contain no variables from outside the current scope, which is done by passing scope=self.scope_name to the tf.trainable_variables() call that builds each list.
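
The effect of the scope argument can be sketched in plain Python, without TensorFlow. TensorFlow documents the scope filter on graph collections as a re.match against each item's name, so a scope with no special tokens acts as a prefix filter. The sketch assumes that classifier variables are identified by the substring classifier_model, as the names in the output above suggest; the function name trainable_in_scope and the adversary_model entry are hypothetical and used only for illustration.

```python
import re


def trainable_in_scope(var_names, scope, component):
    """Keep names that match `scope` at the start and mention `component`.

    Mimics filtering a variable list by scope (re.match, i.e. prefix
    matching) followed by a substring check on the variable name.
    """
    return [
        name for name in var_names
        if re.match(scope, name) and component in name
    ]


first_scope = "adversarial_debiasing_1627063453.788637"
second_scope = "adversarial_debiasing_1627063456.061836"

# Stand-in for the global list of trainable variable names after both
# models have been built in the same graph.
all_names = [
    first_scope + "/classifier_model/W1:0",
    first_scope + "/classifier_model/b1:0",
    second_scope + "/classifier_model/W1:0",
    second_scope + "/classifier_model/b1:0",
    second_scope + "/adversary_model/c:0",  # hypothetical adversary variable
]

# Without a scope restriction, variables of both models are selected.
unscoped = [name for name in all_names if "classifier_model" in name]

# With the restriction, only the second model's classifier variables remain.
scoped = trainable_in_scope(all_names, second_scope, "classifier_model")
```

With the restriction in place, the variable list handed to compute_gradients contains only variables that the current loss can depend on, so no None gradients from earlier models appear.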

Trusted-AI / AIF360, pull request #255