Model Remediation is a library that provides solutions for machine learning practitioners working to create and train models in a way that reduces or eliminates user harm resulting from underlying performance biases.
The tf_dataset_to_tf_examples_list function in fdw.utils can only handle datasets where each element is a flat, single-level dict of the form {feature_name: tf.Tensor}. The easiest way to generate one of these is from a dataframe, using e.g. tf.data.Dataset.from_tensor_slices(dict(df)).
Specifically, this means it fails on any tf.data.Dataset that has either of two properties:
Nested features. Many TFDS datasets come with nested features, e.g. CelebA. In these cases, at least one entry of the feature dict has the form {feature_class_name: {feature_1: tf.Tensor, feature_2: tf.Tensor, ...}}. These would need to be flattened before conversion.
import tensorflow_datasets as tfds
from tensorflow_model_remediation.experimental import fair_data_reweighting as fdw
ds = tfds.load('celeb_a')
ex = fdw.utils.tf_dataset_to_tf_examples_list(ds['train'])
next(ex)
throws AttributeError: 'dict' object has no attribute 'numpy'.
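As a possible workaround for the nested case, the feature dict could be flattened before it reaches tf_dataset_to_tf_examples_list. The sketch below is only an illustration: flatten_features and its "/" separator are hypothetical names that are not part of the library, and it is demonstrated on a plain Python dict rather than on CelebA itself.

```python
def flatten_features(element, parent_key="", sep="/"):
    """Flatten a nested feature dict into a single-level dict.

    Hypothetical helper, not part of tensorflow_model_remediation.
    Nested keys are joined with `sep`, e.g. "attributes/Smiling".
    """
    flat = {}
    for name, value in element.items():
        key = f"{parent_key}{sep}{name}" if parent_key else name
        if isinstance(value, dict):
            # Recurse into nested feature groups.
            flat.update(flatten_features(value, key, sep))
        else:
            flat[key] = value
    return flat


# Plain-Python stand-in for one nested dataset element.
nested = {"image": "img", "attributes": {"Smiling": True, "Young": False}}
flat = flatten_features(nested)
print(flat)
```

Since tf.data.Dataset.map passes dict elements to the mapped function as ordinary Python dicts, something like ds['train'].map(flatten_features) should yield single-level elements. Whether every flattened feature is then accepted by tf_dataset_to_tf_examples_list has not been verified here.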
Loaded as supervised. tfds.load() accepts an as_supervised parameter. If set to True, each element of the dataset is a tuple whose first element is the features dict and whose second element is the label as a tf.Tensor. The label is not repeated in the features dict.
import tensorflow_datasets as tfds
from tensorflow_model_remediation.experimental import fair_data_reweighting as fdw
ds = tfds.load('diamonds', as_supervised=True)
ex = fdw.utils.tf_dataset_to_tf_examples_list(ds['train'])
next(ex)
throws AttributeError: 'tuple' object has no attribute 'items'.
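For the supervised case, one possible workaround is to fold the label back into the features dict so that each element is a single dict again. This is a minimal sketch under assumptions: merge_label and the "label" key are hypothetical and not part of the library, and the demo uses plain Python values in place of tensors.

```python
def merge_label(features, label, label_key="label"):
    """Fold a (features, label) pair into one single-level dict.

    Hypothetical helper, not part of tensorflow_model_remediation.
    `label_key` is an assumed name for the re-inserted label feature.
    """
    if label_key in features:
        raise ValueError(f"features already contain the key {label_key!r}")
    merged = dict(features)  # shallow copy, leaves the input untouched
    merged[label_key] = label
    return merged


# Plain-Python stand-in for one (features, label) element.
features = {"carat": 0.23, "cut": 4}
merged = merge_label(features, 326.0)
print(merged)
```

Because tf.data.Dataset.map unpacks tuple elements into positional arguments, something like ds['train'].map(merge_label) should produce dict elements. If the features dict is itself nested, it would still need flattening afterwards.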