tensorflow / model-remediation

Model Remediation is a library that provides solutions for machine learning practitioners working to create and train models in a way that reduces or eliminates user harm resulting from underlying performance biases.
https://www.tensorflow.org/responsible_ai/model_remediation?hl=en
Apache License 2.0

Infer_Schema in FDW Utils: Can't handle features with no shape info #32

Open CLSchmitz opened 2 years ago

CLSchmitz commented 2 years ago

Infer_schema always creates FixedLenFeatures and populates their shape with the shape from the dataset's element_spec. A FixedLenFeature requires a fully known shape, but the shape in the element_spec is often unknown (None), I assume because the feature was, e.g., a VarLenFeature or FixedLenSequenceFeature. This then causes a parse error, e.g.:

ValueError: First dimension of shape for feature attributes/5_o_Clock_Shadow unknown. Consider using FixedLenSequenceFeature. Received feature=FixedLenFeature(shape=TensorShape([None]), dtype=tf.int64, default_value=None).

Note that this error isn't raised when the schema is created. It only appears later, when the schema is used to parse the TFRecords after calling FDW.
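For anyone trying to reproduce this without the colab, here is a minimal sketch using only public TensorFlow APIs. It shows that building a FixedLenFeature with an unknown shape succeeds and that the ValueError surfaces only at parse time. The helper `infer_schema_sketch` is hypothetical (it is not the library's Infer_schema) and shows one possible fallback: use VarLenFeature whenever the shape in the element_spec is not fully defined. Whether VarLenFeature is the right fallback for a given feature is an assumption that depends on how the records were written.

```python
import tensorflow as tf

NAME = "attributes/5_o_Clock_Shadow"

# Build one serialized tf.train.Example holding a single int64 value.
example = tf.train.Example(features=tf.train.Features(feature={
    NAME: tf.train.Feature(int64_list=tf.train.Int64List(value=[1])),
}))
serialized = example.SerializeToString()

# Constructing the feature succeeds even though its first dimension is unknown.
bad_schema = {
    NAME: tf.io.FixedLenFeature(shape=tf.TensorShape([None]), dtype=tf.int64)
}

# The ValueError only surfaces once the schema is used for parsing.
try:
    tf.io.parse_single_example(serialized, bad_schema)
    parse_error = None
except ValueError as err:
    parse_error = err


def infer_schema_sketch(element_spec):
    """Hypothetical helper: map a flat dict of tf.TensorSpec to parsing features.

    Fully defined shapes become FixedLenFeature; anything else falls back to
    VarLenFeature (an assumption, not the library's behavior). Parsing only
    supports the dtypes tf.float32, tf.int64 and tf.string.
    """
    schema = {}
    for name, spec in element_spec.items():
        if spec.shape.is_fully_defined():
            schema[name] = tf.io.FixedLenFeature(
                shape=spec.shape.as_list(), dtype=spec.dtype)
        else:
            schema[name] = tf.io.VarLenFeature(dtype=spec.dtype)
    return schema


# A spec with an unknown dimension now yields a schema that parses cleanly.
spec = {NAME: tf.TensorSpec(shape=[None], dtype=tf.int64)}
schema = infer_schema_sketch(spec)
parsed = tf.io.parse_single_example(serialized, schema)

# VarLenFeature parses to a SparseTensor; densify it to read the values.
values = tf.sparse.to_dense(parsed[NAME]).numpy().tolist()
```

With this fallback the parsed feature is a SparseTensor rather than a dense tensor, so downstream code would need to handle that difference.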

The error can be reproduced by running FDW in this colab without passing a manually created schema to dataset_schema_override.

Edit: When the dataset is not batched, this error doesn't happen.
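One possible explanation for the batched/unbatched difference, not confirmed in this thread: `tf.data.Dataset.batch` adds a leading batch dimension that is reported as None in the element_spec unless `drop_remainder=True`, because the last batch may be smaller. The sketch below uses a small in-memory dataset (the feature name `attr` is made up for illustration) to show how the same scalar feature appears in each case.

```python
import tensorflow as tf

# A toy dataset with one scalar int64 feature per element.
data = {"attr": tf.constant([1, 0, 1], dtype=tf.int64)}
unbatched = tf.data.Dataset.from_tensor_slices(data)

# Batching adds a leading dimension; its size is unknown by default
# because the final batch may hold fewer elements.
batched = unbatched.batch(2)

# With drop_remainder=True every batch has exactly 2 elements,
# so the leading dimension is statically known.
batched_static = unbatched.batch(2, drop_remainder=True)

unbatched_shape = unbatched.element_spec["attr"].shape.as_list()
batched_shape = batched.element_spec["attr"].shape.as_list()
static_shape = batched_static.element_spec["attr"].shape.as_list()

print(unbatched_shape, batched_shape, static_shape)
```

If this is indeed the cause, a shape of `[None]` copied from a batched element_spec into a FixedLenFeature would match the TensorShape([None]) seen in the error above.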

Ali-Maq commented 1 year ago

Hey @CLSchmitz

Can you provide access to the colab so I can review it? I would like to help fix the issue!