Model Remediation is a library that provides solutions for machine learning practitioners working to create and train models in a way that reduces or eliminates user harm resulting from underlying performance biases.
infer_schema always creates FixedLenFeatures and populates their shape with the shape from the dataset's element_spec. A FixedLenFeature requires a fully defined shape, but in the element_spec that shape often contains None, I assume because the feature was, e.g., a VarLenFeature or FixedLenSequenceFeature. This then causes a parse error, e.g.:
ValueError: First dimension of shape for feature attributes/5_o_Clock_Shadow unknown. Consider using FixedLenSequenceFeature. Received feature=FixedLenFeature(shape=TensorShape([None]), dtype=tf.int64, default_value=None).
Note that this error isn't raised when FDW is called; it only appears later, when the schema is used to parse the TFRecords.
The error can be reproduced by running FDW in this colab without passing a manually created schema to dataset_schema_override.
Edit: When the dataset is not batched, this error doesn't happen.
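The batching observation suggests a likely mechanism, sketched below using only public TensorFlow calls. Batching a tf.data.Dataset without drop_remainder adds an unknown leading dimension to element_spec, and a FixedLenFeature built from that shape is rejected at parse time. This is a minimal illustration under stated assumptions, not the library's actual infer_schema code: the feature name "label", the toy dataset, and the helper try_parse are all made up for the example.

```python
import tensorflow as tf

# Toy dataset with a single scalar int64 feature (hypothetical name "label").
ds = tf.data.Dataset.from_tensor_slices(
    {"label": tf.constant([0, 1, 1, 0], dtype=tf.int64)}
)

# Shapes as reported by element_spec before and after batching.
unbatched_shape = ds.element_spec["label"].shape.as_list()
batched_shape = ds.batch(2).element_spec["label"].shape.as_list()
# Unbatching removes the unknown leading dimension again.
restored_shape = ds.batch(2).unbatch().element_spec["label"].shape.as_list()


def try_parse(shape):
    """Parse one serialized Example with a FixedLenFeature of the given shape.

    Returns None on success, or the ValueError message on failure.
    (Hypothetical helper, for illustration only.)
    """
    feature_spec = {"label": tf.io.FixedLenFeature(shape, tf.int64)}
    example = tf.train.Example(
        features=tf.train.Features(
            feature={
                "label": tf.train.Feature(
                    int64_list=tf.train.Int64List(value=[1])
                )
            }
        )
    )
    try:
        tf.io.parse_single_example(example.SerializeToString(), feature_spec)
        return None
    except ValueError as err:
        return str(err)


print("unbatched:", unbatched_shape, "->", try_parse(unbatched_shape))
print("batched:  ", batched_shape, "->", try_parse(batched_shape))
```

If this is indeed the cause, inferring the schema from the unbatched dataset (for example via ds.unbatch()) should sidestep the error, which matches the observation above that unbatched datasets are unaffected.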