Spark supports field metadata: a dictionary of key-value pairs that can be attached to any field. We have a use case where we need to compare two schemas (or DataFrames) that differ only in their field metadata. That comparison currently fails because the schema comparer has no option to ignore metadata. This PR adds an `ignore_metadata` flag to the relevant functions (in particular `assert_df_equality` and `assert_schema_equality`); setting it to true makes the comparison ignore differences in field metadata.
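To illustrate the intended behavior, here is a minimal, library-free sketch of a schema comparison with an `ignore_metadata` switch. It is not the code in this PR: `Field`, `fields_equal`, and `schemas_equal` are hypothetical stand-ins for Spark's `StructField` and the schema comparer, assumed here only to show how the flag might change the result.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class Field:
    """Hypothetical stand-in for a Spark StructField (not the real class)."""
    name: str
    data_type: str
    nullable: bool = True
    metadata: Dict[str, Any] = field(default_factory=dict)


def fields_equal(f1: Field, f2: Field, ignore_metadata: bool = False) -> bool:
    # Name, type, and nullability must always match.
    same_core = (f1.name, f1.data_type, f1.nullable) == (
        f2.name, f2.data_type, f2.nullable)
    if ignore_metadata:
        return same_core
    # By default, metadata is part of the comparison as well.
    return same_core and f1.metadata == f2.metadata


def schemas_equal(s1: List[Field], s2: List[Field],
                  ignore_metadata: bool = False) -> bool:
    # Schemas are compared positionally, field by field.
    if len(s1) != len(s2):
        return False
    return all(fields_equal(a, b, ignore_metadata) for a, b in zip(s1, s2))


# Two schemas that differ only in the metadata attached to "id".
with_meta = [Field("id", "int", metadata={"comment": "primary key"})]
without_meta = [Field("id", "int")]

strict_result = schemas_equal(with_meta, without_meta)
relaxed_result = schemas_equal(with_meta, without_meta, ignore_metadata=True)
```

In this sketch the flag defaults to false, so callers who do not pass it keep the strict comparison; whether the PR uses the same default is an assumption.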