I find this really useful when working with different datasets. Spark has a standard way of defining a dataset schema as JSON. Sometimes, to track schema changes, what we really need is to merge the old and new schemas into an umbrella schema that supports everything. That boils down to merging two dictionaries, and more specifically merging arrays of dicts by some key identifier.
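Here is a minimal sketch of that idea in Python. It assumes the schemas follow Spark's StructType JSON layout (a `"fields"` array whose entries each have a `"name"` key); the helper names `merge_dict_lists` and `merge_schemas` are just illustrative, not part of any library.

```python
import json

def merge_dict_lists(old, new, key="name"):
    """Merge two lists of dicts, using `key` as the identifier.
    Entries from `new` override matching entries from `old`;
    everything else is kept."""
    merged = {d[key]: d for d in old}                      # index old entries by key
    for d in new:
        merged[d[key]] = {**merged.get(d[key], {}), **d}   # new values win on conflict
    return list(merged.values())

def merge_schemas(old_schema, new_schema):
    """Build an 'umbrella' schema covering the fields of both versions."""
    return {
        **old_schema,
        "fields": merge_dict_lists(old_schema.get("fields", []),
                                   new_schema.get("fields", [])),
    }

# Hypothetical example: an old and a new version of the same dataset schema.
old_schema = json.loads("""{"type": "struct", "fields": [
    {"name": "id",   "type": "long",   "nullable": false, "metadata": {}},
    {"name": "name", "type": "string", "nullable": true,  "metadata": {}}]}""")

new_schema = json.loads("""{"type": "struct", "fields": [
    {"name": "name",  "type": "string", "nullable": true, "metadata": {}},
    {"name": "email", "type": "string", "nullable": true, "metadata": {}}]}""")

print(json.dumps(merge_schemas(old_schema, new_schema), indent=2))
```

Running this prints a schema with `id`, `name`, and `email`, i.e. the union of both field lists, with the new definition winning wherever a field name appears in both versions.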