Azure / azure-cosmosdb-spark

Apache Spark Connector for Azure Cosmos DB
MIT License

Config to enable converting nested docs that are derived as string to the native JSON format #427

Closed revinjchalil closed 3 years ago

revinjchalil commented 3 years ago

Nested fields in a doc, such as {key1: value1, key2: {key21: value21, key22: value22}}, are derived as string when their values could be of different data types. When written out as parquet, they get converted to {key1: value1, key2: {key21 = value21, key22 = value22}}. Hive JSON parser functions such as "get_json_object" are not able to parse this; the nested value needs to be in standard JSON format.
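To illustrate why the `key = value` form is a problem, here is a minimal sketch using Python's standard `json` module as a stand-in for a strict JSON parser. It is not the Hive `get_json_object` implementation, and the sample strings and the `try_parse` helper are hypothetical, modeled on the example above.

```python
import json

# Nested doc in standard JSON form (what a strict JSON parser expects).
nested_as_json = '{"key21": "value21", "key22": "value22"}'

# Nested doc in the "key = value" form described above (illustrative sample).
nested_as_struct_string = "{key21 = value21, key22 = value22}"


def try_parse(text):
    """Return the parsed object, or None if the text is not valid JSON."""
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        return None


parsed_json = try_parse(nested_as_json)
parsed_struct = try_parse(nested_as_struct_string)

print(parsed_json)    # the nested keys are accessible as a dict
print(parsed_struct)  # None: the "key = value" form is rejected
```

A strict parser rejects the second string because the keys are unquoted and `=` is not a valid key/value separator in JSON.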

Added the boolean config "ConvertNestedDocsToNativeJsonFormat", with a default of false, to enable this conversion.
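A rough sketch of how the flag might be set from PySpark, assuming it is passed alongside the connector's other options (`Endpoint`, `Masterkey`, `Database`, `Collection`). The `build_cosmos_config` helper and all placeholder values are hypothetical, and passing the flag as the string "true" in the same options map is an assumption rather than something this PR description confirms.

```python
def build_cosmos_config(endpoint, master_key, database, collection,
                        convert_nested_docs=False):
    """Build an options dict for the Cosmos DB Spark connector (hypothetical helper).

    Spark options are passed as strings, so the boolean is lowercased.
    """
    return {
        "Endpoint": endpoint,
        "Masterkey": master_key,
        "Database": database,
        "Collection": collection,
        # Config from this PR; defaults to false when not set.
        "ConvertNestedDocsToNativeJsonFormat": str(convert_nested_docs).lower(),
    }


config = build_cosmos_config(
    endpoint="https://<account>.documents.azure.com:443/",  # placeholder
    master_key="<master-key>",                              # placeholder
    database="<database>",                                  # placeholder
    collection="<collection>",                              # placeholder
    convert_nested_docs=True,
)

# Assumed usage with an existing SparkSession named `spark`:
# df = (spark.read
#       .format("com.microsoft.azure.cosmosdb.spark")
#       .options(**config)
#       .load())
```

Leaving `convert_nested_docs` at its default keeps the existing behavior, which matches the config's default of false.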