This error
{"ClassName":"BulkOperationFailedException","userAgent":"azsdk-java-cosmos/4.32.0-snapshot.1 Linux/5.4.0-1080-azure JRE/1.8.0_302","statusCode":**409**,"resourceAddress":null,"innerErrorMessage":"All retries exhausted for 'UPSERT' bulk operation - statusCode**=[409:0]** itemId=[e75841c9-534b-423f-b397-afbc2ad7d427], partitionKeyValue=
for an upsert operation means that a unique key constraint is configured on your container and at least one of the documents you are trying to insert violates that unique key constraint.
--> see this section of the Ingestion best practices for more details.
You need to fix the input data so that it no longer violates the unique key constraint before you can ingest it.
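As a rough illustration of fixing the input data, the sketch below assumes a hypothetical container whose unique key path is /userId and whose partition key path is /pk (both names are assumptions, not taken from this issue); deduplicating the source DataFrame on those columns before the bulk write removes duplicate rows within the batch that would collide on the unique key:

```python
# Hypothetical column names: "pk" maps to the partition key path /pk and
# "userId" maps to the unique key path /userId. Replace them with the paths
# defined in your container's uniqueKeyPolicy.
deduped_df = df.dropDuplicates(["pk", "userId"])

# Bulk upsert the deduplicated rows, reusing the connector configuration
# (COSMOS_CFG) shown in the report below.
(deduped_df
    .write
    .format("cosmos.oltp")
    .options(**COSMOS_CFG)
    .mode("APPEND")
    .save())
```

Note that this only removes conflicts within the batch itself; documents already stored in the container with the same unique key value would still cause 409 conflicts and have to be reconciled separately.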
I got "Writing job aborted" error when upserting data into a Cosmos DB
Version: com.azure.cosmos.spark:azure-cosmos-spark_3-2_2-12:4.11.1
Configuration:

```python
COSMOS_CFG = {
    "spark.cosmos.accountEndpoint": COSMOS_ENDPOINT,
    "spark.cosmos.accountKey": COSMOS_MASTERKEY,
    "spark.cosmos.database": COSMOS_DATABASE,
    "spark.cosmos.container": COSMOS_CONTAINER,
    "spark.cosmos.write.strategy": "ItemOverwrite",
    "spark.cosmos.write.bulk.enabled": "true",
    "spark.cosmos.throughputControl.name": COSMOS_CONTAINER + "DataIngestion",
    "spark.cosmos.throughputControl.targetThroughputThreshold": "0.95",
    "spark.cosmos.throughputControl.globalControl.database": COSMOS_DATABASE,
    "spark.cosmos.throughputControl.globalControl.container": "ThroughputControl",
}
```
Function:

```python
def update_entities(df):
    print("Starting ingestion: ", datetime.datetime.utcnow().strftime("%Y-%m-%d %H:%M:%S.%f"))
    (df
        .write
        .format("cosmos.oltp")
        .options(**COSMOS_CFG)
        .mode("APPEND")
        .save())
    print("Finished ingestion: ", datetime.datetime.utcnow().strftime("%Y-%m-%d %H:%M:%S.%f"))
    return
```
Error:
Py4JJavaError Traceback (most recent call last)
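One way to confirm which unique key paths are configured on the container is to read its properties with the azure-cosmos Python SDK. This is a minimal sketch, assuming the SDK is installed and that COSMOS_ENDPOINT, COSMOS_MASTERKEY, COSMOS_DATABASE, and COSMOS_CONTAINER hold the same values used in COSMOS_CFG:

```python
from azure.cosmos import CosmosClient

# Assumption: these variables hold the same account/database/container values
# referenced in COSMOS_CFG above.
client = CosmosClient(COSMOS_ENDPOINT, credential=COSMOS_MASTERKEY)
container = client.get_database_client(COSMOS_DATABASE).get_container_client(COSMOS_CONTAINER)

# container.read() returns the container's properties, including the
# uniqueKeyPolicy that the bulk upsert is conflicting with.
properties = container.read()
print(properties.get("uniqueKeyPolicy", {}))
```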