pingcap / tispark

TiSpark is built for running Apache Spark on top of TiDB/TiKV
Apache License 2.0

[BUG] The unique key and replace options do not take effect when using tispark to write data #2746

Closed liangjihua closed 9 months ago

liangjihua commented 11 months ago

Describe the bug: When using TiSpark to write data to TiKV with the replace option set to true, the unique-key column still ends up with duplicate values.

What did you do

DDL:

CREATE TABLE `user_info` (
  `name` varchar(64) NOT NULL,
  `address` varchar(200),
  UNIQUE KEY `name` (`name`)
);
# First write (PySpark). createOrReplaceTempView() returns None,
# so the DataFrame is kept in its own variable before registering the view.
df = spark.read.parquet('s3a://user.parquet')
df.createOrReplaceTempView('user')
df.write \
    .format("tidb") \
    .option("database", "test") \
    .option("table", "user_info") \
    .option("replace", True) \
    .mode("append") \
    .save()

# Second write of the same file into the same table.
df2 = spark.read.parquet('s3a://user.parquet')
df2.createOrReplaceTempView('user')
df2.write \
    .format("tidb") \
    .option("database", "test") \
    .option("table", "user_info") \
    .option("replace", True) \
    .mode("append") \
    .save()

After writing s3a://user.parquet into user_info twice, the records with the same name value are not replaced. Instead, the table contains two identical records.
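
To make the expected behavior concrete, here is a minimal pure-Python sketch of the semantics I expected from replace on a unique key. It is only a toy model, not TiSpark code: the write_rows helper, the dict standing in for the table, and the sample rows are all hypothetical, and raising an error on a conflict without replace is an assumption of the sketch.

```python
def write_rows(table, rows, unique_key, replace):
    """Toy model of an append write into a table with one unique key.

    table is a dict mapping each unique-key value to its row (hypothetical
    stand-in for user_info; this is not how TiSpark stores data).
    """
    for row in rows:
        key = row[unique_key]
        if key in table and not replace:
            # Assumption of this sketch: a conflict without replace is an error.
            raise ValueError(f"duplicate entry {key!r} for key {unique_key!r}")
        # With replace, the existing row for this key is overwritten.
        table[key] = dict(row)
    return table


# Hypothetical sample data standing in for the contents of user.parquet.
rows = [
    {"name": "alice", "address": "1 Main St"},
    {"name": "bob", "address": "2 Side St"},
]

user_info = {}
write_rows(user_info, rows, "name", replace=True)
write_rows(user_info, rows, "name", replace=True)  # same data written again

# Expected: still one row per name after the second write.
print(len(user_info))
```

In this model the second write leaves one row per name, which is what I expected from the real table; what I actually observe is two rows per name.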

github-actions[bot] commented 10 months ago

This issue is stale because it has been open for 30 days with no activity.

github-actions[bot] commented 9 months ago

This issue was closed because it has been inactive for 14 days since being marked as stale.