Closed: bithw1 closed this issue 2 weeks ago
@bithw1 because you set a precombine field, you can set:
set hoodie.combine.before.insert = false;
Eliminate the primary key definition; that is what we call a pk-less table.
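For illustration, a minimal pk-less sketch might look like this (table and column names are hypothetical, not from the thread):

-- no primaryKey in tblproperties, so Hudi treats this as a pk-less table
-- and insert into appends rows without de-duplicating on any key
create table t_pkless (
  id int,
  name string,
  ts bigint
) using hudi;

insert into t_pkless values (1, 'a', 1000);
insert into t_pkless values (1, 'b', 1001);
-- both rows are kept, even though id is the same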
> @bithw1 because you set a precombine field, you can set:
> set hoodie.combine.before.insert = false;
Thanks, but it doesn't work for me... actually, hoodie.combine.before.insert is false by default:
public static final ConfigProperty<String> COMBINE_BEFORE_INSERT = ConfigProperty
.key("hoodie.combine.before.insert")
.defaultValue("false")
.markAdvanced()
.withDocumentation("When inserted records share same key, controls whether they should be first combined (i.e de-duplicated) before"
+ " writing to storage.");
> Eliminate the primary key definition; that is what we call a pk-less table.
Thanks, I tried it and it works for me!
So, can I conclude that with a pk definition and a precombine field, the insert operation will work like upsert?
@bithw1 the default value will be modified at runtime if it is not specified explicitly.
> @bithw1 the default value will be modified at runtime if it is not specified explicitly.
@KnightChess I have set this option explicitly per your guide, but I still see the same result (one updated record instead of two records).
@bithw1 my mistake, hoodie.combine.before.insert only controls de-duplication within a single incoming batch of records, and your inserts are two separate SQL statements. You can set:
set hoodie.spark.sql.insert.into.operation = insert;
for a pk table too.
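A minimal sketch of this fix, assuming a hypothetical pk table t_pk created with tblproperties (primaryKey = 'id', preCombineField = 'ts'):

-- force plain insert semantics even on a pk table with a precombine field
set hoodie.spark.sql.insert.into.operation = insert;
insert into t_pk values (1, 'a', 1000);
insert into t_pk values (1, 'b', 1001);
-- both rows with id = 1 are now kept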
Thanks @KnightChess
Hi,
I am using Hudi 0.15.0. In the spark-sql CLI, I insert two records with the same id using two separate insert statements. I expected two records to be stored, but only one is left, so it looks like Hudi does an upsert instead of an insert here.
The default behavior of insert into is insert, so I don't understand how an upsert happens here.
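For reference, a minimal sketch that reproduces this behavior (the DDL is an assumption; the issue text does not include it):

create table t_pk (
  id int,
  name string,
  ts bigint
) using hudi
tblproperties (
  primaryKey = 'id',
  preCombineField = 'ts'
);

insert into t_pk values (1, 'a', 1000);
insert into t_pk values (1, 'b', 1001);
-- only one row with id = 1 survives: per the discussion above, when both
-- primaryKey and preCombineField are set, Hudi modifies the effective
-- insert into operation to upsert at runtime unless it is set explicitly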