Closed by brayanjuls 1 year ago
I think it would be interesting to add an optional parameter, ["PathRejects"], to write out the rows rejected during deduplication, in case we need to do some data quality analysis when the source contains duplicated rows.
It could also return the count of rows inserted, updated, and rejected.
@ilyasse05 - That seems like a good feature to me. Please open a new issue so we can brainstorm there how to implement it.
Duplication is still possible when the duplicate rows are within the dataframe itself and not yet in the table. For example:
Let's say we have the following table:
And we want to insert this new dataframe:
Calling the function with the following parameters will not avoid duplication in the table:
The resulting table will be:
We should also deduplicate the dataframe before trying to append the new data.
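To make the proposal concrete, here is a minimal sketch of the idea in plain Python, with lists of dicts standing in for the table and the dataframe. The function name `dedupe_then_append` and the sample rows are hypothetical and are not taken from the library or from the example above. In Spark the first step would more likely be something like `DataFrame.dropDuplicates` on the key columns before the append.

```python
from typing import Dict, List, Sequence, Tuple

Row = Dict[str, object]


def dedupe_then_append(
    table: List[Row], incoming: List[Row], keys: Sequence[str]
) -> Tuple[List[Row], List[Row]]:
    """Hypothetical helper: return (new_table, rejected_rows).

    A row is rejected if its key already exists in the table OR if an
    earlier row of the same incoming batch has the same key.
    The inputs are not mutated.
    """
    # Keys already present in the target table.
    seen = {tuple(row[k] for k in keys) for row in table}
    appended: List[Row] = []
    rejected: List[Row] = []
    for row in incoming:
        key = tuple(row[k] for k in keys)
        if key in seen:
            # Duplicate of a table row or of an earlier incoming row.
            rejected.append(row)
        else:
            # Remember the key so later incoming rows are checked against it.
            seen.add(key)
            appended.append(row)
    return table + appended, rejected


# Illustrative data only (assumed, not the tables from this issue).
table = [{"col1": 1, "col2": "A"}]
incoming = [
    {"col1": 1, "col2": "A2"},  # key already in the table
    {"col1": 2, "col2": "B"},   # new key, kept
    {"col1": 2, "col2": "B2"},  # duplicate inside the incoming batch
]
new_table, rejected = dedupe_then_append(table, incoming, keys=["col1"])
```

Because incoming keys are added to `seen` as they are accepted, the second row with `col1 = 2` is caught even though that key was never in the table, which is the case described above. Keeping the first occurrence is an arbitrary choice made for this sketch.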