vincent-octo closed 1 month ago
Yep. Gotta make a WDL where I split the files, sort the chunks, join them back, and then I can easily remove duplicate entries. The munging should preserve the order, so we can also check for duplicates after munging if needed.
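For reference, the split / sort / join / dedupe idea can be sketched outside WDL in plain Python. This is only a rough illustration, not the actual workflow: the function names (`sorted_chunks`, `merge_unique`, `dedupe_lines`) and the default chunk size are made up here, and it assumes every record is a single newline-terminated line and that exact-duplicate lines are what we want to drop.

```python
import heapq
import itertools
import os
import tempfile


def sorted_chunks(lines, chunk_size):
    """Split the input into chunks, sort each chunk, and write it to a temp file."""
    paths = []
    it = iter(lines)
    while True:
        chunk = sorted(itertools.islice(it, chunk_size))
        if not chunk:
            break
        fd, path = tempfile.mkstemp(suffix=".chunk")
        with os.fdopen(fd, "w", encoding="utf-8") as fh:
            fh.writelines(chunk)
        paths.append(path)
    return paths


def merge_unique(paths):
    """Join the sorted chunks back together, yielding each distinct line once."""
    files = [open(p, encoding="utf-8") for p in paths]
    try:
        previous = None
        # heapq.merge streams the k sorted files, so duplicates end up adjacent.
        for line in heapq.merge(*files):
            if line != previous:
                yield line
            previous = line
    finally:
        for fh in files:
            fh.close()
        for p in paths:
            os.remove(p)


def dedupe_lines(lines, chunk_size=100_000):
    """Return the distinct lines in sorted order (hypothetical helper)."""
    return list(merge_unique(sorted_chunks(lines, chunk_size)))
```

Because the chunks are merged in sorted order, identical lines sit next to each other and a single comparison with the previous line is enough to drop them. Note that the output comes back sorted rather than in the original input order.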
Closing as it's now addressed with a WDL.
One thing that Kira meant to do but didn't get to was identifying duplicate records based on a specific set of OIDs.
I believe the following columns should be looked at together to identify additional duplicate rows:
asiakirjaoid, merkintaoid, entryoid.
Maybe best to discuss this with Kira to get more clarity.
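To make the composite-key idea concrete, here is a minimal Python sketch. It assumes the rows are available as dicts keyed by column name (for example from `csv.DictReader`), that the three OID columns together identify a record, and that keeping the first occurrence is acceptable. None of that is confirmed yet, and `drop_duplicate_keys` is a made-up name for illustration.

```python
import csv
import io

# Assumption: these three columns together identify a record.
KEY_COLUMNS = ("asiakirjaoid", "merkintaoid", "entryoid")


def drop_duplicate_keys(rows, key_columns=KEY_COLUMNS):
    """Yield rows whose key-column combination has not been seen before."""
    seen = set()
    for row in rows:
        key = tuple(row[column] for column in key_columns)
        if key in seen:
            continue  # same OID combination as an earlier row: skip it
        seen.add(key)
        yield row


# Made-up sample data, only to show the call shape.
sample = io.StringIO(
    "asiakirjaoid,merkintaoid,entryoid,value\n"
    "1,10,100,first\n"
    "1,10,100,second\n"
    "1,10,101,third\n"
)
unique_rows = list(drop_duplicate_keys(csv.DictReader(sample)))
```

Here the second sample row is dropped because all three OIDs match the first row, while the third row is kept because `entryoid` differs. Whether the first or the last occurrence should win is one of the things worth confirming with Kira.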