This was found through a use case in Genomix. Should it be an easy fix (detecting
duplicates after the records are sorted in bulk load)?
Original comment by che...@gmail.com
on 16 Apr 2013 at 7:10
We have this already. The bulk load operator has a parameter called
"verifyInput" which will check that tuple i+1 is strictly greater than tuple i.
This means that any duplicate will cause an exception to be thrown.
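
To illustrate the kind of check described here, below is a minimal standalone sketch in Java. It is not the actual bulk load operator code: the class name SortedInputVerifier, the method verifyStrictlyIncreasing, and the use of IllegalStateException are all hypothetical and chosen only for illustration. It assumes the tuples are available as a list and can be compared with a Comparator.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical illustration only; this is NOT the real bulk load operator.
public class SortedInputVerifier {

    /**
     * Checks that every tuple is strictly greater than the one before it.
     * A duplicate (compare == 0) or an out-of-order tuple (compare > 0)
     * is reported by throwing an IllegalStateException.
     */
    public static <T> void verifyStrictlyIncreasing(List<T> tuples, Comparator<? super T> cmp) {
        for (int i = 1; i < tuples.size(); i++) {
            int c = cmp.compare(tuples.get(i - 1), tuples.get(i));
            if (c == 0) {
                throw new IllegalStateException("Duplicate tuple at position " + i);
            }
            if (c > 0) {
                throw new IllegalStateException("Input is not sorted at position " + i);
            }
        }
    }

    public static void main(String[] args) {
        // Sorted input without duplicates passes silently.
        verifyStrictlyIncreasing(Arrays.asList(1, 2, 3), Comparator.<Integer>naturalOrder());

        // Sorted input with a duplicate is rejected.
        try {
            verifyStrictlyIncreasing(Arrays.asList(1, 2, 2), Comparator.<Integer>naturalOrder());
        } catch (IllegalStateException e) {
            System.out.println("Rejected: " + e.getMessage());
        }
    }
}
```

Because the comparison is strict, a single pass over the sorted input catches both duplicates and ordering mistakes, so no separate duplicate-detection step is needed after the sort.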
Original comment by zheilb...@gmail.com
on 16 Apr 2013 at 7:23
If your input is sorted (and it should be), then, as Zack said, passing true
for verifyInput makes the bulk load throw an exception when it detects a
duplicate. Can you double-check that:
1) the input stream is sorted, and
2) the verifyInput parameter that is passed to the bulk load operator is set
to true?
Original comment by salsuba...@gmail.com
on 16 Apr 2013 at 7:50
Ok, sounds good. Let me close this issue.
Original comment by buyingyi@gmail.com
on 16 Apr 2013 at 7:52
Original issue reported on code.google.com by
buyingyi@gmail.com
on 16 Apr 2013 at 6:54