Closed tub closed 6 years ago
Thanks, looks awesome! I'd like to write few tests for current version before merging PR to make sure nothing is broken, using https://www.testcontainers.org. Probably will do it on a weekend.
I've added a bunch of JUnit unit-tests around the code we've changed. Not yet looked into the testcontainers setup, but it looks interesting. The unit tests required a bit of refactoring in order to pass dependencies in.
We (myself and @ronin13) have also worked on cleaning up the zookeeper server object once zkcopy is finished, and raised the session timeout so that large transactions and reads don't fail.
I've just added an initial testcontainers integration test. Really nice library, thanks for the headsup @kshchepanovskyi The new itest highlighted that the connections to the source ZK were still not being closed so I've scrapped the finalize() implementation and exposed a way of closing the ZooKeepers from the ThreadFactory.
This PR adds transaction support to ZK copy, allowing the user to specify the number of operations in the transaction with -b/--batchSize.
We've tested this with up to 10,000 operations per transaction with 100,000 nodes and seen around a 6x write speedup.
Another optimization for users wanting to perform a copy multiple times is the -m/--mtime flag, which allows zkcopy to ignore nodes older than the specified mtime, skipping any writes that may have been performed by a previous run of zkcopy. To aid this, zkcopy logs the maximum mtime of the copied nodes at runtime.
This PR also refactors the CLI arg parsing to use picocil, removing a lot of manual parsing and type-checking.