We have chosen to split files not by their size but by the number of partitions. This probably does not scale fully, but it does cut the time required considerably.
Tuned the split number per scheduler setting: none, local, scheduler = 20, 1, 3
Added a few status messages that report the number of files and the file split number that was actually used.
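For reviewers, here is a rough sketch of what splitting by partition count (rather than by file size) can look like. It is not the code in this change: `split_files` and its parameters are hypothetical names used only for illustration, and the status message wording is an assumption.

```python
def split_files(files, n_partitions):
    """Split `files` into at most `n_partitions` chunks of near-equal length.

    Hypothetical helper: file sizes are deliberately ignored, only the
    number of files per chunk is balanced.
    """
    if n_partitions < 1:
        raise ValueError("n_partitions must be at least 1")
    if not files:
        return []

    # Never create more chunks than there are files.
    n = min(n_partitions, len(files))
    base, extra = divmod(len(files), n)

    chunks, start = [], 0
    for i in range(n):
        # The first `extra` chunks take one additional file each.
        size = base + (1 if i < extra else 0)
        chunks.append(files[start:start + size])
        start += size

    # Report the file count and the split number that was actually used.
    print(f"{len(files)} files split into {len(chunks)} partition(s)")
    return chunks


chunks = split_files([f"part_{i}.csv" for i in range(7)], 3)
```

Because the requested split number is capped at the number of files, the value actually used can be smaller than the configured one, which is why it is worth printing.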
Fixes #63