Open phact opened 2 years ago
Related: #448.
I don't think tuning splits would make a big difference, and btw, that's near impossible, since the splits determine how many token ranges are going to be read, so this happens at a very early phase.
But tuning throughput, yes, definitely. Probably based on latencies, and probably governed by a high/low watermark system.
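To make the high/low watermark idea concrete, here is a minimal sketch, not dsbulk code. The class name, the thresholds, and the multiplicative back-off and ramp-up factors are all assumptions chosen for illustration. The idea is to cut the allowed rate when observed latency crosses the high watermark, raise it slowly when latency drops below the low watermark, and leave it alone in between.

```python
from dataclasses import dataclass


@dataclass
class WatermarkThrottle:
    """Hypothetical latency-driven rate controller (illustrative only)."""

    rate: float                  # currently allowed operations per second
    low_ms: float = 50.0         # low watermark: below this, speed up
    high_ms: float = 500.0       # high watermark: above this, back off
    min_rate: float = 100.0      # never throttle below this rate
    max_rate: float = 100_000.0  # never ramp up beyond this rate
    backoff: float = 0.5         # multiplier applied when latency is too high
    ramp_up: float = 1.1         # multiplier applied when latency is low

    def observe(self, latency_ms: float) -> float:
        """Feed one latency measurement and return the updated rate."""
        if latency_ms > self.high_ms:
            self.rate = max(self.min_rate, self.rate * self.backoff)
        elif latency_ms < self.low_ms:
            self.rate = min(self.max_rate, self.rate * self.ramp_up)
        # Between the two watermarks the rate is left unchanged,
        # which avoids oscillating around a single threshold.
        return self.rate
```

The gap between the two watermarks is what keeps the rate from flapping. A real implementation would probably feed it a latency percentile over a time window rather than single samples.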
> I don't think tuning splits would make a big difference
It does. This is how I've had to do things many times when dsbulk unload fails.
The reason is usually a big partition; smaller splits can help the unload actually finish. Sometimes, if that doesn't do the trick, we end up having to bisect the range around it and then throttle.
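For reference, the manual bisecting step looks roughly like the sketch below. This is not something dsbulk does today. The function name and the `max_width` parameter are hypothetical, ranges are assumed to be start-exclusive and end-inclusive with no wrap-around, and the token of the big partition is assumed to be already known.

```python
def bisect_around(start: int, end: int, hot_token: int, max_width: int):
    """Split the token range (start, end] until the piece holding hot_token
    is at most max_width wide.

    Returns (ranges, hot_range): all resulting ranges in ascending order,
    plus the narrow range that still contains hot_token.
    """
    if max_width < 1:
        raise ValueError("max_width must be at least 1")
    if not start < hot_token <= end:
        raise ValueError("hot_token must lie inside (start, end]")

    done = []  # halves that do not contain the hot token
    while end - start > max_width:
        mid = start + (end - start) // 2
        if hot_token <= mid:
            done.append((mid, end))    # upper half is clean, set it aside
            end = mid
        else:
            done.append((start, mid))  # lower half is clean, set it aside
            start = mid
    hot_range = (start, end)
    return sorted(done + [hot_range]), hot_range
```

The clean ranges can then be unloaded at normal speed, and only the narrow range around the big partition needs to be throttled.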
Users dumping entire tables often hit timeouts when they reach large partitions. The solution is to manually tune splits and throughput until the unload works, but this is very time-consuming and error-prone.
Would be great if dsbulk could handle this common scenario by itself.