Open lgo opened 4 years ago
This happens because we upload segments to all the controller hosts, and the idealState update requests coming from all controllers cause slowness and update conflicts.
This issue wasn't there before because the controller needed to download the segment tar and untar the metadata before doing the update, so it was a costly operation on the controller. With the segment metadata-only push mode, we may need to rethink this.
cc: @siddharthteotia @mayankshriv @snleee
Not sure what you mean by rethink. A metadata-only or URI push is a cheaper operation, so there is less likelihood of contention. Could we also make the backoff use smaller increments?
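To make the "smaller increments" idea concrete, here is a minimal sketch of a retry loop with short, jittered backoff. It is not Pinot's retry policy: `push_with_backoff`, `ConflictError`, and the default delays are hypothetical names and values chosen only for illustration.

```python
import random
import time


class ConflictError(Exception):
    """Hypothetical error standing in for a rejected idealState update."""


def push_with_backoff(push, attempts=5, base_delay_s=0.2, max_delay_s=3.0,
                      sleep=time.sleep, rng=random.random):
    """Call push() until it succeeds or the attempts run out.

    The delay doubles from base_delay_s up to max_delay_s, and each wait is
    scaled by a random factor in [0.5, 1.0) so concurrent pushers spread out.
    """
    for attempt in range(attempts):
        try:
            return push()
        except ConflictError:
            if attempt == attempts - 1:
                raise  # out of attempts, surface the conflict to the caller
            delay = min(max_delay_s, base_delay_s * (2 ** attempt))
            sleep(delay * (0.5 + 0.5 * rng()))
```

Starting from a small base delay keeps a single conflict cheap, while the jitter makes it less likely that pushers that collided once retry at the same instant and collide again.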
Here is the stripped-down jobSpec we are using, for reference:
executionFrameworkSpec:
  name: spark
  segmentMetadataPushJobRunnerClassName: org.apache.pinot.plugin.ingestion.batch.spark.SparkSegmentMetadataPushJobRunner
  segmentGenerationJobRunnerClassName: org.apache.pinot.plugin.ingestion.batch.spark.SparkSegmentGenerationJobRunner
  segmentUriPushJobRunnerClassName: org.apache.pinot.plugin.ingestion.batch.spark.SparkSegmentUriPushJobRunner
jobType: SegmentCreationAndMetadataPush
overwriteOutput: true
pinotFSSpecs:
  - scheme: s3
    className: org.apache.pinot.plugin.filesystem.S3PinotFS
    configs:
      region: ...
recordReaderSpec:
  dataFormat: ...
  className: ...
tableSpec:
  tableName: ...
pinotClusterSpecs:
  - controllerURI: ...
segmentNameGeneratorSpec:
  type: normalizedDate
  configs:
    segment.name.prefix: ...
pushJobSpec:
  segmentUriPrefix: ...
  segmentUriSuffix: ''
  pushParallelism: 5
  pushAttempts: 5
  pushRetryIntervalMillis: 3000
And a few relevant chunks from the Pinot server conf:
pinot.server.instance.enable.split.commit=true
As well as the controller conf.
controller.enable.split.commit=true
An improvement for this: https://github.com/apache/incubator-pinot/pull/6165. This will limit the idealState update parallelism to at most the number of Pinot controllers.
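As a rough illustration of how update parallelism can be capped at the number of controllers, the sketch below serializes updates per table inside a single process. It is an assumption about the general technique, not a description of what the linked PR does, and `TableUpdateSerializer` is a hypothetical name.

```python
import threading
from collections import defaultdict


class TableUpdateSerializer:
    """Allows at most one in-flight update per table within this process."""

    def __init__(self):
        self._guard = threading.Lock()              # protects the lock registry
        self._locks = defaultdict(threading.Lock)   # one lock per table name

    def run(self, table_name, update):
        # Look up (or create) the table's lock under the registry guard.
        with self._guard:
            table_lock = self._locks[table_name]
        # Other uploads for the same table wait here instead of racing.
        with table_lock:
            return update()
```

If every controller wraps its idealState update in something like `run(table_name, update)`, then for a given table there is at most one concurrent writer per controller, so contention scales with the controller count rather than with `pushParallelism`.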
Batch jobs that process lots of segments for a table often run into ZooKeeper conflicts when updating idealState. This causes contention on updates, slowing everything down. To work around that, the pushParallelism in the job spec had to be reduced to ~5 so that updates would conflict less frequently. There was another issue y'all resolved which helped ensure progress happened on conflicts.
Being able to upload segments faster would drastically reduce the burden of operating Pinot (backfilling, large segment uploads, or adjusting existing data).
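The effect of parallelism on conflicts described above can be illustrated with a small model. It assumes idealState updates are versioned compare-and-set writes (ZooKeeper rejects a `setData` whose expected version is stale) and takes the worst case where every pending writer reads the same version in each round. `VersionedNode` and `count_write_attempts` are hypothetical names, not Pinot or Helix APIs.

```python
class VersionedNode:
    """Minimal stand-in for a ZooKeeper znode with a data version."""

    def __init__(self):
        self.version = 0
        self.segments = []

    def compare_and_set(self, expected_version, segment):
        if expected_version != self.version:
            return False  # someone else wrote first (a stale-version rejection)
        self.segments.append(segment)
        self.version += 1
        return True


def count_write_attempts(num_writers):
    """Total write attempts when all pending writers collide every round."""
    node = VersionedNode()
    pending = [f"segment_{i}" for i in range(num_writers)]
    attempts = 0
    while pending:
        seen_version = node.version  # every pending writer reads this value
        still_pending = []
        for segment in pending:
            attempts += 1
            if not node.compare_and_set(seen_version, segment):
                still_pending.append(segment)  # lost the race, retry next round
        pending = still_pending
    return attempts
```

In this model only one writer wins each round, so N writers need N(N+1)/2 attempts in total: `count_write_attempts(5)` is 15 while `count_write_attempts(50)` is 1275. The wasted work grows roughly quadratically with parallelism, which is consistent with a low pushParallelism behaving better.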