@tibrewalpratik17 Actually, all regular controller jobs should be placed on minions. I also have a requirement to move the job that deletes tmp deep store files to minions so that it scales better.
Currently, when a minion node finishes processing a segment, it calls the controller's `uploadSegmentAsMultiPart` API. The controller processes the segment, uploads it to deepstore, and then issues a refresh call to the respective servers. https://github.com/apache/pinot/blob/4abb2d18f733781539d2d72ab75e1bb03c197489/pinot-controller/src/main/java/org/apache/pinot/controller/api/resources/PinotSegmentUploadDownloadRestletResource.java#L546
Raising this issue to gather the following context:
Why didn't we allow minion nodes to upload the files to deepstore directly (just like servers do), and then use the URI upload type so that the controller only has to refresh the servers from deepstore? https://github.com/apache/pinot/blob/4abb2d18f733781539d2d72ab75e1bb03c197489/pinot-controller/src/main/java/org/apache/pinot/controller/api/resources/PinotSegmentUploadDownloadRestletResource.java#L507
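To make the difference between the two paths concrete, here is a rough sketch of what I have in mind. It is a toy in-memory model, not Pinot code: `DeepStore`, `Controller`, `uriFor`, and the `deepstore://segments/` scheme are all hypothetical names I made up for illustration. It also deliberately ignores any reading of the segment that the real controller may still do for URI uploads (for example to extract metadata), so treat the byte counts as an upper bound on the saving.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of the two upload paths. None of these types are Pinot classes. */
class UploadPathSketch {

    /** Hypothetical stand-in for the deep store: maps a URI to segment bytes. */
    static final class DeepStore {
        private final Map<String, byte[]> files = new HashMap<>();

        void put(String uri, byte[] data) {
            files.put(uri, data);
        }

        boolean contains(String uri) {
            return files.containsKey(uri);
        }
    }

    /** Hypothetical stand-in for the controller; counts segment bytes it has to handle. */
    static final class Controller {
        private final DeepStore deepStore;
        long segmentBytesHandled = 0;
        int refreshesIssued = 0;

        Controller(DeepStore deepStore) {
            this.deepStore = deepStore;
        }

        /** Current path: the controller receives the payload, writes it to deep store, then refreshes. */
        void uploadSegmentBytes(String segmentName, byte[] data) {
            segmentBytesHandled += data.length;
            deepStore.put(uriFor(segmentName), data);
            refreshesIssued++;
        }

        /** Proposed path: the controller receives only a URI and triggers the refresh. */
        void registerSegmentUri(String uri) {
            if (!deepStore.contains(uri)) {
                throw new IllegalStateException("Segment not found in deep store: " + uri);
            }
            refreshesIssued++;
        }
    }

    /** Made-up URI layout, only for this sketch. */
    static String uriFor(String segmentName) {
        return "deepstore://segments/" + segmentName;
    }

    /** Minion sends the processed segment through the controller (present behaviour). */
    static void minionUploadViaController(Controller controller, String segmentName, byte[] data) {
        controller.uploadSegmentBytes(segmentName, data);
    }

    /** Minion writes to deep store itself and hands the controller only the URI (proposal). */
    static void minionUploadDirect(DeepStore deepStore, Controller controller,
                                   String segmentName, byte[] data) {
        String uri = uriFor(segmentName);
        deepStore.put(uri, data);
        controller.registerSegmentUri(uri);
    }

    public static void main(String[] args) {
        DeepStore deepStore = new DeepStore();
        Controller controller = new Controller(deepStore);
        byte[] segment = new byte[4096];

        minionUploadViaController(controller, "seg_0", segment);
        System.out.println("bytes through controller after current path: "
                + controller.segmentBytesHandled);

        minionUploadDirect(deepStore, controller, "seg_1", segment);
        System.out.println("bytes through controller after proposed path: "
                + controller.segmentBytesHandled);
    }
}
```

In this model the controller still issues one refresh per segment on both paths, but only the first path makes it carry the segment payload, which is the part I expect to hurt when many tables run compaction at once.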
Do we expect the controllers to come under heavy load if we continue with the present behaviour? We are trying to use UpsertCompaction at scale at Uber, and we are concerned that if we enable it for a lot of tables with the present config, the controllers will see heavy resource utilization. We currently run our clusters with very few controller nodes.
One of the ideas to resolve this: can we allow minion nodes to re-upload segments to deepstore after processing? Maybe behind a config initially, e.g. `allowDeepstoreUpload`? Or is this considered an anti-pattern? At present, minion nodes already interact with deepstore to download the segment file before processing.
cc @ankitsultana