apache / pinot

Apache Pinot - A realtime distributed OLAP datastore
https://pinot.apache.org/
Apache License 2.0

Issue with segment commit while copying segments from deepstore #13508

Open vineethvp opened 3 days ago

vineethvp commented 3 days ago

I'm using S3 as the deep store for a Pinot cluster.

Segments are uploaded directly from the servers to S3. The following settings are used in the table config:

"segmentsConfig": { "peerSegmentDownloadScheme": "https" }

"tableIndexConfig": { "loadMode": "MMAP", "streamConfigs": {"realtime.segment.serverUploadToDeepStore": "true"} }
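For readability, the two fragments above can be assembled into a single JSON document. The sketch below is only an illustration: it contains just the keys quoted in this issue, and a real table config needs more fields (table name, stream settings, and so on) that are not shown in this thread. The helper name `server_upload_enabled` is hypothetical.

```python
import json

# Only the keys quoted in this issue; a real table config has more fields.
table_config_fragment = {
    "segmentsConfig": {
        "peerSegmentDownloadScheme": "https",
    },
    "tableIndexConfig": {
        "loadMode": "MMAP",
        "streamConfigs": {
            "realtime.segment.serverUploadToDeepStore": "true",
        },
    },
}


def server_upload_enabled(config: dict) -> bool:
    """Hypothetical helper: is server-side deep store upload switched on?"""
    stream = config.get("tableIndexConfig", {}).get("streamConfigs", {})
    value = stream.get("realtime.segment.serverUploadToDeepStore", "false")
    return value.lower() == "true"


print(json.dumps(table_config_fragment, indent=2))
print(server_upload_enabled(table_config_fragment))
```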

The error below occurs during segment commit. Segments are being created in S3, but copying them back to the controller appears to fail.

Response to segmentCommitEndWithMetadata for segment:kpi_events3020240628T0911Z is:{"status":"FAILED","buildTimeSec":-1,"streamPartitionMsgOffset":null,"isSplitCommitType":true}
Handled request from 10.3.142.240 POST http://pinot-controller-0.pinot-controller-headless.pinot-config-alerts.svc.cluster.local:9000/segmentCommitEndWithMetadata?segmentSizeBytes=7601023&reason=rowLimit&buildTimeMillis=1066&streamPartitionMsgOffset=407132319&instance=Server_pinot-server-2.pinot-server-headless.pinot-config-alerts.svc.cluster.local_8098&offset=-1&name=kpi_events__3__0__20240628T0911Z&location=s3%3A%2F%2Fags-thd-prod-us1-config-alerts-ds%2Fpinot-data%2Fpinot-config-alerts%2Fcontroller-data%2Fkpi_events%2Fkpi_events__3__0__20240628T0911Z.tmp.b5590faa-1f5b-467f-a536-09a944c24017&rowCount=100000&memoryUsedBytes=35377510, content-type multipart/form-data; boundary=Hf0oVu8KI_TKshp68D4KwxAhdhwDNHpDepHM status code 200 OK
Processing segmentCommitEndWithMetadata:Offset: -1,Segment name: kpi_events30020240628T0911Z,Instance Id: Server_pinot-server-0.pinot-server-headless.pinot-config-alerts.svc.cluster.local_8098,Reason: rowLimit,NumRows: 100000,BuildTimeMillis: 1598,WaitTimeMillis: 0,ExtraTimeSec: -1,SegmentLocation: s3://ags-thd-prod-us1-config-alerts-ds/pinot-data/pinot-config-alerts/controller-data/kpi_events/kpi_events30020240628T0911Z.tmp.0c1b5742-e24b-4d58-8875-8cca86e57a39,MemoryUsedBytes: 35375741,SegmentSizeBytes: 7694182,StreamPartitionMsgOffset: 419289205
Processing segmentCommitEnd(Server_pinot-server-0.pinot-server-headless.pinot-config-alerts.svc.cluster.local_8098, 419289205)
Committing segment kpi_events30020240628T0911Z at offset 419289205 winner Server_pinot-server-0.pinot-server-headless.pinot-config-alerts.svc.cluster.local_8098
Committing segment file for segment: kpi_events30020240628T0911Z
Source s3://ags-thd-prod-us1-config-alerts-ds/pinot-data/pinot-config-alerts/controller-data/kpi_events/kpi_events30020240628T0911Z.tmp.0c1b5742-e24b-4d58-8875-8cca86e57a39 does not exist
Caught exception while committing segment file for segment: kpi_events300__20240628T0911Z
java.lang.IllegalStateException: Failed to move segment file for segment: kpi_events300__20240628T0911Z from: s3://ags-thd-prod-us1-config-alerts-ds/pinot-data/pinot-config-alerts/controller-data/kpi_events/kpi_events30020240628T0911Z.tmp.0c1b5742-e24b-4d58-8875-8cca86e57a39 to: file:/var/pinot/controller/data/kpi_events/kpi_events30020240628T0911Z
    at org.apache.pinot.shaded.com.google.common.base.Preconditions.checkState(Preconditions.java:857) ~[pinot-all-1.2.0-SNAPSHOT-jar-with-dependencies.jar:1.2.0-SNAPSHOT-87f0c8089fd35eadcbbf29249a8993a47969d594]
    at org.apache.pinot.controller.helix.core.realtime.PinotLLCRealtimeSegmentManager.moveSegmentFile(PinotLLCRealtimeSegmentManager.java:1749) ~[pinot-all-1.2.0-SNAPSHOT-jar-with-dependencies.jar:1.2.0-SNAPSHOT-87f0c8089fd35eadcbbf29249a8993a47969d594]
    at org.apache.pinot.controller.helix.core.realtime.PinotLLCRealtimeSegmentManager.commitSegmentFile(PinotLLCRealtimeSegmentManager.java:476) ~[pinot-all-1.2.0-SNAPSHOT-jar-with-dependencies.jar:1.2.0-SNAPSHOT-87f0c8089fd35eadcbbf29249a8993a47969d594]
    at org.apache.pinot.controller.helix.core.realtime.SegmentCompletionManager$SegmentCompletionFSM.commitSegment(SegmentCompletionManager.java:1049) ~[pinot-all-1.2.0-SNAPSHOT-jar-with-dependencies.jar:1.2.0-SNAPSHOT-87f0c8089fd35eadcbbf29249a8993a47969d594]
    at org.apache.pinot.controller.helix.core.realtime.SegmentCompletionManager$SegmentCompletionFSM.segmentCommitEnd(SegmentCompletionManager.java:627) ~[pinot-all-1.2.0-SNAPSHOT-jar-with-dependencies.jar:1.2.0-SNAPSHOT-87f0c8089fd35eadcbbf29249a8993a47969d594]
    at org.apache.pinot.controller.helix.core.realtime.SegmentCompletionManager.segmentCommitEnd(SegmentCompletionManager.java:295) ~[pinot-all-1.2.0-SNAPSHOT-jar-with-dependencies.jar:1.2.0-SNAPSHOT-87f0c8089fd35eadcbbf29249a8993a47969d594]
    at org.apache.pinot.controller.api.resources.LLCSegmentCompletionHandlers.segmentCommitEndWithMetadata(LLCSegmentCompletionHandlers.java:396) ~[pinot-all-1.2.0-SNAPSHOT-jar-with-dependencies.jar:1.2.0-SNAPSHOT-87f0c8089fd35eadcbbf29249a8993a47969d594]
    at jdk.internal.reflect.GeneratedMethodAccessor182.invoke(Unknown Source) ~[?:?]
    at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:?]
    at java.base/java.lang.reflect.Method.invoke(Method.java:566) ~[?:?]
    at org.glassfish.jersey.server.model.internal.ResourceMethodInvocationHandlerFactory.lambda$static$0(ResourceMethodInvocationHandlerFactory.java:52) ~[pinot-all-1.2.0-SNAPSHOT-jar-with-dependencies.jar:1.2.0-SNAPSHOT-87f0c8089fd35eadcbbf29249a8993a47969d594]
    at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher$1.run(AbstractJavaResourceMethodDispatcher.java:146) [pinot-all-1.2.0-SNAPSHOT-jar-with-dependencies.jar:1.2.0-SNAPSHOT-87f0c8089fd35eadcbbf29249a8993a47969d594]
    at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher.invoke(AbstractJavaResourceMethodDispatcher.java:189) [pinot-all-1.2.0-SNAPSHOT-jar-with-dependencies.jar:1.2.0-SNAPSHOT-87f0c8089fd35eadcbbf29249a8993a47969d594]
    at org.glassfish.jersey.server.model.internal.JavaResourceMethodDispatcherProvider$TypeOutInvoker.doDispatch(JavaResourceMethodDispatcherProvider.java:219) [pinot-all-1.2.0-SNAPSHOT-jar-with-dependencies.jar:1.2.0-SNAPSHOT-87f0c8089fd35eadcbbf29249a8993a47969d594]
    at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher.dispatch(AbstractJavaResourceMethodDispatcher.java:93) [pinot-all-1.2.0-SNAPSHOT-jar-with-dependencies.jar:1.2.0-SNAPSHOT-87f0c8089fd35eadcbbf29249a8993a47969d594]
    at org.glassfish.jersey.server.model.ResourceMethodInvoker.invoke(ResourceMethodInvoker.java:478) [pinot-all-1.2.0-SNAPSHOT-jar-with-dependencies.jar:1.2.0-SNAPSHOT-87f0c8089fd35eadcbbf29249a8993a47969d594]
    at org.glassfish.jersey.server.model.ResourceMethodInvoker.apply(ResourceMethodInvoker.java:400) [pinot-all-1.2.0-SNAPSHOT-jar-with-dependencies.jar:1.2.0-SNAPSHOT-87f0c8089fd35eadcbbf29249a8993a47969d594]
    at org.glassfish.jersey.server.model.ResourceMethodInvoker.apply(ResourceMethodInvoker.java:81) [pinot-all-1.2.0-SNAPSHOT-jar-with-dependencies.jar:1.2.0-SNAPSHOT-87f0c8089fd35eadcbbf29249a8993a47969d594]
    at org.glassfish.jersey.server.ServerRuntime$1.run(ServerRuntime.java:256) [pinot-all-1.2.0-SNAPSHOT-jar-with-dependencies.jar:1.2.0-SNAPSHOT-87f0c8089fd35eadcbbf29249a8993a47969d594]
    at org.glassfish.jersey.internal.Errors$1.call(Errors.java:248) [pinot-all-1.2.0-SNAPSHOT-jar-with-dependencies.jar:1.2.0-SNAPSHOT-87f0c8089fd35eadcbbf29249a8993a47969d594]
    at org.glassfish.jersey.internal.Errors$1.call(Errors.java:244) [pinot-all-1.2.0-SNAPSHOT-jar-with-dependencies.jar:1.2.0-SNAPSHOT-87f0c8089fd35eadcbbf29249a8993a47969d594]
    at org.glassfish.jersey.internal.Errors.process(Errors.java:292) [pinot-all-1.2.0-SNAPSHOT-jar-with-dependencies.jar:1.2.0-SNAPSHOT-87f0c8089fd35eadcbbf29249a8993a47969d594]
    at org.glassfish.jersey.internal.Errors.process(Errors.java:274) [pinot-all-1.2.0-SNAPSHOT-jar-with-dependencies.jar:1.2.0-SNAPSHOT-87f0c8089fd35eadcbbf29249a8993a47969d594]
    at org.glassfish.jersey.internal.Errors.process(Errors.java:244) [pinot-all-1.2.0-SNAPSHOT-jar-with-dependencies.jar:1.2.0-SNAPSHOT-87f0c8089fd35eadcbbf29249a8993a47969d594]
    at org.glassfish.jersey.process.internal.RequestScope.runInScope(RequestScope.java:265) [pinot-all-1.2.0-SNAPSHOT-jar-with-dependencies.jar:1.2.0-SNAPSHOT-87f0c8089fd35eadcbbf29249a8993a47969d594]
    at org.glassfish.jersey.server.ServerRuntime.process(ServerRuntime.java:235) [pinot-all-1.2.0-SNAPSHOT-jar-with-dependencies.jar:1.2.0-SNAPSHOT-87f0c8089fd35eadcbbf29249a8993a47969d594]
    at org.glassfish.jersey.server.ApplicationHandler.handle(ApplicationHandler.java:684) [pinot-all-1.2.0-SNAPSHOT-jar-with-dependencies.jar:1.2.0-SNAPSHOT-87f0c8089fd35eadcbbf29249a8993a47969d594]
    at org.glassfish.jersey.grizzly2.httpserver.GrizzlyHttpContainer.service(GrizzlyHttpContainer.java:356) [pinot-all-1.2.0-SNAPSHOT-jar-with-dependencies.jar:1.2.0-SNAPSHOT-87f0c8089fd35eadcbbf29249a8993a47969d594]
    at org.glassfish.grizzly.http.server.HttpHandler$1.run(HttpHandler.java:200) [pinot-all-1.2.0-SNAPSHOT-jar-with-dependencies.jar:1.2.0-SNAPSHOT-87f0c8089fd35eadcbbf29249a8993a47969d594]
    at org.glassfish.grizzly.threadpool.AbstractThreadPool$Worker.doWork(AbstractThreadPool.java:569) [pinot-all-1.2.0-SNAPSHOT-jar-with-dependencies.jar:1.2.0-SNAPSHOT-87f0c8089fd35eadcbbf29249a8993a47969d594]
    at org.glassfish.grizzly.threadpool.AbstractThreadPool$Worker.run(AbstractThreadPool.java:549) [pinot-all-1.2.0-SNAPSHOT-jar-with-dependencies.jar:1.2.0-SNAPSHOT-87f0c8089fd35eadcbbf29249a8993a47969d594]
    at java.base/java.lang.Thread.run(Thread.java:829) [?:?]
Removing FSM (if present):{kpi_events300__20240628T0911Z,ABORTED,1719575849995,Server_pinot-server-0.pinot-server-headless.pinot-config-alerts.svc.cluster.local_8098,419289205,http://pinot-controller-0.pinot-controller-headless.pinot-config-alerts.svc.cluster.local:9000}
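The `location` query parameter in the `segmentCommitEndWithMetadata` request is percent-encoded, which makes the source URI hard to read in the raw log. A minimal sketch, using only the Python standard library, decodes the request URL copied from the log above; the function name `commit_params` is hypothetical and not part of Pinot.

```python
from urllib.parse import parse_qs, urlsplit

# Request URL copied verbatim from the "Handled request" log line.
REQUEST_URL = (
    "http://pinot-controller-0.pinot-controller-headless.pinot-config-alerts"
    ".svc.cluster.local:9000/segmentCommitEndWithMetadata"
    "?segmentSizeBytes=7601023&reason=rowLimit&buildTimeMillis=1066"
    "&streamPartitionMsgOffset=407132319"
    "&instance=Server_pinot-server-2.pinot-server-headless.pinot-config-alerts"
    ".svc.cluster.local_8098"
    "&offset=-1&name=kpi_events__3__0__20240628T0911Z"
    "&location=s3%3A%2F%2Fags-thd-prod-us1-config-alerts-ds%2Fpinot-data"
    "%2Fpinot-config-alerts%2Fcontroller-data%2Fkpi_events"
    "%2Fkpi_events__3__0__20240628T0911Z.tmp.b5590faa-1f5b-467f-a536-09a944c24017"
    "&rowCount=100000&memoryUsedBytes=35377510"
)


def commit_params(url: str) -> dict:
    """Hypothetical helper: return the decoded query parameters as a flat dict."""
    query = urlsplit(url).query
    # parse_qs percent-decodes values and returns a list per key; keep the first.
    return {key: values[0] for key, values in parse_qs(query).items()}


params = commit_params(REQUEST_URL)
print(params["name"])
print(params["location"])
print(urlsplit(params["location"]).scheme)
```

The decoded `location` is the temporary `.tmp.<uuid>` object the server reports having written to S3, which is the object the controller then tries to move.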

vineethvp commented 2 days ago

After making some config changes, I'm getting the error below. The segments are being uploaded to the S3 deep store from the server, but the commit is failing at the controller.

@Jackie-Jiang Is there a reason why the Pinot controller tries to download segments from S3 after realtime.segment.serverUploadToDeepStore is enabled?

Below is my server config:

configs: |-
  pinot.set.instance.id.to.hostname=true
  pinot.server.instance.realtime.alloc.offheap=true
  pinot.server.instance.enable.split.commit=true
  realtime.segment.serverUploadToDeepStore=true
  pinot.server.segment.fetcher.auth.token=
  pinot.server.segment.uploader.auth.token=
  pinot.server.instance.auth.token=
  pinot.server.storage.factory.s3.disableAcl=false
  pinot.server.storage.factory.class.s3=org.apache.pinot.plugin.filesystem.S3PinotFS
  pinot.server.instance.segment.store.uri = s3://ags-thd-prod-us1-config-alerts-ds/pinot-data/pinot-config-alerts/controller-data
  pinot.server.storage.factory.s3.region=us-east-1
  pinot.server.segment.fetcher.protocols=file,http,s3
  pinot.server.storage.factory.s3.httpclient.maxConnections=1000
  pinot.server.storage.factory.s3.httpclient.socketTimeout=30s
  pinot.server.storage.factory.s3.httpclient.connectionTimeout=2s
  pinot.server.storage.factory.s3.httpclient.connectionAcquisitionTimeout=10s
  pinot.server.segment.fetcher.s3.class=org.apache.pinot.common.utils.fetcher.PinotFSSegmentFetcher
  pinot.query.scheduler.accounting.factory.name=org.apache.pinot.core.accounting.PerQueryCPUMemAccountantFactory
  pinot.query.scheduler.accounting.enable.thread.memory.sampling=true
  pinot.query.scheduler.accounting.enable.thread.cpu.sampling=true
  pinot.query.scheduler.accounting.oom.enable.killing.query=true
  pinot.query.scheduler.accounting.publishing.jvm.heap.usage=true

Controller config:

pinot.set.instance.id.to.hostname=true
controller.task.scheduler.enabled=true
controller.disable.ingestion.groovy=false
controller.segment.fetcher.auth.token=
controller.local.temp.dir=/var/pinot/controller/data
controller.enable.split.commit=true
controller.realtime.segment.deepStoreUploadRetryEnabled=true
pinot.controller.storage.factory.s3.disableAcl=false
pinot.controller.storage.factory.class.s3=org.apache.pinot.plugin.filesystem.S3PinotFS
pinot.controller.storage.factory.s3.region=us-east-1
pinot.controller.storage.factory.s3.httpclient.maxConnections=100
pinot.controller.storage.factory.s3.httpclient.socketTimeout=30s
pinot.controller.storage.factory.s3.httpclient.connectionTimeout=2s
pinot.controller.storage.factory.s3.httpclient.connectionAcquisitionTimeout=10s
pinot.controller.segment.fetcher.protocols=file,http,s3
pinot.controller.segment.fetcher.s3.class=org.apache.pinot.common.utils.fetcher.PinotFSSegmentFetcher
controller.allow.hlc.tables=false
pinot.multistage.engine.enabled=true
pinot.query.scheduler.accounting.factory.name=org.apache.pinot.core.accounting.PerQueryCPUMemAccountantFactory
pinot.query.scheduler.accounting.enable.thread.memory.sampling=true
pinot.query.scheduler.accounting.enable.thread.cpu.sampling=true
pinot.query.scheduler.accounting.oom.enable.killing.query=true
pinot.query.scheduler.accounting.publishing.jvm.heap.usage=true

Below is the error log I'm getting from the controller.

Committing segment kpi_events_s55020240630T0312Z at offset 405198329 winner Server_pinot-server-0.pinot-server-headless.pinot-config-alerts.svc.cluster.local_8098
Committing segment file for segment: kpi_events_s55020240630T0312Z
Source s3://ags-thd-prod-us1-config-alerts-ds/pinot-data/pinot-config-alerts/controller-data/kpi_events_s5/kpi_events_s550__20240630T0312Z.tmp.a164edb5-3531-4fb1-aa87-1fc8d22a03ed does not exist
Caught exception while committing segment file for segment: kpi_events_s55020240630T0312Z
java.lang.IllegalStateException: Failed to move segment file for segment: kpi_events_s55020240630T0312Z from: s3://ags-thd-prod-us1-config-alerts-ds/pinot-data/pinot-config-alerts/controller-data/kpi_events_s5/kpi_events_s55020240630T0312Z.tmp.a164edb5-3531-4fb1-aa87-1fc8d22a03ed to: file:/var/pinot/controller/data/kpi_events_s5/kpi_events_s55020240630T0312Z
    at org.apache.pinot.shaded.com.google.common.base.Preconditions.checkState(Preconditions.java:857) ~[pinot-all-1.2.0-SNAPSHOT-jar-with-dependencies.jar:1.2.0-SNAPSHOT-7062133323f573a3452d3cb4baaee40bdb408e47]
    at org.apache.pinot.controller.helix.core.realtime.PinotLLCRealtimeSegmentManager.moveSegmentFile(PinotLLCRealtimeSegmentManager.java:1749) ~[pinot-all-1.2.0-SNAPSHOT-jar-with-dependencies.jar:1.2.0-SNAPSHOT-7062133323f573a3452d3cb4baaee40bdb408e47]
    at org.apache.pinot.controller.helix.core.realtime.PinotLLCRealtimeSegmentManager.commitSegmentFile(PinotLLCRealtimeSegmentManager.java:476) ~[pinot-all-1.2.0-SNAPSHOT-jar-with-dependencies.jar:1.2.0-SNAPSHOT-7062133323f573a3452d3cb4baaee40bdb408e47]
    at org.apache.pinot.controller.helix.core.realtime.SegmentCompletionManager$SegmentCompletionFSM.commitSegment(SegmentCompletionManager.java:1049) ~[pinot-all-1.2.0-SNAPSHOT-jar-with-dependencies.jar:1.2.0-SNAPSHOT-7062133323f573a3452d3cb4baaee40bdb408e47]
    at org.apache.pinot.controller.helix.core.realtime.SegmentCompletionManager$SegmentCompletionFSM.segmentCommitEnd(SegmentCompletionManager.java:627) ~[pinot-all-1.2.0-SNAPSHOT-jar-with-dependencies.jar:1.2.0-SNAPSHOT-7062133323f573a3452d3cb4baaee40bdb408e47]
    at org.apache.pinot.controller.helix.core.realtime.SegmentCompletionManager.segmentCommitEnd(SegmentCompletionManager.java:295) ~[pinot-all-1.2.0-SNAPSHOT-jar-with-dependencies.jar:1.2.0-SNAPSHOT-7062133323f573a3452d3cb4baaee40bdb408e47]
    at org.apache.pinot.controller.api.resources.LLCSegmentCompletionHandlers.segmentCommitEndWithMetadata(LLCSegmentCompletionHandlers.java:396) ~[pinot-all-1.2.0-SNAPSHOT-jar-with-dependencies.jar:1.2.0-SNAPSHOT-7062133323f573a3452d3cb4baaee40bdb408e47]
    at jdk.internal.reflect.GeneratedMethodAccessor178.invoke(Unknown Source) ~[?:?]
    at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:?]
    at java.base/java.lang.reflect.Method.invoke(Method.java:566) ~[?:?]
    at org.glassfish.jersey.server.model.internal.ResourceMethodInvocationHandlerFactory.lambda$static$0(ResourceMethodInvocationHandlerFactory.java:52) ~[pinot-all-1.2.0-SNAPSHOT-jar-with-dependencies.jar:1.2.0-SNAPSHOT-7062133323f573a3452d3cb4baaee40bdb408e47]
    at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher$1.run(AbstractJavaResourceMethodDispatcher.java:146) [pinot-all-1.2.0-SNAPSHOT-jar-with-dependencies.jar:1.2.0-SNAPSHOT-7062133323f573a3452d3cb4baaee40bdb408e47]
    at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher.invoke(AbstractJavaResourceMethodDispatcher.java:189) [pinot-all-1.2.0-SNAPSHOT-jar-with-dependencies.jar:1.2.0-SNAPSHOT-7062133323f573a3452d3cb4baaee40bdb408e47]
    at org.glassfish.jersey.server.model.internal.JavaResourceMethodDispatcherProvider$TypeOutInvoker.doDispatch(JavaResourceMethodDispatcherProvider.java:219) [pinot-all-1.2.0-SNAPSHOT-jar-with-dependencies.jar:1.2.0-SNAPSHOT-7062133323f573a3452d3cb4baaee40bdb408e47]
    at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher.dispatch(AbstractJavaResourceMethodDispatcher.java:93) [pinot-all-1.2.0-SNAPSHOT-jar-with-dependencies.jar:1.2.0-SNAPSHOT-7062133323f573a3452d3cb4baaee40bdb408e47]
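One detail the exception message makes visible is that the move crosses filesystems: the source is an `s3://` URI, while the destination is a `file:` URI under `/var/pinot/controller/data`, the same path given for `controller.local.temp.dir` in the controller config above. Whether that is the cause of the failure is not established in this thread. The sketch below only extracts the two endpoints from the message so they can be compared; the regular expression and the function name `move_endpoints` are assumptions for illustration, not Pinot code.

```python
import re
from urllib.parse import urlsplit

# Pattern modeled on the exception text in this thread (an assumption, not a
# format guaranteed by Pinot).
MOVE_FAILURE = re.compile(
    r"Failed to move segment file for segment: (?P<segment>\S+) "
    r"from: (?P<src>\S+) to: (?P<dst>\S+)"
)

# Message copied as it appears in the controller log above.
MESSAGE = (
    "java.lang.IllegalStateException: Failed to move segment file for segment: "
    "kpi_events_s55020240630T0312Z from: "
    "s3://ags-thd-prod-us1-config-alerts-ds/pinot-data/pinot-config-alerts/"
    "controller-data/kpi_events_s5/"
    "kpi_events_s55020240630T0312Z.tmp.a164edb5-3531-4fb1-aa87-1fc8d22a03ed "
    "to: file:/var/pinot/controller/data/kpi_events_s5/"
    "kpi_events_s55020240630T0312Z"
)


def move_endpoints(message: str):
    """Hypothetical helper: return the segment and the URI schemes of the move."""
    match = MOVE_FAILURE.search(message)
    if match is None:
        return None
    return {
        "segment": match.group("segment"),
        "source_scheme": urlsplit(match.group("src")).scheme,
        "destination_scheme": urlsplit(match.group("dst")).scheme,
        "destination_path": urlsplit(match.group("dst")).path,
    }


endpoints = move_endpoints(MESSAGE)
print(endpoints)
if endpoints and endpoints["source_scheme"] != endpoints["destination_scheme"]:
    print("move crosses filesystems")
```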