Open opschronicle opened 3 years ago
my-pinot-server-0 server [Times: user=0.08 sys=0.00, real=0.02 secs]
my-pinot-controller-0 controller 2021/01/26 22:17:09.409 WARN [CallbackHandler] [ZkClient-EventThread-29-my-pinot-zk-cp-zookeeper.logging.svc.cluster.local:2181] Callback handler received event in wrong order. Listener: org.apache.helix.controller.GenericHelixController@362617cf, path: /pinot-quickstart/INSTANCES/Server_my-pinot-server-1.my-pinot-server-headless.logging.svc.cluster.local_8098/MESSAGES, expected types: [INIT] but was CALLBACK
my-pinot-server-1 server 2021/01/26 22:17:09.588 WARN [ZkBaseDataAccessor] [ZkClient-EventThread-18-my-pinot-zk-cp-zookeeper.logging.svc.cluster.local:2181] Fail to read record for paths: {/pinot-quickstart/INSTANCES/Server_my-pinot-server-1.my-pinot-server-headless.logging.svc.cluster.local_8098/MESSAGES/7fdbc2b8-d50c-43e0-8aae-8f81149ff9f6=-101, /pinot-quickstart/INSTANCES/Server_my-pinot-server-1.my-pinot-server-headless.logging.svc.cluster.local_8098/MESSAGES/ebfe4780-7d65-42a0-9293-8de8be680b52=-101, /pinot-quickstart/INSTANCES/Server_my-pinot-server-1.my-pinot-server-headless.logging.svc.cluster.local_8098/MESSAGES/5cae61b1-c190-4ba4-8d75-2f55fbef3204=-101, /pinot-quickstart/INSTANCES/Server_my-pinot-server-1.my-pinot-server-headless.logging.svc.cluster.local_8098/MESSAGES/b3a4690d-8f0b-49f7-a950-6fd56746760c=-101, /pinot-quickstart/INSTANCES/Server_my-pinot-server-1.my-pinot-server-headless.logging.svc.cluster.local_8098/MESSAGES/0febb043-2ddb-44b6-8637-ac9ec8e991c4=-101, /pinot-quickstart/INSTANCES/Server_my-pinot-server-1.my-pinot-server-headless.logging.svc.cluster.local_8098/MESSAGES/08aa0bbe-ddbc-4a40-8e4b-a94faee38d4c=-101, /pinot-quickstart/INSTANCES/Server_my-pinot-server-1.my-pinot-server-headless.logging.svc.cluster.local_8098/MESSAGES/4c94c196-7ed7-46cc-9b48-a7d3928df6c5=-101, /pinot-quickstart/INSTANCES/Server_my-pinot-server-1.my-pinot-server-headless.logging.svc.cluster.local_8098/MESSAGES/f8d49a81-2834-4a12-9908-e8f52f6a8b9b=-101, /pinot-quickstart/INSTANCES/Server_my-pinot-server-1.my-pinot-server-headless.logging.svc.cluster.local_8098/MESSAGES/2eb95020-540c-46e7-b18b-56232d5079ce=-101, /pinot-quickstart/INSTANCES/Server_my-pinot-server-1.my-pinot-server-headless.logging.svc.cluster.local_8098/MESSAGES/ddc77c89-593b-4278-954f-6d0aad4bba4a=-101, /pinot-quickstart/INSTANCES/Server_my-pinot-server-1.my-pinot-server-headless.logging.svc.cluster.local_8098/MESSAGES/09a788d4-f94b-4172-a7b5-ed6df22523b7=-101, 
/pinot-quickstart/INSTANCES/Server_my-pinot-server-1.my-pinot-server-headless.logging.svc.cluster.local_8098/MESSAGES/b12423f6-22ba-46cb-a23a-84f20dbb7bb8=-101, /pinot-quickstart/INSTANCES/Server_my-pinot-server-1.my-pinot-server-headless.logging.svc.cluster.local_8098/MESSAGES/b8cb57cc-db6c-414a-a97e-b4fd2147ce14=-101, /pinot-quickstart/INSTANCES/Server_my-pinot-server-1.my-pinot-server-headless.logging.svc.cluster.local_8098/MESSAGES/c70e5415-1520-485d-9eb7-2d43940422a3=-101, /pinot-quickstart/INSTANCES/Server_my-pinot-server-1.my-pinot-server-headless.logging.svc.cluster.local_8098/MESSAGES/352ead98-9e5d-4572-b4a9-224c7dff6a3d=-101, /pinot-quickstart/INSTANCES/Server_my-pinot-server-1.my-pinot-server-headless.logging.svc.cluster.local_8098/MESSAGES/b0405375-5316-41ad-b134-ed55d2b54407=-101}
my-pinot-controller-0 controller 2021/01/26 22:17:09.600 WARN [CallbackHandler] [ZkClient-EventThread-29-my-pinot-zk-cp-zookeeper.logging.svc.cluster.local:2181] Callback handler received event in wrong order. Listener: org.apache.helix.controller.GenericHelixController@362617cf, path: /pinot-quickstart/INSTANCES/Server_my-pinot-server-1.my-pinot-server-headless.logging.svc.cluster.local_8098/MESSAGES, expected types: [INIT] but was CALLBACK
my-pinot-server-0 server 2021-01-26T22:17:09.358+0000: 113.899: Total time for which application threads were
my-pinot-controller-0 controller 2021/01/26 22:17:12.039 WARN [ZkBaseDataAccessor] [HelixController-pipeline-task-pinot-quickstart-(83ee1db3_TASK)] Fail to read record for paths: {/pinot-quickstart/INSTANCES/Server_my-pinot-server-0.my-pinot-server-headless.logging.svc.cluster.local_8098/MESSAGES/51a431e9-c0e1-4e08-9bcb-aee747608526=-101, /pinot-quickstart/INSTANCES/Server_my-pinot-server-0.my-pinot-server-headless.logging.svc.cluster.local_8098/MESSAGES/f9005198-fe85-4cad-84e8-0df918de95d9=-101, /pinot-quickstart/INSTANCES/Server_my-pinot-server-0.my-pinot-server-headless.logging.svc.cluster.local_8098/MESSAGES/ef3e0a42-eee4-43a4-bb11-bd1878937e2c=-101, /pinot-quickstart/INSTANCES/Server_my-pinot-server-0.my-pinot-server-headless.logging.svc.cluster.local_8098/MESSAGES/2fd3aeea-91d8-446d-b87f-eeb3d99e24c9=-101, /pinot-quickstart/INSTANCES/Server_my-pinot-server-0.my-pinot-server-headless.logging.svc.cluster.local_8098/MESSAGES/03239839-70d8-49ed-84eb-6d9d4bc3b777=-101, /pinot-quickstart/INSTANCES/Server_my-pinot-server-0.my-pinot-server-headless.logging.svc.cluster.local_8098/MESSAGES/d7c99ffe-7a4d-433b-a265-cf8c9fba6a5f=-101, /pinot-quickstart/INSTANCES/Server_my-pinot-server-0.my-pinot-server-headless.logging.svc.cluster.local_8098/MESSAGES/d299feba-1781-407e-a7b3-83287fbd0997=-101, /pinot-quickstart/INSTANCES/Server_my-pinot-server-0.my-pinot-server-headless.logging.svc.cluster.local_8098/MESSAGES/68902832-2e61-4e6a-b249-e0e74711d493=-101, /pinot-quickstart/INSTANCES/Server_my-pinot-server-0.my-pinot-server-headless.logging.svc.cluster.local_8098/MESSAGES/084390ed-cccd-4171-b4c1-45c2f43ddc1a=-101, /pinot-quickstart/INSTANCES/Server_my-pinot-server-0.my-pinot-server-headless.logging.svc.cluster.local_8098/MESSAGES/53c3065c-60e3-4355-ab33-9f32221c5407=-101, /pinot-quickstart/INSTANCES/Server_my-pinot-server-0.my-pinot-server-headless.logging.svc.cluster.local_8098/MESSAGES/46f7ae44-ee3d-45c0-8781-6cedc67ea518=-101, 
/pinot-quickstart/INSTANCES/Server_my-pinot-server-0.my-pinot-server-headless.logging.svc.cluster.local_8098/MESSAGES/ba4e44c4-15d1-46a7-ac99-3f4fd11db97f=-101, /pinot-quickstart/INSTANCES/Server_my-pinot-server-0.my-pinot-server-headless.logging.svc.cluster.local_8098/MESSAGES/dd3e8c08-e914-4245-8857-3d1229573c35=-101, /pinot-quickstart/INSTANCES/Server_my-pinot-server-0.my-pinot-server-headless.logging.svc.cluster.local_8098/MESSAGES/95ad2084-572f-4651-b009-1f3eaab73e6b=-101, /pinot-quickstart/INSTANCES/Server_my-pinot-server-0.my-pinot-server-headless.logging.svc.cluster.local_8098/MESSAGES/cc5bea58-9618-4746-9a6c-a6b8e5afdfaa=-101, /pinot-quickstart/INSTANCES/Server_my-pinot-server-0.my-pinot-server-headless.logging.svc.cluster.local_8098/MESSAGES/3491cf67-e305-4767-9d55-0963f9ffdf00=-101, /pinot-quickstart/INSTANCES/Server_my-pinot-server-0.my-pinot-server-headless.logging.svc.cluster.local_8098/MESSAGES/c7dca37a-e4d6-4fbf-a244-ec5b92724c9e=-101, /pinot-quickstart/INSTANCES/Server_my-pinot-server-0.my-pinot-server-headless.logging.svc.cluster.local_8098/MESSAGES/b224f8f5-ea43-419a-a3af-11dace5e3f16=-101, /pinot-quickstart/INSTANCES/Server_my-pinot-server-0.my-pinot-server-headless.logging.svc.cluster.local_8098/MESSAGES/ae7c5ede-f7ec-45bb-a095-97210c6a7932=-101, /pinot-quickstart/INSTANCES/Server_my-pinot-server-0.my-pinot-server-headless.logging.svc.cluster.local_8098/MESSAGES/15a7fbcc-48b5-4300-9338-a0b009151e9e=-101, /pinot-quickstart/INSTANCES/Server_my-pinot-server-0.my-pinot-server-headless.logging.svc.cluster.local_8098/MESSAGES/f46c6152-858f-44f6-8658-f7c7540915f5=-101, /pinot-quickstart/INSTANCES/Server_my-pinot-server-0.my-pinot-server-headless.logging.svc.cluster.local_8098/MESSAGES/7a6a008a-33a2-4b08-9aac-db2fd0c00396=-101, /pinot-quickstart/INSTANCES/Server_my-pinot-server-0.my-pinot-server-headless.logging.svc.cluster.local_8098/MESSAGES/7819229e-21af-4760-9ce3-c38e446c1e14=-101, 
/pinot-quickstart/INSTANCES/Server_my-pinot-server-0.my-pinot-server-headless.logging.svc.cluster.local_8098/MESSAGES/5dcd8621-f658-4c49-be47-41aea30e9cb4=-101, /pinot-quickstart/INSTANCES/Server_my-pinot-server-0.my-pinot-server-headless.logging.svc.cluster.local_8098/MESSAGES/92519c36-bcfb-4501-8dbb-0702ebdfd708=-101, /pinot-quickstart/INSTANCES/Server_my-pinot-server-0.my-pinot-server-headless.logging.svc.cluster.local_8098/MESSAGES/5fa1b752-39e8-4563-847e-4eaef3c3d5b0=-101, /pinot-quickstart/INSTANCES/Server_my-pinot-server-0.my-pinot-server-headless.logging.svc.cluster.local_8098/MESSAGES/74263ff4-734d-4529-966c-f0cb0927593f=-101, /pinot-quickstart/INSTANCES/Server_my-pinot-server-0.my-pinot-server-headless.logging.svc.cluster.local_8098/MESSAGES/98bb3b99-80d8-40bb-bdb6-d34f0b5e1090=-101, /pinot-quickstart/INSTANCES/Server_my-pinot-server-0.my-pinot-server-headless.logging.svc.cluster.local_8098/MESSAGES/39114365-1976-42ab-882d-e729eb70c143=-101, /pinot-quickstart/INSTANCES/Server_my-pinot-server-0.my-pinot-server-headless.logging.svc.cluster.local_8098/MESSAGES/86c29e7a-c24d-46f1-9efb-865b02b8999a=-101, /pinot-quickstart/INSTANCES/Server_my-pinot-server-0.my-pinot-server-headless.logging.svc.cluster.local_8098/MESSAGES/bc02072d-68db-4264-acb6-e6ded758a74a=-101, /pinot-quickstart/INSTANCES/Server_my-pinot-server-0.my-pinot-server-headless.logging.svc.cluster.local_8098/MESSAGES/aec1490b-4a0f-4d66-aad7-f2801b172b65=-101, /pinot-quickstart/INSTANCES/Server_my-pinot-server-0.my-pinot-server-headless.logging.svc.cluster.local_8098/MESSAGES/d894e3b1-e597-484f-a748-96a470c93f93=-101, /pinot-quickstart/INSTANCES/Server_my-pinot-server-0.my-pinot-server-headless.logging.svc.cluster.local_8098/MESSAGES/1b1ec9f0-c17f-44f7-b6a1-26c0c141892f=-101, /pinot-quickstart/INSTANCES/Server_my-pinot-server-0.my-pinot-server-headless.logging.svc.cluster.local_8098/MESSAGES/8f0d8d0a-ec3a-4b11-bbf8-f44e0b611d3f=-101, 
/pinot-quickstart/INSTANCES/Server_my-pinot-server-0.my-pinot-server-headless.logging.svc.cluster.local_8098/MESSAGES/230e18fe-135b-4892-ac34-ae7909874224=-101, /pinot-quickstart/INSTANCES/Server_my-pinot-server-0.my-pinot-server-headless.logging.svc.cluster.local_8098/MESSAGES/3d64a6bb-5e6c-4f72-87eb-ac295cb7c0e3=-101, /pinot-quickstart/INSTANCES/Server_my-pinot-server-0.my-pinot-server-headless.logging.svc.cluster.local_8098/MESSAGES/4e0b1fb2-f514-4350-92ac-2162c408f947=-101, /pinot-quickstart/INSTANCES/Server_my-pinot-server-0.my-pinot-server-headless.logging.svc.cluster.local_8098/MESSAGES/b110753b-49e3-42d8-8a92-f635247a5d32=-101, /pinot-quickstart/INSTANCES/Server_my-pinot-server-0.my-pinot-server-headless.logging.svc.cluster.local_8098/MESSAGES/3891485f-621c-4a62-a702-4de7b9cda0ca=-101, /pinot-quickstart/INSTANCES/Server_my-pinot-server-0.my-pinot-server-headless.logging.svc.cluster.local_8098/MESSAGES/b221af1a-3704-4290-a980-4e6fac076d08=-101, /pinot-quickstart/INSTANCES/Server_my-pinot-server-0.my-pinot-server-headless.logging.svc.cluster.local_8098/MESSAGES/5b970ff8-8f17-4518-9ffc-7ff72a0bc52b=-101, /pinot-quickstart/INSTANCES/Server_my-pinot-server-0.my-pinot-server-headless.logging.svc.cluster.local_8098/MESSAGES/7401939f-83a1-433b-9a4b-801e8b110b49=-101, /pinot-quickstart/INSTANCES/Server_my-pinot-server-0.my-pinot-server-headless.logging.svc.cluster.local_8098/MESSAGES/16a027cf-5c09-4b5f-9030-8bc8f0afcded=-101, /pinot-quickstart/INSTANCES/Server_my-pinot-server-0.my-pinot-server-headless.logging.svc.cluster.local_8098/MESSAGES/1ef817b6-21cc-48ae-bd16-ed2ad2e80b6d=-101, /pinot-quickstart/INSTANCES/Server_my-pinot-server-0.my-pinot-server-headless.logging.svc.cluster.local_8098/MESSAGES/8e9f37bf-25e0-4086-8401-df0f73850a6b=-101, /pinot-quickstart/INSTANCES/Server_my-pinot-server-0.my-pinot-server-headless.logging.svc.cluster.local_8098/MESSAGES/88f453a1-eca9-4b2c-9e4f-87e88da6e4ed=-101, 
/pinot-quickstart/INSTANCES/Server_my-pinot-server-0.my-pinot-server-headless.logging.svc.cluster.local_8098/MESSAGES/d0d42bf9-3b12-4686-8c7d-cd7bde6284cc=-101, /pinot-quickstart/INSTANCES/Server_my-pinot-server-0.my-pinot-server-headless.logging.svc.cluster.local_8098/MESSAGES/4f2526dc-cdf6-4035-b1b6-bef9f8871583=-101, /pinot-quickstart/INSTANCES/Server_my-pinot-server-0.my-pinot-server-headless.logging.svc.cluster.local_8098/MESSAGES/2b020fac-6c46-4eb2-9802-1cb7bd7c586b=-101, /pinot-quickstart/INSTANCES/Server_my-pinot-server-0.my-pinot-server-headless.logging.svc.cluster.local_8098/MESSAGES/11b63a38-99d9-442c-9eca-56f7d427c9a1=-101, /pinot-quickstart/INSTANCES/Server_my-pinot-server-0.my-pinot-server-headless.logging.svc.cluster.local_8098/MESSAGES/372cec23-e89b-421d-b611-b9bd35ed6bb3=-101, /pinot-quickstart/INSTANCES/Server_my-pinot-server-0.my-pinot-server-headless.logging.svc.cluster.local_8098/MESSAGES/5a0aa3ae-d134-47e1-8fae-53aa0d70ee99=-101, /pinot-quickstart/INSTANCES/Server_my-pinot-server-0.my-pinot-server-headless.logging.svc.cluster.local_8098/MESSAGES/aec9d3ad-f1e9-4429-b031-1207534e9dff=-101, /pinot-quickstart/INSTANCES/Server_my-pinot-server-0.my-pinot-server-headless.logging.svc.cluster.local_8098/MESSAGES/fd9592c2-22e9-4536-a8ef-4d4e1eab6d5f=-101, /pinot-quickstart/INSTANCES/Server_my-pinot-server-0.my-pinot-server-headless.logging.svc.cluster.local_8098/MESSAGES/0c36d8f7-7905-48a7-befe-7f48693f4c17=-101, /pinot-quickstart/INSTANCES/Server_my-pinot-server-0.my-pinot-server-headless.logging.svc.cluster.local_8098/MESSAGES/0e31dca5-4234-4842-a5ac-b44ccd01d1cb=-101, /pinot-quickstart/INSTANCES/Server_my-pinot-server-0.my-pinot-server-headless.logging.svc.cluster.local_8098/MESSAGES/d998d2e5-1796-42be-8848-9154d0d0be2f=-101, /pinot-quickstart/INSTANCES/Server_my-pinot-server-0.my-pinot-server-headless.logging.svc.cluster.local_8098/MESSAGES/3e201c8c-f8cb-431c-9e10-ab7da5a03eef=-101, 
/pinot-quickstart/INSTANCES/Server_my-pinot-server-0.my-pinot-server-headless.logging.svc.cluster.local_8098/MESSAGES/7bb908de-8c97-4cc7-8499-440d2e7ce19c=-101, /pinot-quickstart/INSTANCES/Server_my-pinot-server-0.my-pinot-server-headless.logging.svc.cluster.local_8098/MESSAGES/06af6769-999b-4aa9-b1ec-4d3f063eb998=-101, /pinot-quickstart/INSTANCES/Server_my-pinot-server-0.my-pinot-server-headless.logging.svc.cluster.local_8098/MESSAGES/84eb9eb3-3ff4-4c47-9bb4-daa685ff7290=-101, /pinot-quickstart/INSTANCES/Server_my-pinot-server-0.my-pinot-server-headless.logging.svc.cluster.local_8098/MESSAGES/5f3a394a-0f9e-410b-ab9f-fd6989052588=-101, /pinot-quickstart/INSTANCES/Server_my-pinot-server-0.my-pinot-server-headless.logging.svc.cluster.local_8098/MESSAGES/42c83d12-35b8-4999-ace6-c51a9c15f5da=-101, /pinot-quickstart/INSTANCES/Server_my-pinot-server-0.my-pinot-server-headless.logging.svc.cluster.local_8098/MESSAGES/53e12a60-c891-4182-a8c6-a04cac8f7e9c=-101, /pinot-quickstart/INSTANCES/Server_my-pinot-server-0.my-pinot-server-headless.logging.svc.cluster.local_8098/MESSAGES/39ef8da1-1fd3-4de7-a150-1962ccda4194=-101, /pinot-quickstart/INSTANCES/Server_my-pinot-server-0.my-pinot-server-headless.logging.svc.cluster.local_8098/MESSAGES/39cb56e1-2dca-4c72-9bb6-ad98e83bd6fe=-101, /pinot-quickstart/INSTANCES/Server_my-pinot-server-0.my-pinot-server-headless.logging.svc.cluster.local_8098/MESSAGES/88580d08-3d66-4c48-8678-8d2879687ddc=-101, /pinot-quickstart/INSTANCES/Server_my-pinot-server-0.my-pinot-server-headless.logging.svc.cluster.local_8098/MESSAGES/5b1e6f1e-51d5-438d-a207-3af761fe960c=-101, /pinot-quickstart/INSTANCES/Server_my-pinot-server-0.my-pinot-server-headless.logging.svc.cluster.local_8098/MESSAGES/ad69603f-3ae7-43b0-9449-2f732c8acdfb=-101, /pinot-quickstart/INSTANCES/Server_my-pinot-server-0.my-pinot-server-headless.logging.svc.cluster.local_8098/MESSAGES/5514c26b-2d5e-47a3-839f-3c5b98303bc9=-101, 
/pinot-quickstart/INSTANCES/Server_my-pinot-server-0.my-pinot-server-headless.logging.svc.cluster.local_8098/MESSAGES/25c2728e-73eb-4ebe-b2d4-e61708348652=-101, /pinot-quickstart/INSTANCES/Server_my-pinot-server-0.my-pinot-server-headless.logging.svc.cluster.local_8098/MESSAGES/0b22e1da-632b-430c-be22-cc0c46ad0a75=-101, /pinot-quickstart/INSTANCES/Server_my-pinot-server-0.my-pinot-server-headless.logging.svc.cluster.local_8098/MESSAGES/dfd7e352-76e0-4a24-9294-84b72106d709=-101, /pinot-quickstart/INSTANCES/Server_my-pinot-server-0.my-pinot-server-headless.logging.svc.cluster.local_8098/MESSAGES/bea0dc31-8f14-4c54-8097-64f11b0a0f22=-101, /pinot-quickstart/INSTANCES/Server_my-pinot-server-0.my-pinot-server-headless.logging.svc.cluster.local_8098/MESSAGES/56805ab0-d620-401f-a6de-d2890709151d=-101}
my-pinot-server-0 server 2021-01-26T22:17:11.909+0000: 116.450: Total time for which application threads were stopped: 0.0134050 seconds, Stopping threads took: 0.0001936 seconds
what changed?
@kishoreg Nothing changed. It was running for 3 days without a hitch and suddenly kept crashing. I changed the memory from -Xms512M -Xmx4G to -Xms12G -Xmx12G and the servers are up. Not sure what happened here. Also, I cannot get hold of a core dump or thread dump as it is a container. I may have to map /opt/pinot to a PVC.
Anyway, all the segments are in a corrupted state. It seems I have to restore the server from backup again.
I reloaded the segments and the server is back online. Is there any way to speed up the catch-up process from the Kafka stream?
Add more servers and, once they catch up, shrink the cluster and rebalance.
Thanks @kishoreg. The server did not recover. It went into a bad state again and the Pinot server crashed. I had to do it the hard way: delete and recreate. What can cause a server crash in Pinot, and what steps can we normally take to avoid it?
Any logs that you can share?
@kishoreg, it crashed again. The Pinot server stopped consuming messages and queries started to return incorrect results. The errors I can see are below.
my-pinot-broker-0 broker 2021/01/28 14:15:07.151 ERROR [QueryRouter] [jersey-server-managed-async-executor-3] Caught exception while sending request 549 to server: my-pinot-server-0_R, marking query failed
my-pinot-broker-0 broker 2021/01/28 14:15:10.917 ERROR [QueryRouter] [jersey-server-managed-async-executor-3] Caught exception while sending request 550 to server: my-pinot-server-0_R, marking query failed
my-pinot-broker-0 broker 2021-01-28T14:15:10.918+0000: 64927.313: Total time for which application threads were stopped: 0.0000965 seconds, Stopping threads too2021/01/28 14:15:11.008 ERROR [QueryRouter] [jersey-server-managed-async-executor-3] Caught exception while sending request 551 to server: my-pinot-server-0_R, marking query failed
my-pinot-broker-0 broker 2021/01/28 14:15:12.104 ERROR [DataTableHandler] [nioEventLoopGroup-2-6] Caught exception while handling response from server: my-pinot-server-1_R
my-pinot-broker-0 broker 2021/01/28 14:15:12.106 ERROR [DataTableHandler] [nioEventLoopGroup-2-6] Channel for server: my-pinot-server-1_R is now inactive, marking server down
my-pinot-broker-0 broker 2021/01/28 14:20:30.912 ERROR [DataTableHandler] [nioEventLoopGroup-2-4] Channel for server: my-pinot-server-0_R is now inactive, marking server down
Stopping threads took: 0.0000264 seconds
2021-01-28T14:29:13.291+0000: 2021/01/28 14:29:13.301 WARN [SegmentColumnarIndexCreator] [Thread-7] Caught exception java.io.IOException: No space left on device while refreshing realtime lucene reader for segment: sblog__0__515__20210128T1354Z
sb-pinot-controller-0sb-pinot-server-0 server 21.588: Total time for which application threads were stopped: 0.0003119 seconds, Stopping threads took: 0.0001239 seconds
@kishoreg what could be the possible reasons the disk fills up when I try to catch up on messages from the stream? It only happens when I recreate the table and try to catch up on messages.
Storage will be filled up naturally as the servers consume more records, and flush them to the disk. Can you please check the disk space left in your data directory?
@Jackie-Jiang Thanks, it was at 100%. I deleted some data from the stream. But I am not sure why a full stream catch-up takes more disk space than normal realtime consumption. From my understanding, the data size for three days of retention on Kafka should remain the same whether we consume it in Pinot in one shot or over three days.
While consuming, Pinot allocates some mmap'ed files as the off-heap memory buffer. If the consumption rate is very high, the file size could be larger because we will have more records in one segment. I would recommend giving the storage more room, e.g. keeping the usage under 50%.
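The disk check and the 50% guideline above can be scripted. This is a minimal sketch using only the Python standard library; the function names are hypothetical, and the path you pass in is an assumption (the thread does not state where the server data directory is mounted).

```python
import shutil


def disk_usage_fraction(path: str) -> float:
    """Return the used fraction (0.0 to 1.0) of the filesystem that holds `path`."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    return usage.used / usage.total


def over_threshold(path: str, threshold: float = 0.5) -> bool:
    """True when usage exceeds `threshold` (default 0.5, the 50% guideline above)."""
    return disk_usage_fraction(path) > threshold


if __name__ == "__main__":
    # "." is a placeholder; point this at the volume that holds the Pinot
    # server data directory in your deployment (assumption, not from the thread).
    data_dir = "."
    print(f"{data_dir}: {disk_usage_fraction(data_dir):.0%} used, "
          f"over 50%: {over_threshold(data_dir)}")
```

Running it inside the server container (or against the mounted volume) gives a quick signal before consumption hits "No space left on device".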
Thanks for the suggestion @Jackie-Jiang, I will allocate more space. I am also wondering whether it is possible to set the segment time based on the time field from Kafka, so that segments expire based on Kafka time rather than Pinot segment creation time. My retention is 3 days. The issue happens when I have to catch up on the entire 3 days of messages at once: all the caught-up segments will have the same date, will only expire after 3 days, and consume a lot of disk space unnecessarily.
@pabrahamusa Pinot uses the value of the time column (which should be the time field in Kafka as well) to expire the segment, not the segment creation time. The segment will still be created, but it should be removed once its latest time value is older than the retention.
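To make the rule above concrete, here is a simplified illustration of time-column-based expiry. It is not Pinot's actual retention code; `is_segment_expired` and the example timestamps are hypothetical and only mirror the behavior described in this comment.

```python
from datetime import datetime, timedelta, timezone


def is_segment_expired(segment_end_time: datetime,
                       now: datetime,
                       retention: timedelta) -> bool:
    """A segment is expired once its latest time-column value is older than the retention.

    `segment_end_time` is the maximum time-column value in the segment,
    not the time at which the segment was created.
    """
    return segment_end_time < now - retention


# Hypothetical example: 3-day retention, evaluated on 2021-01-29 00:00 UTC.
now = datetime(2021, 1, 29, tzinfo=timezone.utc)
retention = timedelta(days=3)

# Caught-up segment holding data from Jan 25, even if it was built on Jan 28.
old_data = datetime(2021, 1, 25, 23, 59, tzinfo=timezone.utc)
# Segment holding data from Jan 27.
recent_data = datetime(2021, 1, 27, 12, 0, tzinfo=timezone.utc)

print(is_segment_expired(old_data, now, retention))     # True
print(is_segment_expired(recent_data, now, retention))  # False
```

Under this rule, segments built during a bulk catch-up do not all share one expiry date; each one ages out according to the event times it contains.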
@Jackie-Jiang Thanks, I will verify this and confirm.
The problem for me was not having enough space on my EBS volume.
I am facing an issue where both Pinot servers fail to restart and keep crashing with the following error. I am using the latest image. Has anyone come across this?