StarRocks / starrocks

StarRocks, a Linux Foundation project, is a next-generation sub-second MPP OLAP database for full analytics scenarios, including multi-dimensional analytics, real-time analytics, and ad-hoc queries.
https://starrocks.io
Apache License 2.0

docker restarts but starrocks unreachable #31840

Closed alanpaulkwan closed 1 year ago

alanpaulkwan commented 1 year ago

Using the all-in-one docker, I ran StarRocks happily for 5 weeks. When I restarted my docker, StarRocks was not reachable. I'm not sure it always happens when I restart, but this time it did.

While I was diagnosing the problem with Allen Li over Slack, curl 127.0.0.1:8030/api/health returned "connection refused".
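For anyone who wants to script the same check, here is a minimal sketch of the probe in Python. It only reproduces what the curl command does; the function name `fe_is_healthy` and the timeout value are illustrative assumptions, not part of StarRocks.

```python
import urllib.error
import urllib.request


def fe_is_healthy(url: str = "http://127.0.0.1:8030/api/health",
                  timeout: float = 3.0) -> bool:
    """Return True if the endpoint answers with HTTP 200, False otherwise."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Covers "connection refused", timeouts, and HTTP error statuses.
        return False


if __name__ == "__main__":
    print("FE reachable" if fe_is_healthy() else "FE not reachable")
```

A False result here corresponds to the "connection refused" seen with curl, although it does not distinguish that case from a timeout or an HTTP error.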

What seems relevant, after looking through various log files, is that fe.out says the process is already running. So I killed the process, ran the deploy script, and restarted the container. This seems to fix it.
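The thread does not establish why fe.out reported a running process. One possibility, stated here purely as an assumption, is a PID file left over from before the restart. The sketch below shows how such a file could be checked on a POSIX system. The function name `pid_file_status` and the path in the usage line are hypothetical and not taken from the StarRocks scripts.

```python
import os
from pathlib import Path


def pid_file_status(pid_file: str) -> str:
    """Classify a PID file as 'missing', 'stale', or 'alive' (POSIX only)."""
    path = Path(pid_file)
    if not path.is_file():
        return "missing"
    try:
        pid = int(path.read_text().strip())
    except ValueError:
        return "stale"  # file exists but does not hold a number
    if pid <= 0:
        return "stale"  # 0 and negatives address process groups, not one PID
    try:
        os.kill(pid, 0)  # signal 0 only checks existence, nothing is sent
    except ProcessLookupError:
        return "stale"  # no process with that PID
    except PermissionError:
        return "alive"  # process exists but belongs to another user
    return "alive"


if __name__ == "__main__":
    # Hypothetical location; adjust to wherever the PID file actually lives.
    print(pid_file_status("fe/bin/fe.pid"))
```

Note that "alive" only means some process currently holds that PID. After a container restart the number may have been reused by an unrelated process, so it is worth confirming what the process is before killing it.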

kevincai commented 1 year ago

The allin1 Docker image doesn't check /api/health for liveness. Are you doing some customization?

Also, what does fe.log say?

alanpaulkwan commented 1 year ago

I did this all manually per Allen's suggestions. The log looked like this:

2023-09-26 08:56:57,198 INFO (ReportHandler|150) [ReportHandler.tabletReport():365] backend[10004] reports 5700 tablet(s). report version: 16925719161324
2023-09-26 08:56:57,214 INFO (ReportHandler|150) [TabletInvertedIndex.tabletReport():300] finished to do tablet diff with backend[10004]. sync: 0. metaDel: 0. foundValid: 5700. foundInvalid: 0. migration: 0. found invalid transactions 0. found republish transactions 0 cost: 6 ms
2023-09-26 08:56:58,307 WARN (star rocks repository|42) [TableMetaSyncer.syncTable():57] call fe TNetworkAddress(hostname:127.0.0.1, port:9021) refreshTable rpc method failed
org.apache.thrift.transport.TTransportException: java.net.ConnectException: Connection refused (Connection refused)
    at org.apache.thrift.transport.TSocket.open(TSocket.java:226) ~[libthrift-0.13.0.jar:0.13.0]
    at com.starrocks.common.GenericPool$ThriftClientFactory.create(GenericPool.java:144) ~[starrocks-fe.jar:?]
    at com.starrocks.common.GenericPool$ThriftClientFactory.create(GenericPool.java:129) ~[starrocks-fe.jar:?]
    at org.apache.commons.pool2.BaseKeyedPooledObjectFactory.makeObject(BaseKeyedPooledObjectFactory.java:62) ~[commons-pool2-2.3.jar:2.3]
    at org.apache.commons.pool2.impl.GenericKeyedObjectPool.create(GenericKeyedObjectPool.java:1036) ~[commons-pool2-2.3.jar:2.3]
    at org.apache.commons.pool2.impl.GenericKeyedObjectPool.borrowObject(GenericKeyedObjectPool.java:356) ~[commons-pool2-2.3.jar:2.3]
    at org.apache.commons.pool2.impl.GenericKeyedObjectPool.borrowObject(GenericKeyedObjectPool.java:278) ~[commons-pool2-2.3.jar:2.3]
    at com.starrocks.common.GenericPool.borrowObject(GenericPool.java:105) ~[starrocks-fe.jar:?]
    at com.starrocks.rpc.FrontendServiceProxy.call(FrontendServiceProxy.java:33) ~[starrocks-fe.jar:?]
    at com.starrocks.external.starrocks.TableMetaSyncer.syncTable(TableMetaSyncer.java:41) ~[starrocks-fe.jar:?]
    at com.starrocks.external.starrocks.StarRocksRepository.runAfterCatalogReady(StarRocksRepository.java:87) ~[starrocks-fe.jar:?]
    at com.starrocks.common.util.FrontendDaemon.runOneCycle(FrontendDaemon.java:72) ~[starrocks-fe.jar:?]
    at com.starrocks.external.starrocks.StarRocksRepository.runOneCycle(StarRocksRepository.java:65) ~[starrocks-fe.jar:?]
    at com.starrocks.common.util.Daemon.run(Daemon.java:115) ~[starrocks-fe.jar:?]
Caused by: java.net.ConnectException: Connection refused (Connection refused)
    at java.net.PlainSocketImpl.socketConnect(Native Method) ~[?:?]
    at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:412) ~[?:?]
    at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:255) ~[?:?]
    at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:237) ~[?:?]
    at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392) ~[?:?]
    at java.net.Socket.connect(Socket.java:609) ~[?:?]
    at org.apache.thrift.transport.TSocket.open(TSocket.java:221) ~[libthrift-0.13.0.jar:0.13.0]
    ... 13 more
2023-09-26 08:56:58,499 INFO (colocate group clone checker|99) [ColocateTableBalancer.matchGroups():891] finished to match colocate group. cost: 0 ms, in lock time: 0 ms
2023-09-26 08:57:08,308 WARN (star rocks repository|42) [TableMetaSyncer.syncTable():57] call fe TNetworkAddress(hostname:127.0.0.1, port:9021) refreshTable rpc method failed
org.apache.thrift.transport.TTransportException: java.net.ConnectException: Connection refused (Connection refused)
    at org.apache.thrift.transport.TSocket.open(TSocket.java:226) ~[libthrift-0.13.0.jar:0.13.0]
    at com.starrocks.common.GenericPool$ThriftClientFactory.create(GenericPool.java:144) ~[starrocks-fe.jar:?]
    at com.starrocks.common.GenericPool$ThriftClientFactory.create(GenericPool.java:129) ~[starrocks-fe.jar:?]
    at org.apache.commons.pool2.BaseKeyedPooledObjectFactory.makeObject(BaseKeyedPooledObjectFactory.java:62) ~[commons-pool2-2.3.^C
root@myservice:/data/deploy/starrocks# cat fe/log/fe.log | tail -n 20
    at org.apache.commons.pool2.BaseKeyedPooledObjectFactory.makeObject(BaseKeyedPooledObjectFactory.java:62) ~[commons-pool2-2.3.jar:2.3]
    at org.apache.commons.pool2.impl.GenericKeyedObjectPool.create(GenericKeyedObjectPool.java:1036) ~[commons-pool2-2.3.jar:2.3]
    at org.apache.commons.pool2.impl.GenericKeyedObjectPool.borrowObject(GenericKeyedObjectPool.java:356) ~[commons-pool2-2.3.jar:2.3]
    at org.apache.commons.pool2.impl.GenericKeyedObjectPool.borrowObject(GenericKeyedObjectPool.java:278) ~[commons-pool2-2.3.jar:2.3]
    at com.starrocks.common.GenericPool.borrowObject(GenericPool.java:105) ~[starrocks-fe.jar:?]
    at com.starrocks.rpc.FrontendServiceProxy.call(FrontendServiceProxy.java:33) ~[starrocks-fe.jar:?]
    at com.starrocks.external.starrocks.TableMetaSyncer.syncTable(TableMetaSyncer.java:41) ~[starrocks-fe.jar:?]
    at com.starrocks.external.starrocks.StarRocksRepository.runAfterCatalogReady(StarRocksRepository.java:87) ~[starrocks-fe.jar:?]
    at com.starrocks.common.util.FrontendDaemon.runOneCycle(FrontendDaemon.java:72) ~[starrocks-fe.jar:?]
    at com.starrocks.external.starrocks.StarRocksRepository.runOneCycle(StarRocksRepository.java:65) ~[starrocks-fe.jar:?]
    at com.starrocks.common.util.Daemon.run(Daemon.java:115) ~[starrocks-fe.jar:?]
Caused by: java.net.ConnectException: Connection refused (Connection refused)
    at java.net.PlainSocketImpl.socketConnect(Native Method) ~[?:?]
    at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:412) ~[?:?]
    at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:255) ~[?:?]
    at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:237) ~[?:?]
    at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392) ~[?:?]
    at java.net.Socket.connect(Socket.java:609) ~[?:?]
    at org.apache.thrift.transport.TSocket.open(TSocket.java:221) ~[libthrift-0.13.0.jar:0.13.0]
    ... 13 more
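The stack traces make it hard to see which endpoint is actually failing. As a rough aid, the sketch below pulls the host and port out of the "rpc method failed" WARN lines and counts them. The regular expression is derived only from the lines quoted in this thread, not from any documented log format, and the name `failed_rpc_targets` is made up for illustration.

```python
import re
from collections import Counter

# Matches WARN lines such as:
#   ... WARN (...) [...] call fe TNetworkAddress(hostname:H, port:P) refreshTable rpc method failed
FAILED_RPC = re.compile(
    r"WARN .*?TNetworkAddress\(hostname:(?P<host>[^,]+), port:(?P<port>\d+)\)"
    r".*?rpc method failed"
)


def failed_rpc_targets(log_text: str) -> Counter:
    """Count failed RPC targets in the log text, keyed by (host, port)."""
    counts: Counter = Counter()
    for match in FAILED_RPC.finditer(log_text):
        counts[(match.group("host"), int(match.group("port")))] += 1
    return counts


if __name__ == "__main__":
    # Path relative to the deploy directory, as in the tail command above.
    with open("fe/log/fe.log", encoding="utf-8", errors="replace") as fh:
        for (host, port), n in failed_rpc_targets(fh.read()).most_common():
            print(f"{host}:{port} failed {n} time(s)")
```

On the excerpt above this would report two failures, both against 127.0.0.1:9021, which is a different port from the 8030 health endpoint.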

kevincai commented 1 year ago

Saw the context in the Slack channel.